Confusion Matrix


·       A much better way to evaluate the performance of a classifier is to look at the confusion matrix.

 



·       The general idea is to count the number of times instances of class A are classified as class B.

·       For example, to know the number of times the classifier confused images of 5s with 3s, you would look in the fifth row and third column of the confusion matrix.

·       To compute the confusion matrix, you first need to have a set of predictions so that they can be compared to the actual targets.

·       You could make predictions on the test set, but remember that you want to use the test set only at the very end of your project, once you have a classifier that you are ready to launch.

·       Instead, you can use the cross_val_predict() function:

from sklearn.model_selection import cross_val_predict

# Out-of-fold predictions for every instance in the training set
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)

 

·       Just like the cross_val_score() function, cross_val_predict() performs K-fold cross-validation, but instead of returning the evaluation scores, it returns the predictions made on each test fold.

·       This means that you get a clean prediction for each instance in the training set, "clean" meaning that the prediction is made by a model that never saw that data during training.

 

·       Now you are ready to get the confusion matrix using the confusion_matrix() function. Just pass it the target classes (y_train_5) and the predicted classes (y_train_pred):

from sklearn.metrics import confusion_matrix

confusion_matrix(y_train_5, y_train_pred)

array([[53057,  1522],
       [ 1325,  4096]])

 

 

·       Each row in a confusion matrix represents an actual class, while each column represents a predicted class.

·       The first row of this matrix considers non-5 images (the negative class): 53,057 of them were correctly classified as non-5s (they are called true negatives), while the remaining 1,522 were wrongly classified as 5s (false positives).

·       The second row considers the images of 5s (the positive class): 1,325 were wrongly classified as non-5s (false negatives), while the remaining 4,096 were correctly classified as 5s (true positives).

·       A perfect classifier would have only true positives and true negatives, so its confusion matrix would have nonzero values only on its main diagonal (top left to bottom right):

 

 

y_train_perfect_predictions = y_train_5  # pretend we reached perfection

confusion_matrix(y_train_5, y_train_perfect_predictions)

array([[54579,     0],
       [    0,  5421]])
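·       If you just want the four counts (TN, FP, FN, TP) as separate numbers, you can flatten the matrix; a minimal sketch, assuming the same y_train_5 and y_train_pred as above:

# sklearn's binary confusion matrix is laid out as [[TN, FP], [FN, TP]],
# so flattening it row by row gives the four counts in that order.
tn, fp, fn, tp = confusion_matrix(y_train_5, y_train_pred).ravel()
print(tn, fp, fn, tp)  # 53057 1522 1325 4096 for the matrix above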

 

·       The confusion matrix gives you a lot of information, but sometimes you may prefer a more concise metric.

·       An interesting one to look at is the accuracy of the positive predictions; this is called the precision of the classifier.

        Precision = TP / (TP + FP)

·       where TP is the number of true positives and FP is the number of false positives.

 

·       A trivial way to have perfect precision is to make one single positive prediction and ensure it is correct (precision = 1/1 = 100%).

·       But this would not be very useful, since the classifier would ignore all but one positive instance.

·       So precision is typically used along with another metric named recall, also called sensitivity or the true positive rate (TPR): this is the ratio of positive instances that are correctly detected by the classifier.

        Recall = TP / (TP + FN)

·       where FN is the number of false negatives.
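·       Scikit-Learn provides functions for both metrics; a minimal sketch, assuming the y_train_5 and y_train_pred arrays from the cross_val_predict() example above:

from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred)  # TP / (TP + FP) = 4096 / (4096 + 1522) ≈ 0.729
recall_score(y_train_5, y_train_pred)     # TP / (TP + FN) = 4096 / (4096 + 1325) ≈ 0.756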

        The confusion matrix is illustrated in Figure 2.



Figure 2. An illustrated confusion matrix shows examples of true negatives (top left), false positives (top right), false negatives (lower left), and true positives (lower right)

 

Measuring Accuracy Using Cross-Validation


·       A good way to evaluate a model is to use cross-validation.

·       Let’s use the cross_val_score() function to evaluate our SGDClassifier model, using K-fold cross-validation with three folds.

·       Remember that K-fold cross-validation means splitting the training set into K folds (in this case, three), then making predictions and evaluating them on each fold using a model trained on the remaining folds.




from sklearn.model_selection import cross_val_score

cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")

array([0.96355, 0.93795, 0.95615])

·       Above 93% accuracy (ratio of correct predictions) on all cross-validation folds? This looks amazing, doesn’t it?

·       Before drawing any conclusions, let’s look at a very dumb classifier that just classifies every single image in the “not-5” class:

import numpy as np
from sklearn.base import BaseEstimator

class Never5Classifier(BaseEstimator):
    def fit(self, X, y=None):
        return self  # no training needed

    def predict(self, X):
        # Always predict "not a 5"
        return np.zeros((len(X), 1), dtype=bool)

·       Can you guess this model’s accuracy? Let’s find out:

never_5_clf = Never5Classifier()

cross_val_score(never_5_clf, X_train, y_train_5, cv=3, scoring="accuracy")

array([0.91125, 0.90855, 0.90915])

·       It has over 90% accuracy!

·       This is simply because only about 10% of the images are 5s, so if you always guess that an image is not a 5, you will be right about 90% of the time (a quick check of this imbalance is sketched below).

·       This demonstrates why accuracy is generally not the preferred performance measure for classifiers, especially when you are dealing with skewed datasets, i.e., when some classes are much more frequent than others.
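·       A quick way to verify the imbalance, assuming y_train_5 is the NumPy boolean target vector used earlier:

y_train_5.mean()  # ≈ 0.09, i.e. only about 10% of the training images are 5s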

        Implementing Cross-Validation

·       Occasionally you will need more control over the cross-validation process than what Scikit-Learn provides off the shelf.

·       In these cases, you can implement cross-validation yourself, as in the sketch below.
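·       A minimal sketch of what that might look like, assuming the same sgd_clf, X_train, and y_train_5 (as NumPy arrays) used earlier; it roughly mirrors what cross_val_score() does with cv=3:

from sklearn.model_selection import StratifiedKFold
from sklearn.base import clone

skfolds = StratifiedKFold(n_splits=3)  # stratified sampling keeps the class ratio in each fold

for train_index, test_index in skfolds.split(X_train, y_train_5):
    clone_clf = clone(sgd_clf)               # fresh, untrained copy of the classifier
    X_train_folds = X_train[train_index]     # train on the other folds
    y_train_folds = y_train_5[train_index]
    X_test_fold = X_train[test_index]        # evaluate on the held-out fold
    y_test_fold = y_train_5[test_index]

    clone_clf.fit(X_train_folds, y_train_folds)
    y_pred = clone_clf.predict(X_test_fold)
    print((y_pred == y_test_fold).mean())    # accuracy on this fold, e.g. ~0.95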

Performance Measures


        Evaluating a classifier is often significantly trickier than evaluating a regressor.

        There are many performance measures available:

          i.    Confusion Matrix
         ii.    True Positive Rate
        iii.    True Negative Rate
         iv.    False Positive Rate
          v.    False Negative Rate
         vi.    Precision
        vii.    Recall
       viii.    Accuracy
         ix.    F1-Score
          x.    Specificity
         xi.    Receiver Operating Characteristic (ROC)
        xii.    Area Under Curve (AUC)
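·       Most of these are available directly in Scikit-Learn; a minimal sketch, assuming the y_train_5 and y_train_pred arrays from the confusion-matrix example above:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

accuracy_score(y_train_5, y_train_pred)   # ratio of correct predictions
precision_score(y_train_5, y_train_pred)  # TP / (TP + FP)
recall_score(y_train_5, y_train_pred)     # TP / (TP + FN), the true positive rate
f1_score(y_train_5, y_train_pred)         # harmonic mean of precision and recall

# ROC and AUC are computed from decision scores rather than hard predictions, e.g.:
# y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3, method="decision_function")
# roc_auc_score(y_train_5, y_scores)      # from sklearn.metrics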


YouTube Link: https://youtu.be/jL39fMC_I28



