
Precision and Recall

Scikit-Learn provides several functions to compute classifier metrics, including precision and recall:





from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1522)

0.7290850836596654

recall_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1325)

0.7555801512636044

·       Now your 5-detector does not look as shiny as it did when you looked at its accuracy.

·       When it claims an image represents a 5, it is correct only 72.9% of the time.

·       Moreover, it only detects 75.6% of the 5s.

·       It is often convenient to combine precision and recall into a single metric called the F1 score, in particular if you need a simple way to compare two classifiers.

·       The F1 score is the harmonic mean of precision and recall.

F1 Score

F1 = 2 / (1/precision + 1/recall) = 2 × (precision × recall) / (precision + recall) = TP / (TP + (FN + FP) / 2)

·       Whereas the regular mean treats all values equally, the harmonic mean gives much more weight to low values.

·       As a result, the classifier will only get a high F1 score if both recall and precision are high.

·       To compute the F1 score, simply call the f1_score() function:

from sklearn.metrics import f1_score

f1_score(y_train_5, y_train_pred)

     0.7420962043663375
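As a quick sanity check, the harmonic-mean definition can be verified by hand from the precision and recall values printed above (a minimal sketch reusing y_train_5 and y_train_pred from the earlier snippet):

p = precision_score(y_train_5, y_train_pred)   # ~0.729
r = recall_score(y_train_5, y_train_pred)      # ~0.756
2 / (1/p + 1/r)   # harmonic mean of precision and recall, ~0.742 (same as f1_score)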

·       The F1 score favors classifiers that have similar precision and recall.

·       This is not always what you want: in some contexts you mostly care about precision, and in other contexts you really care about recall.

·       For example, if you trained a classifier to detect videos that are safe for kids, you would probably prefer a classifier that rejects many good videos (low recall) but keeps only safe ones (high precision), rather than a classifier that has a much higher recall but lets a few really bad videos show up in your product.

·       On the other hand, suppose you train a classifier to detect shoplifters in surveillance images: it is probably fine if your classifier has only 30% precision as long as it has 99% recall; the security guards will get a few false alerts, but almost all shoplifters will get caught.

·       Unfortunately, you can't have it both ways: increasing precision reduces recall, and vice versa. This is called the precision/recall trade-off.
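To see the trade-off in action, here is a sketch (it assumes the sgd_clf, X_train and y_train_5 objects used elsewhere on this page): it asks cross_val_predict() for decision scores instead of hard predictions and plots precision and recall for every possible threshold, so you can pick the threshold that gives the balance you need.

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

# decision scores instead of 0/1 predictions (assumes sgd_clf, X_train, y_train_5 exist)
y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")

# precision and recall for every candidate threshold
precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)

# raising the threshold pushes precision up and recall down
plt.plot(thresholds, precisions[:-1], "b--", label="Precision")
plt.plot(thresholds, recalls[:-1], "g-", label="Recall")
plt.xlabel("Decision threshold")
plt.legend()
plt.show()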

 



Confusion Matrix

·       A much better way to evaluate the performance of a classifier is to look at the confusion matrix.

 



·       The general idea is to count the number of times instances of class A are classified as class B.

·       For example, to know the number of times the classifier confused images of 5s with 3s, you would look in the fifth row and third column of the confusion matrix.

·       To compute the confusion matrix, you first need to have a set of predictions so that they can be compared to the actual targets.

·       You could make predictions on the test set, but remember that you want to use the test set only at the very end of your project, once you have a classifier that you are ready to launch.

·       Instead, you can use the cross_val_predict() function:

from sklearn.model_selection import cross_val_predict

y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)

 

·       Just like the cross_val_score() function, cross_val_predict() performs K-fold cross-validation, but instead of returning the evaluation scores, it returns the predictions made on each test fold.

·       This means that you get a clean prediction for each instance in the training set ("clean" meaning that the prediction is made by a model that never saw the data during training).

 

·       Now you are ready to get the confusion matrix using the confusion_matrix() function.

·       Just pass it the target classes (y_train_5) and the predicted classes (y_train_pred):

from sklearn.metrics import confusion_matrix

confusion_matrix(y_train_5, y_train_pred)

array([[53057, 1522],

            [ 1325, 4096]])

 

 

·       Each row in a confusion matrix represents an actual class, while each column represents a predicted class.

·       The first row of this matrix considers non-5 images (the negative class): 53,057 of them were correctly classified as non-5s (they are called true negatives), while the remaining 1,522 were wrongly classified as 5s (false positives).

·       The second row considers the images of 5s (the positive class): 1,325 were wrongly classified as non-5s (false negatives), while the remaining 4,096 were correctly classified as 5s (true positives).

·       A perfect classifier would have only true positives and true negatives, so its confusion matrix would have nonzero values only on its main diagonal (top left to bottom right):

 

 

y_train_perfect_predictions = y_train_5  # pretend we reached perfection

confusion_matrix(y_train_5, y_train_perfect_predictions)

array([[54579, 0],

            [ 0, 5421]])
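If you want the four counts as plain numbers, here is a small sketch (same y_train_5 and y_train_pred as above): for a binary problem, ravel() flattens the 2×2 matrix row by row, giving TN, FP, FN, TP.

tn, fp, fn, tp = confusion_matrix(y_train_5, y_train_pred).ravel()
print(tn, fp, fn, tp)   # 53057 1522 1325 4096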

 

·       The confusion matrix gives you a lot of information, but sometimes you may prefer a more concise metric.

·       An interesting one to look at is the accuracy of the positive predictions; this is called the precision of the classifier.

        Precision

Precision = TP / (TP + FP)

·       TP is the number of true positives.

·       FP is the number of false positives.
 

·       A trivial way to have perfect precision is to make one single positive prediction and ensure it is correct (precision = 1/1 = 100%).

·       But this would not be very useful, since the classifier would ignore all but one positive instance.

·       So precision is typically used along with another metric named recall, also called sensitivity or the true positive rate (TPR): this is the ratio of positive instances that are correctly detected by the classifier.

Recall

Recall = TP / (TP + FN)

·       FN is the number of false negatives.

        The confusion matrix is illustrated in Figure 2.



Figure 2. An illustrated confusion matrix shows examples of true negatives (top left), false positives (top right), false negatives (lower left), and true positives (lower right)
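Plugging the counts from the confusion matrix above into these two formulas gives the same numbers as precision_score() and recall_score() (a minimal sketch with the counts hard-coded):

tp, fp, fn = 4096, 1522, 1325     # from the confusion matrix above
precision = tp / (tp + fp)        # 0.729...
recall    = tp / (tp + fn)        # 0.755...
print(precision, recall)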

 

Machine Learning - Classification MCQs

1.     By default, SGD classifier follows this strategy for multi-class classification:


A)   OvO strategy

B)    OvA strategy

C)    Both

D)   None

Ans: B

2.     SGD Classifiers and Linear Classifiers are strictly

A)   Binary Classifier

B)    Multiclass classifier

C)    Both

D)   None

     Ans: A

3.     The greater the value of the ROC AUC, the better the model:

A)   True

B)    False

Ans: A

4.     The maximum value of the ROC AUC is:

A)    0.8

B)     0.9 

C)     1

D)    0.7

Ans: C

5.     Recall can be increased by increasing the decision threshold. True or False?

A)    False

B)     True

Ans: A

6.     Precision can be increased by increasing the decision threshold. True or False?

A)   True

B)    False

Ans: A

 

7.     Which of these is a good measure to decide which threshold to use?

A)   Confusion matrix

B)    F1 score

C)    ROC curve

D)   Precision & Recall versus Threshold Curve

Ans: D

8.     SVM classifiers scale poorly with the size of the training set. For SVM, which multiclass strategy should be applied?

A)   OvA

B)    OvO

Ans: B

9.     Is RandomForestClassifier a multinomial classifier?

A)   Yes

B)    No

Ans: A

10.  For RandomForestClassifier, do we need to run either OvA or OvO classifier at all?

A)   Yes

B)    No

Ans: B

11.  Which of these classifiers supports multilabel classification?

A)   SGD Classifier

B)    SVM Classifier

C)    KNeighborsClassifier

D)   None

Ans: C

12.  For MNIST multiclass classification, how many binary classifiers does the SGD classifier train with the OvA strategy?

A)    8 

B)    10 

C)    12

D)   45

Ans: B

13.  For MNIST multiclass classification, how many binary classifiers does the SGD classifier train with the OvO strategy?

A)     8

B)     10

C)     12

D)    45

Ans: D
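A quick way to check the counts in questions 12 and 13: OvA trains one binary classifier per class (10 for MNIST), while OvO trains one per pair of classes, i.e. 10 × 9 / 2 = 45. The sketch below assumes a full 10-class MNIST X_train / y_train is already loaded; OneVsOneClassifier is the scikit-learn wrapper that forces the OvO strategy.

from sklearn.multiclass import OneVsOneClassifier
from sklearn.linear_model import SGDClassifier

# assumes X_train, y_train (digit labels 0-9) are already loaded
ovo_clf = OneVsOneClassifier(SGDClassifier(random_state=42))
ovo_clf.fit(X_train, y_train)
len(ovo_clf.estimators_)   # 45 binary classifiers, one per pair of digits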

14.  Classifying an MNIST image into [large or small, odd or even] labels is an example of:

A)   Binary Classification

B)    Multiclass Classification

C)    Multilabel Classification

D)   Multioutput Classification

Ans: C

15.  In multilabel classification, which of the following 'average' methods for calculating the F1 score computes the unweighted mean of the per-label F1 scores?

A)   macro 

B)    binary 

C)    micro 

D)   weighted

Ans: A

16.  In multilabel classification, which of the following 'average' methods for calculating the F1 score computes the metric for each label and then finds their average weighted by support (the number of true instances for each label)?

A)    Macro

B)     Binary

C)     Micro

D)    Weighted

Ans: D
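Questions 15 and 16 map directly to the average parameter of f1_score(). A toy sketch (the y_true and y_pred arrays below are made up for illustration, not MNIST data):

import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0], [1, 1], [0, 1], [1, 0]])
y_pred = np.array([[1, 0], [0, 1], [0, 1], [1, 1]])

f1_score(y_true, y_pred, average="macro")     # unweighted mean of the per-label F1 scores
f1_score(y_true, y_pred, average="weighted")  # per-label F1 scores weighted by support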

17.  Which of the following methods would not be a good measure for skewed datasets? For example, the 5 vs. Not-5 classifier in MNIST has a skewed dataset in which there are many more 'Not 5's than '5's.

A)    cross_val_score using accuracy

B)    confusion matrix

C)    cross_val_score using precision

D)   None

Ans: A

18.  Multiclass classifiers are also known as:

A)     Multilabel classifiers

B)      Multinomial classifiers

C)      Multioutput classifiers

D)    None

Ans: B

19.  MNIST - 5 and Not 5 problem is what kind of a problem?

A)   Classification

B)    Regression

C)    Clustering

D)   None

Ans: A

20.  MNIST - 5 and Not 5 Classification is what kind of a classification problem?

A)   Binary Classification

B)    Multi-class

C)    Multi-label

D)   Multi-output

Ans: A

21.  Why do we use random_state in the Stochastic Gradient Descent classifier?

A)   For generating reproducible results 

B)    To specify the training size of the batch for each iteration

C)    Both

D)   None

Ans: A

 

22.  Which of these may have to be performed before analyzing the dataset and training on it?

A)    Shuffling

B)     Cross-Validation

C)     F1 Score

D)    All of the above

Ans: A

23.  For the confusion matrix below, what is the total number of training instances?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)     50000

B)      60000

C)      70000

D)     80000

Ans: B

24.  For the below confusion matrix, what is the count of True Positive?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)   53272

B)    1077

C)    1307

D)   4344

     Ans: D

25.  For the below confusion matrix, what is the count of True Negatives?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)    53272

B)     1077

C)     1307

D)    4344

Ans: A

26.  For the below confusion matrix, what is the count of False Negatives?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)     53272

B)     1077

C)     1307

D)    4344

Ans: B

27.  For the below confusion matrix, what is the count of False Positive?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)    53272

B)     1077

C)     1307

D)    4344

Ans: C

28.  For the below confusion matrix, what is the accuracy?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)   95%

B)    90%

C)    96%

D)   98%

Ans: C

29.  For the below confusion matrix, what is the recall?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)   0.7

B)    0.8

C)    0.9

D)   0.95

Ans: B

30.  For the below confusion matrix, what is the precision?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)   0.73

B)    0.76

C)    0.78

D)   0.82

Ans: B

31.  F1 score is:

A)   absolute mean of precision and recall

B)    harmonic mean of precision and recall

C)    squared mean of precision and recall

D)   All of the above

Ans: B

32.  For the below confusion matrix, what is the F1 score?

                        Predicted: Not 5    Predicted: 5
        Actual: Not 5        53272              1307
        Actual: 5             1077              4344

A)   0.72

B)    0.784

C)    0.82

D)   0.84

Ans: B
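The answers to questions 23 and 28 to 32 can be checked with a few lines of arithmetic on the matrix entries (TN = 53272, FP = 1307, FN = 1077, TP = 4344):

tn, fp, fn, tp = 53272, 1307, 1077, 4344

total     = tn + fp + fn + tp                                # 60000
accuracy  = (tp + tn) / total                                # 0.960...
precision = tp / (tp + fp)                                   # 0.768...
recall    = tp / (tp + fn)                                   # 0.801...
f1        = 2 * precision * recall / (precision + recall)    # 0.784...
print(total, accuracy, precision, recall, f1)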

33.  For a model to detect videos that are safe for kids, we need (safe video = positive class):

A)   High precision, low recall

B)    High recall, low precision

C)    High Precision, High Recall

D)   Low Precision, Low Recall

Ans: A

34.  For a model to detect shoplifters in surveillance images, we need (shoplifter = positive class):

A)   High precision, low recall

B)    High recall, low precision

C)    High Precision, High Recall

D)   Low Precision, Low Recall

Ans: B

35.  Which of the following can be treated as a multi-output classification problem?


A)    Removing noise from MNIST image

B)     Classifying MNIST dataset into 0 to 9

C)     Predicting the demand for rental bikes

D)    None

Ans: A
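Question 35 refers to the MNIST noise-removal example: the model outputs one label per pixel and each label can take many values, which makes it multioutput classification. A rough sketch, assuming X_train holds the 784-pixel MNIST images (the noise range of 0-100 is an arbitrary choice for illustration):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
noise = rng.integers(0, 100, (len(X_train), 784))

X_train_mod = X_train + noise    # noisy images as input
y_train_mod = X_train            # clean images as the (multioutput) target

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train_mod, y_train_mod)
clean_digit = knn_clf.predict([X_train_mod[0]])   # a denoised image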

 
