Precision and Recall

Scikit-Learn provides several functions to compute classifier metrics, including precision and recall:

from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1522)

0.7290850836596654

recall_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1325)

0.7555801512636044

· Now your 5-detector does not look as shiny as it did when you looked at its accuracy.

· When it claims an image represents a 5, it is correct only 72.9% of the time.

· Moreover, it only detects 75.6% of the 5s.
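The two scores can also be rebuilt by hand from the counts quoted in the code comments above. The following is a minimal sketch in plain Python; the helper names `precision` and `recall` are illustrative and are not part of Scikit-Learn.

```python
# Confusion-matrix counts taken from the comments in the snippet above.
tp = 4096  # true positives: 5s correctly flagged as 5
fp = 1522  # false positives: non-5s wrongly flagged as 5
fn = 1325  # false negatives: 5s the classifier missed


def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that are correct (illustrative helper)."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are detected (illustrative helper)."""
    return tp / (tp + fn)


print(f"precision = {precision(tp, fp):.4f}")
print(f"recall    = {recall(tp, fn):.4f}")
```

The printed values should match the outputs of precision_score() and recall_score() shown above, rounded to four decimals.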

• It is often convenient to combine precision and recall into a single metric called the F1 score, in particular if you need a simple way to compare two classifiers.

• The F1 score is the harmonic mean of precision and recall.

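$$F_1 = \frac{2}{\dfrac{1}{\text{precision}} + \dfrac{1}{\text{recall}}} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$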

• Whereas the regular mean treats all values equally, the harmonic mean gives much more weight to low values.

• As a result, the classifier will only get a high F1 score if both recall and precision are high.
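To see how strongly the harmonic mean is pulled toward the lower value, here is a small sketch using only the standard library. The "lopsided" precision/recall pair is a made-up example, and `f1` is an illustrative helper based on the identity F1 = 2 × precision × recall / (precision + recall).

```python
from statistics import harmonic_mean, mean


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (illustrative helper)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the 5-detector, as reported above.
p, r = 0.7290850836596654, 0.7555801512636044
f1_detector = f1(p, r)

# Hypothetical lopsided classifier: perfect precision, very poor recall.
lopsided = (1.0, 0.1)
arith = mean(lopsided)          # regular mean: both values count equally
harm = harmonic_mean(lopsided)  # harmonic mean: dominated by the low value

print(f"5-detector F1:   {f1_detector:.4f}")
print(f"regular mean:    {arith:.4f}")
print(f"harmonic mean:   {harm:.4f}")
```

For the lopsided pair the regular mean is 0.55, while the harmonic mean is only about 0.18, so a single weak score is enough to keep F1 low.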

• To compute the F1 score, simply call the f1_score() function:

from sklearn.metrics import f1_score

f1_score(y_train_5, y_train_pred)

0.7420962043663375

· The F1 score favors classifiers that have similar precision and recall.

· This is not always what you want: in some contexts you mostly care about precision, and in other contexts you really care about recall.

· For example, if you trained a classifier to detect videos that are safe for kids, you would probably prefer a classifier that rejects many good videos (low recall) but keeps only safe ones (high precision), rather than a classifier that has a much higher recall but lets a few really bad videos show up in your product.

· On the other hand, suppose you train a classifier to detect shoplifters in surveillance images: it is probably fine if your classifier has only 30% precision as long as it has 99% recall. The security guards will get a few false alerts, but almost all shoplifters will get caught.
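The shoplifter scenario also shows why F1 alone can mislead. The sketch below compares the 30% precision / 99% recall detector from the text with a hypothetical balanced classifier (60% precision, 60% recall, numbers invented for illustration); `f1` is an illustrative helper, not a library call.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (illustrative helper)."""
    return 2 * precision * recall / (precision + recall)


# Shoplifter detector from the text: low precision, very high recall.
shoplifter_precision, shoplifter_recall = 0.30, 0.99

# Hypothetical alternative with balanced scores (invented for comparison).
balanced_precision, balanced_recall = 0.60, 0.60

# Share of alerts that are false alarms for the shoplifter detector.
false_alert_share = 1 - shoplifter_precision

f1_shoplifter = f1(shoplifter_precision, shoplifter_recall)
f1_balanced = f1(balanced_precision, balanced_recall)

print(f"false alerts among all alerts: {false_alert_share:.0%}")
print(f"F1, shoplifter detector:       {f1_shoplifter:.3f}")
print(f"F1, balanced classifier:       {f1_balanced:.3f}")
```

F1 ranks the balanced classifier higher (0.60 versus roughly 0.46), even though it would miss 40% of shoplifters, which is exactly the case where recall matters more than a combined score.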

· Unfortunately, you can't have it both ways: increasing precision reduces recall, and vice versa. This is called the precision/recall trade-off.
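One way to observe the trade-off is to assume the classifier outputs a score per instance and flags it as positive when the score reaches a threshold. That mechanism is an assumption here, since the text above does not say how the detector decides; the scores, labels, and the helper `precision_recall_at` below are all made up for illustration.

```python
# Hypothetical classifier scores and true labels (1 = positive class).
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]


def precision_recall_at(threshold: float) -> tuple[float, float]:
    """Precision and recall when predicting positive for score >= threshold.

    Assumes at least one instance is predicted positive and at least one
    instance is actually positive, otherwise a division by zero occurs.
    """
    predictions = [s >= threshold for s in scores]
    tp = sum(1 for pred, y in zip(predictions, labels) if pred and y)
    fp = sum(1 for pred, y in zip(predictions, labels) if pred and not y)
    fn = sum(1 for pred, y in zip(predictions, labels) if not pred and y)
    return tp / (tp + fp), tp / (tp + fn)


# Raise the threshold step by step and record both metrics.
results = {t: precision_recall_at(t) for t in (0.3, 0.5, 0.65)}
for threshold, (p, r) in results.items():
    print(f"threshold={threshold:.2f}  precision={p:.3f}  recall={r:.3f}")
```

On this toy data, raising the threshold lifts precision from 0.75 to 1.0 while recall falls from 1.0 to about 0.67. On real data precision does not always rise monotonically with the threshold, but the overall tension between the two metrics remains.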

Machine Learning - Deep Learning