2. Performance Metrics in Machine Learning

Evaluating a machine learning model is a key stage in developing it. The measures used to assess a model's quality are known as performance metrics or evaluation metrics. They show how well the model performed on the given data, and tuning the hyperparameters can often improve that performance. Every ML model aims to generalize well to previously unseen data, and performance metrics help determine how well it does so.

Most supervised machine learning problems fall into one of two categories: classification and regression. Because not every metric suits every kind of problem, it is critical to understand which metrics to use. Classification and regression tasks are evaluated with different metrics.

Performance Metrics for Classification
Performance Metrics for Regression

Performance Metrics for Classification

Accuracy
Confusion Matrix
Precision
Recall
F-Score
AUC-ROC (Area Under the Receiver Operating Characteristic Curve)

Performance Metrics for Classification

Classification is one of the most widely researched areas of machine learning, with use cases in almost all production and industrial environments: speech recognition, face recognition, text classification, and many more.

Classification models produce discrete output, so we need metrics that compare discrete classes in some form. Classification metrics evaluate a model's performance and tell you how good or bad the classification is, but each of them measures it in a different way.

Accuracy

$$\text{Accuracy} = \frac{\text{Total number of correctly classified points}}{\text{Total number of points in } D_{\text{test}} \text{ (the test data)}}$$
Accuracy values range from 0 (worst) to 1 (best).
Assume $D_{\text{test}}$ has 100 points.
There are 60 positives and 40 negatives.
Of the 60 positive points, our model predicts 53 as positive and 7 as negative.
Of the 40 negative points, it predicts 35 as negative and 5 as positive.
The model therefore classifies 53 + 35 = 88 points correctly and 7 + 5 = 12 incorrectly.
Our model has an accuracy of 88% (0.88).

$$\text{Accuracy} = \frac{\text{Total number of correctly classified points}}{\text{Total number of points in } D_{\text{test}}} = \frac{53 + 35}{100} = \frac{88}{100} = 0.88 = 88\%$$
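The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not a library implementation: the `accuracy` helper and the encoding of positives as 1 and negatives as 0 are assumptions made for illustration.

```python
# Rebuild the 100-point example: 1 = positive, 0 = negative (assumed encoding).
y_true = [1] * 60 + [0] * 40
# Positives: 53 predicted positive, 7 predicted negative.
# Negatives: 35 predicted negative, 5 predicted positive.
y_pred = [1] * 53 + [0] * 7 + [0] * 35 + [1] * 5


def accuracy(y_true, y_pred):
    """Fraction of points whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


acc = accuracy(y_true, y_pred)
print(f"accuracy = {acc:.2f}")  # accuracy = 0.88
```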

Accuracy is not recommended when the target variable belongs mostly to one class.
For example, suppose a disease-prediction model is evaluated on 100 people, of whom only 10 have the disease and 90 do not.
If the model predicts that nobody has the disease (a useless prediction, since it misses every sick person), its accuracy is still 90%, which is misleading.
In other words, do not rely on accuracy alone when the data is imbalanced.
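The disease example can be reproduced as a short sketch. The label encoding (1 = has the disease, 0 = does not) and the variable names are assumptions for illustration. The second number printed is the fraction of sick people the model actually finds, which exposes what the accuracy figure hides.

```python
# 10 people with the disease (1) and 90 without (0); assumed encoding.
y_true = [1] * 10 + [0] * 90
# A trivial model that predicts "no disease" for everyone.
y_pred = [0] * 100

# Accuracy: share of people whose prediction matches the truth.
baseline_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Share of the actually sick people that the model flags as sick.
found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
detected_fraction = found / sum(y_true)

print(baseline_accuracy, detected_fraction)  # 0.9 0.0
```

The model scores 90% accuracy while detecting none of the ten sick people.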

Confusion Matrix

A confusion matrix is a table that compares a classifier's predictions with the known true values on a set of test data, and it is used to describe the classification model's performance. The binary case is shown here.

The confusion matrix is simple to implement, but the terminologies used in this matrix might be confusing for beginners.

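A typical confusion matrix for a binary classifier looks like the table below, where $Y_i$ is the true label of point $i$ and $\hat{Y}_i$ is the predicted label.

| | Actual $Y_i = 0$ | Actual $Y_i = 1$ |
| Predicted $\hat{Y}_i = 0$ | A | B |
| Predicted $\hat{Y}_i = 1$ | C | D |

Where

A: Number of points such that $Y_i = 0$ and $\hat{Y}_i = 0$
B: Number of points such that $Y_i = 1$ and $\hat{Y}_i = 0$
C: Number of points such that $Y_i = 0$ and $\hat{Y}_i = 1$
D: Number of points such that $Y_i = 1$ and $\hat{Y}_i = 1$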
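The four counts can be computed directly from the definitions of A, B, C, and D. The sketch below assumes labels encoded as 0 and 1 and reuses the 100-point accuracy example from earlier. `confusion_counts` is a hypothetical helper written for illustration, not a library function.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (A, B, C, D) for binary 0/1 labels, per the definitions above."""
    pairs = Counter(zip(y_true, y_pred))  # keys are (actual, predicted)
    A = pairs[(0, 0)]  # actual 0, predicted 0
    B = pairs[(1, 0)]  # actual 1, predicted 0
    C = pairs[(0, 1)]  # actual 0, predicted 1
    D = pairs[(1, 1)]  # actual 1, predicted 1
    return A, B, C, D


# Same data as the accuracy example: 60 positives (1) and 40 negatives (0).
y_true = [1] * 60 + [0] * 40
y_pred = [1] * 53 + [0] * 7 + [0] * 35 + [1] * 5

A, B, C, D = confusion_counts(y_true, y_pred)
print(A, B, C, D)  # 35 7 5 53
print((A + D) / (A + B + C + D))  # 0.88
```

The correct predictions are A and D, so (A + D) divided by the total number of points recovers the 0.88 accuracy computed earlier.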
