Training a Binary Classifier

· Let’s simplify the problem for now and only try to identify one digit, for example the number 5.

· This “5-detector” will be an example of a binary classifier, capable of distinguishing between just two classes: 5 and not-5.

· Let’s create the target vectors for this classification task:

y_train_5 = (y_train == 5)  # True for all 5s, False for all other digits

y_test_5 = (y_test == 5)
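
· The comparison above works element by element and returns a Boolean array. A minimal sketch of this, assuming the labels are stored as integers in a NumPy array (the small y_train below is a made-up stand-in, not the real MNIST labels):

```python
import numpy as np

# Hypothetical stand-in for y_train: a handful of integer digit labels.
y_train = np.array([5, 0, 4, 1, 9, 5, 3])

# Element-wise comparison: True where the label is 5, False elsewhere.
y_train_5 = (y_train == 5)

print(y_train_5.tolist())    # [True, False, False, False, False, True, False]
print(int(y_train_5.sum()))  # 2 positive ("5") instances
```

· If the labels were loaded as strings rather than integers, they would need to be cast to an integer type before this comparison.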

· Now let’s pick a classifier and train it.

· A good place to start is with a Stochastic Gradient Descent (SGD) classifier, using Scikit-Learn’s SGDClassifier class.

· This classifier has the advantage of being capable of handling very large datasets efficiently.

· This is in part because SGD deals with training instances independently, one at a time.
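
· Because SGD updates the model from individual instances, the data can also be fed in chunks rather than all at once. The sketch below illustrates this with SGDClassifier’s partial_fit method. It is not the approach used in this section (which calls fit on the whole training set), and the synthetic data and batch size of 100 are assumptions chosen only for illustration:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (an assumption, not MNIST):
# 1,000 instances with 20 features; the class depends on the first feature.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
y = X[:, 0] > 0

clf = SGDClassifier(random_state=42)

# Feed the data in mini-batches of 100 instances.
# partial_fit needs the full list of classes, since a batch may not contain both.
for start in range(0, len(X), 100):
    batch = slice(start, start + 100)
    clf.partial_fit(X[batch], y[batch], classes=np.array([False, True]))

accuracy = clf.score(X, y)
print(f"Accuracy after one pass over the data: {accuracy:.2f}")
```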

· Let’s create an SGDClassifier and train it on the whole training set:

from sklearn.linear_model import SGDClassifier

sgd_clf = SGDClassifier(random_state=42)

sgd_clf.fit(X_train, y_train_5)

· The SGDClassifier relies on randomness during training (hence the name “stochastic”).

· If you want reproducible results, you should set the random_state parameter.
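
· One way to see the effect of random_state is to train two classifiers with the same seed on the same data and compare their learned weights. This is a small sketch on synthetic data (the data and variable names are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (an assumption, not MNIST).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + X[:, 1] > 0

# Same seed and same data: training shuffles the instances in the same order.
clf_a = SGDClassifier(random_state=42).fit(X, y)
clf_b = SGDClassifier(random_state=42).fit(X, y)

same_weights = np.array_equal(clf_a.coef_, clf_b.coef_)
print(same_weights)  # True
```

· Leaving random_state unset (the default is None) generally gives slightly different weights from one run to the next.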

· Now we can use it to detect images of the number 5:

sgd_clf.predict([some_digit])

array([ True])

· The classifier guesses that this image represents a 5 (True).
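
· The snippets in this section rely on X_train, y_train, and some_digit having been defined earlier. For a self-contained run, the following sketch repeats the same steps on Scikit-Learn’s small built-in digits dataset (8×8 images) as a stand-in for the dataset used here. The train/test split and the choice of some_digit are assumptions made for this example:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Stand-in dataset (an assumption): 1,797 8x8 digit images bundled with Scikit-Learn.
digits = load_digits()
X, y = digits.data, digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Target vectors for the binary task: 5 versus not-5.
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)

sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)

# Hypothetical choice of some_digit: the first image in the test set.
some_digit = X_test[0]
prediction = sgd_clf.predict([some_digit])
print("Predicted 5?", prediction[0], "| Actually 5?", y_test_5[0])

test_accuracy = sgd_clf.score(X_test, y_test_5)
print(f"Test accuracy: {test_accuracy:.3f}")
```

· Keep in mind that only about 10% of the images are 5s, so a classifier that always answers “not-5” would already reach roughly 90% accuracy.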

YouTube Link: https://youtu.be/AWI2qUUkPK8

Machine Learning - Deep Learning