Training a Binary Classifier

• Let’s simplify the problem for now and only try to identify one digit, for example the number 5.

• This “5-detector” will be an example of a binary classifier, capable of distinguishing between just two classes: 5 and not-5.

• Let’s create the target vectors for this classification task:

y_train_5 = (y_train == 5)  # True for all 5s, False for all other digits
y_test_5 = (y_test == 5)
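To see what these target vectors look like, here is a minimal sketch on a tiny made-up label array. The name y_demo is hypothetical and simply stands in for y_train; the sketch assumes the labels are stored as integers. If your labels were loaded as strings, the comparison would need to be against '5' instead.

```python
import numpy as np

# Hypothetical stand-in for y_train: a handful of integer digit labels.
y_demo = np.array([5, 0, 4, 1, 5, 9])

# The element-wise comparison returns a boolean array of the same length.
y_demo_5 = (y_demo == 5)

print(y_demo_5)        # [ True False False False  True False]
print(y_demo_5.sum())  # 2 (number of 5s in the array)
```

Each True marks an instance of the positive class (5) and each False marks the negative class (not-5).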

• Now let’s pick a classifier and train it. A good place to start is with a Stochastic Gradient Descent (SGD) classifier, using Scikit-Learn’s SGDClassifier class.

• This classifier has the advantage of being capable of handling very large datasets efficiently, in part because SGD deals with training instances independently, one at a time.
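Because SGD updates the model a few instances at a time, the data does not have to be presented all at once. The following is only a sketch of that idea, using SGDClassifier's partial_fit method on synthetic data. The arrays X_demo and y_demo and the batch size are illustrative assumptions, not the MNIST arrays used in this post.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(1000, 20))
y_demo = (X_demo[:, 0] + X_demo[:, 1]) > 0  # boolean target, like y_train_5

clf = SGDClassifier(random_state=42)
all_classes = np.array([False, True])

# Feed the data in small batches; each call updates the model incrementally.
batch_size = 100
for start in range(0, len(X_demo), batch_size):
    stop = start + batch_size
    # The full list of classes is passed because a single batch
    # is not guaranteed to contain every class.
    clf.partial_fit(X_demo[start:stop], y_demo[start:stop], classes=all_classes)

accuracy = clf.score(X_demo, y_demo)
print(accuracy)
```

With a dataset too large for memory, the same loop could read each batch from disk instead of slicing an in-memory array.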

• Let’s create an SGDClassifier and train it on the whole training set:

from sklearn.linear_model import SGDClassifier

sgd_clf = SGDClassifier(random_state=42)  # fixed seed for reproducible results
sgd_clf.fit(X_train, y_train_5)  # train on the 5 / not-5 targets

• The SGDClassifier relies on randomness during training (hence the name “stochastic”). If you want reproducible results, you should set the random_state parameter.
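As a quick check of the reproducibility claim, the sketch below trains two classifiers with the same random_state on the same synthetic data and compares their learned weights. The data and the names X_demo, y_demo, clf_a and clf_b are assumptions made for this illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 10))
y_demo = X_demo[:, 0] > 0

# Two separate classifiers, same seed, same data.
clf_a = SGDClassifier(random_state=42).fit(X_demo, y_demo)
clf_b = SGDClassifier(random_state=42).fit(X_demo, y_demo)

# With an identical seed, both runs should end up with identical weights.
same_weights = np.array_equal(clf_a.coef_, clf_b.coef_)
print(same_weights)
```

Leaving random_state unset means each run may shuffle the data differently, so the learned weights can vary from run to run.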

• Now we can use the trained classifier to detect images of the number 5:

sgd_clf.predict([some_digit])

            array([ True])

• The classifier guesses that this image represents a 5 (True).
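Behind a True or False prediction there is a numeric score. For a binary linear classifier such as SGDClassifier, predict returns the positive class when the value from decision_function is greater than zero. The sketch below illustrates this on synthetic data; X_demo and y_demo are assumed stand-ins, not the MNIST arrays.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 10))
y_demo = X_demo[:, 0] > 0

clf = SGDClassifier(random_state=42).fit(X_demo, y_demo)

# Signed scores for the first few instances, and the matching predictions.
scores = clf.decision_function(X_demo[:5])
preds = clf.predict(X_demo[:5])

print(scores)
print(preds)

# A prediction is True exactly when its score is positive.
matches = np.array_equal(preds, scores > 0)
print(matches)
```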


YouTube Link: https://youtu.be/AWI2qUUkPK8

