Training a Binary Classifier
Let’s simplify the problem for now and only try to identify one digit, for example the number 5. This “5-detector” will be an example of a binary classifier, capable of distinguishing between just two classes: 5 and not-5.
Let’s create the target vectors for this classification task:
y_train_5 = (y_train == 5)  # True for all 5s, False for all other digits
y_test_5 = (y_test == 5)
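To see what these comparisons produce, here is a small sketch on a made-up label array (`y_toy` is hypothetical and only stands in for `y_train`). It assumes the labels are stored as integers; if they were loaded as strings, they would need to be cast first, for example with `astype(np.uint8)`, before comparing against the integer 5.

```python
import numpy as np

# Hypothetical labels standing in for y_train (assumed to be integers).
y_toy = np.array([5, 0, 4, 1, 9, 5, 3], dtype=np.uint8)

# Element-wise comparison yields a boolean array of the same shape.
y_toy_5 = (y_toy == 5)

print(y_toy_5)        # [ True False False False False  True False]
print(y_toy_5.sum())  # 2, the number of 5s (True counts as 1)
```

The resulting boolean array is what the classifier receives as its target: True marks the positive class (5) and False marks everything else.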
Now let’s pick a classifier and train it. A good place to start is with a Stochastic Gradient Descent (SGD) classifier, using Scikit-Learn’s SGDClassifier class. This classifier has the advantage of being capable of handling very large datasets efficiently. This is in part because SGD deals with training instances independently, one at a time.
Let’s create an SGDClassifier and train it on the whole training set:
from sklearn.linear_model import SGDClassifier
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)
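To make the "one instance at a time" idea concrete, here is a minimal pure-Python sketch of stochastic gradient descent on the hinge loss. It is an illustration under simplifying assumptions, not Scikit-Learn’s implementation: it uses a fixed learning rate and no regularization, and the names `sgd_train`, `sgd_predict`, `X_toy`, and `y_toy` are hypothetical.

```python
import random

def sgd_train(X, y, epochs=200, lr=0.1, seed=42):
    """Fit a linear classifier with plain SGD on the hinge loss.

    X is a list of feature lists; y is a list of booleans (True = positive).
    """
    rng = random.Random(seed)      # seeded RNG, so the visiting order is repeatable
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)         # the "stochastic" part: random visiting order
        for i in order:            # each update looks at a single instance
            t = 1.0 if y[i] else -1.0
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            if t * score < 1.0:    # inside the margin or misclassified: take a step
                w = [wj + lr * t * xj for wj, xj in zip(w, X[i])]
                b += lr * t
    return w, b

def sgd_predict(w, b, x):
    """Return True when x falls on the positive side of the decision boundary."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b > 0.0

# Tiny, linearly separable toy data (made up for illustration).
X_toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]]
y_toy = [False, False, False, True, True, True]

w, b = sgd_train(X_toy, y_toy)
predictions = [sgd_predict(w, b, x) for x in X_toy]
print(predictions == y_toy)
```

Because each update only needs the current instance, the full dataset never has to be processed in a single step, which is one reason this style of training scales to large datasets.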
The SGDClassifier relies on randomness during training (hence the name “stochastic”). If you want reproducible results, you should set the random_state parameter.
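One way to check the effect of random_state is to train two classifiers with the same seed on the same data and compare them. The sketch below does this on Scikit-Learn’s small bundled digits dataset, used here only as a quick stand-in for the training set above; `X_small`, `y_small_5`, `clf_a`, and `clf_b` are hypothetical names.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

# Small 8x8 digits dataset that ships with Scikit-Learn (no download needed).
digits = load_digits()
X_small, y_small_5 = digits.data, (digits.target == 5)

# Two classifiers with the same seed, trained on the same data.
clf_a = SGDClassifier(random_state=42).fit(X_small, y_small_5)
clf_b = SGDClassifier(random_state=42).fit(X_small, y_small_5)

# With identical seeds and data, the learned weights and predictions should match.
same_weights = np.array_equal(clf_a.coef_, clf_b.coef_)
same_predictions = np.array_equal(clf_a.predict(X_small), clf_b.predict(X_small))
print(same_weights, same_predictions)
```

Leaving random_state unset would let the shuffling differ from run to run, so the learned weights could change between runs.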
Now we can use it to detect images of the number 5:
sgd_clf.predict([some_digit])
array([ True])
The classifier guesses that this image represents a 5 (True).