Posts

Showing posts with the label Handwritten digit

Training a Binary Classifier

Let’s simplify the problem for now and only try to identify one digit, for example the number 5. This “5-detector” will be an example of a binary classifier, capable of distinguishing between just two classes: 5 and not-5.

Let’s create the target vectors for this classification task:

    y_train_5 = (y_train == 5)  # True for all 5s, False for all other digits
    y_test_5 = (y_test == 5)

Now let’s pick a classifier and train it. A good place to start is with a Stochastic Gradient Descent (SGD) classifier, using Scikit-Learn’s SGDClassifier class. This classifier has the advantage of being capable of handling very large datasets efficiently. This is in part because SGD deals with training instances independently, one at a time.

Let’s create an SGDClassifier and train it on the whole training set: fr
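As a rough sketch of the steps described above, the snippet below builds the boolean target vectors and fits an SGDClassifier. It rests on a few assumptions that are not from the post: it uses Scikit-Learn's small bundled digits set (load_digits) as a stand-in for MNIST so that it runs offline, it splits the data with train_test_split, and it fixes random_state=42 purely for reproducibility. The variable names predictions and accuracy are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): scikit-learn's bundled 8x8 digits set, used here
# so the sketch runs without a download. The post itself works with MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Boolean target vectors: True for every 5, False for every other digit.
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)

# random_state is fixed only so that repeated runs give the same model.
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)

# predict() returns one boolean per image: "is this a 5?"
predictions = sgd_clf.predict(X_test)

# Fraction of test images whose 5 / not-5 label was predicted correctly.
accuracy = np.mean(predictions == y_test_5)
```

Because only about one image in ten is a 5, a high accuracy here says less than it seems: a model that always answers "not-5" would already be right roughly 90% of the time.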

MNIST Dataset Description

MNIST (Modified National Institute of Standards and Technology): the MNIST dataset is a set of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau. Each image is labeled with the digit it represents.

Figure 1. Digits from the MNIST dataset

This set has been studied so much that it is often called the “hello world” of Machine Learning: whenever people come up with a new classification algorithm, they are curious to see how it will perform on MNIST, and anyone who learns Machine Learning tackles this dataset sooner or later.

Scikit-Learn provides many helper functions to download popular datasets. MNIST is one of them. The following code fetches the MNIST dataset:

    from sklearn.datasets import fetch_openml
    MNIST = fetch_openml('MNIST_784', version=1)
    MNIST.keys()

    dict_keys(['data', 'target'
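A minimal sketch of how the fetched data might be turned into plain arrays and inspected is given below. The helper names load_mnist_arrays and describe are hypothetical, introduced only for illustration. The sketch also assumes a Scikit-Learn version that supports the as_frame argument of fetch_openml, and it casts the labels to integers because OpenML delivers them as strings. The download itself needs network access, so it is wrapped in a function rather than run on import.

```python
import numpy as np
from sklearn.datasets import fetch_openml


def load_mnist_arrays():
    """Download MNIST from OpenML and return (X, y) as NumPy arrays.

    Hypothetical helper: needs network access on the first call;
    scikit-learn caches the download afterwards.
    """
    # as_frame=False asks for NumPy arrays instead of a pandas DataFrame.
    mnist = fetch_openml("mnist_784", version=1, as_frame=False)
    X = mnist["data"]                      # one row of pixel values per image
    y = mnist["target"].astype(np.uint8)   # labels arrive as strings: '5' -> 5
    return X, y


def describe(X, y):
    """Summarise a digit dataset: sample count, feature count, images per digit."""
    digits, counts = np.unique(y, return_counts=True)
    per_digit = {int(d): int(n) for d, n in zip(digits, counts)}
    return {
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
        "per_digit": per_digit,
    }
```

Typical use would be X, y = load_mnist_arrays() followed by describe(X, y), which reports how many images there are and how they are spread across the ten digits.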