Train an Image Recognizer


You learned that a picture is just a grid of numbers, and that recognizing images is a features → label problem. Today you cash that in: you’ll train an AI to read handwritten digits — and you’ll use the exact same five-step recipe you used for the penguins. The features are just pixels now.

💡 Run today’s code in Colab.

The same recipe, on images

Remember the five steps: data → features & label → split → fit → score. Here it is on the digits:

from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# 1. Get the data
digits = load_digits()

# 2. Features (the 64 pixels) and label (the digit 0-9)
X = digits.data
y = digits.target

# 3. Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 4. Train
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# 5. Score on unseen images
print("Accuracy:", model.score(X_test, y_test))

Run it. The accuracy should be very high, around 0.98. Your model reads handwritten digits it has never seen before, almost perfectly. And notice: this is the same code as the penguins, with the digits data (X = digits.data, y = digits.target) swapped in. One recipe, totally different problem.
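To see what that score means on individual images, here is a small sketch that puts a few predictions next to the true answers. It repeats the recipe so it runs on its own; the names predicted and actual are just illustrative choices, not part of the lesson's code.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same recipe as above: data, features and label, split, fit
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# Ask the model about the first 10 test images and compare to the truth
predicted = model.predict(X_test[:10])  # illustrative name
actual = y_test[:10]                    # illustrative name
print("Predicted:", predicted)
print("Actual:   ", actual)
print("Matches:  ", int((predicted == actual).sum()), "out of", len(actual))
```

Each position where the two rows agree is one correct answer; the accuracy is that same comparison done over the whole test set.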

Why it works so well

KNeighborsClassifier compares a new digit’s 64 pixels to the digits it studied, finds the most similar ones (the 5 nearest, by default), and predicts the label most of them share. Handwritten digits of the same number really do look similar (lots of overlapping bright pixels), so “find the nearest examples” works great here.
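If you want to peek at the "nearest examples" yourself, the sketch below uses the model's kneighbors method to look up the training digits closest to one test image. It assumes the default settings (5 neighbors, each with an equal vote); the names new_digit, neighbor_labels and votes are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)
model = KNeighborsClassifier()  # n_neighbors defaults to 5
model.fit(X_train, y_train)

# Take one unseen image (kept as a 1-row table, which is what the model expects)
new_digit = X_test[:1]

# Find the 5 closest training images and look up their labels
distances, indices = model.kneighbors(new_digit)
neighbor_labels = y_train[indices[0]]

# Count how many neighbors voted for each digit 0-9
votes = np.bincount(neighbor_labels, minlength=10)

print("Labels of the nearest training digits:", neighbor_labels)
print("Distances to them:", np.round(distances[0], 1))
print("Digit with the most votes:", votes.argmax())
print("Model's prediction:", model.predict(new_digit)[0])
print("True label:", y_test[0])
```

The digit with the most votes among the neighbors is the model's prediction. That is the whole trick.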

Try it 🎯

  1. Change test_size to 0.3 and re-run. Still high?
  2. Try KNeighborsClassifier(n_neighbors=1) (compare to just the single closest image). Better or worse?

You trained an image AI

Step back and notice what you did: with about ten lines, you trained a model that recognizes handwriting — the same kind of task that powers mail-sorting machines and check readers. No rules about what a “7” looks like; it learned from examples, exactly like the big AIs from the start of this phase.

Think about it 🔮

The penguin code and the digit code are almost identical. What does that tell you about machine learning? (The recipe is the same no matter the problem — get data, pick features and label, split, fit, score. Once you know it, you can tackle penguins, digits, or anything you can turn into features and labels.)

Fix the bug 🐞

This tries to train on the digit images but crashes with a ValueError, because it uses digits.images (the 8×8 grids) instead of digits.data (the flattened 64-number rows the model expects):

X = digits.images
y = digits.target
model.fit(X, y)

(The model wants each example as a flat row of features. Use X = digits.data — the same images, flattened to 64 numbers each.)
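A quick way to see the difference is to print the shapes. This sketch also checks that flattening digits.images yourself gives the same numbers as digits.data; the name flattened is illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# 1797 images, each an 8x8 grid: three dimensions, which fit() rejects
print(digits.images.shape)  # (1797, 8, 8)

# The same 1797 images, each flattened to one row of 64 numbers
print(digits.data.shape)    # (1797, 64)

# Flatten the grids by hand: keep one row per image, let NumPy work out the 64
flattened = digits.images.reshape(len(digits.images), -1)
print(np.array_equal(flattened, digits.data))
```

So digits.data is not different data, just the same pixels laid out as one row per image.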

Your mission 🚀

Train the digit recognizer and print its accuracy. Then experiment: try different test_size values and different n_neighbors numbers, and find the combination that gives the highest accuracy. Write down your best score.
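If you get tired of changing numbers by hand, one possible way to organize the experiment is a pair of loops. This is only a sketch: the particular test_size and n_neighbors values are example choices, and the names results and best are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X = digits.data
y = digits.target

results = []  # one (accuracy, test_size, n_neighbors) entry per experiment

for test_size in (0.2, 0.3):          # example values, try your own
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=0
    )
    for k in (1, 3, 5):               # example values, try your own
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X_train, y_train)
        accuracy = model.score(X_test, y_test)
        results.append((accuracy, test_size, k))
        print(f"test_size={test_size}  n_neighbors={k}  accuracy={accuracy:.3f}")

# The highest accuracy among the combinations tried
best = max(results)
print("Best:", best)
```

Keep in mind that each test_size produces a different test set, so tiny differences between scores may come from which images landed in the test set rather than from the settings.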

What you learned today

  • Recognizing images uses the same five-step recipe as any other model.
  • The features are the pixels (digits.data); the label is the digit (digits.target).
  • A simple KNeighborsClassifier reads handwritten digits with ~98% accuracy.
  • Use digits.data (flat rows), not digits.images (8×8 grids), to train.

Next time, we look at which digits it gets wrong, and try to make it even better. 🔍