Recognizing Handwritten Digits

3 min read · Dec 25, 2020

The hypothesis to be tested: The scikit-learn library provides numerous datasets, among them the Digits dataset, that are useful for testing data analysis problems and predicting results.

Libraries used: scikit-learn, matplotlib

Dataset: Optical recognition of handwritten digits dataset. The dataset consists of 10 classes, where each class refers to a digit from 0 to 9. It has 1,797 images, each 8x8 pixels in size.

Figure 1: An example of an 8x8 image from the Digits dataset

# Importing the library
from sklearn import datasets

# Loading the dataset into a variable named digits
digits = datasets.load_digits()

# Description of the dataset
print(digits.DESCR)

# The digits.images array contains images of handwritten digits
digits.images[0]
array([[ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.],
       [ 0.,  0., 13., 15., 10., 15.,  5.,  0.],
       [ 0.,  3., 15.,  2.,  0., 11.,  8.,  0.],
       [ 0.,  4., 12.,  0.,  0.,  8.,  8.,  0.],
       [ 0.,  5.,  8.,  0.,  0.,  9.,  8.,  0.],
       [ 0.,  4., 11.,  0.,  1., 12.,  7.,  0.],
       [ 0.,  2., 14.,  5., 10., 12.,  0.,  0.],
       [ 0.,  0.,  6., 13., 10.,  0.,  0.,  0.]])

# Displaying one image (index 1010) with matplotlib
import matplotlib.pyplot as plt
plt.imshow(digits.images[1010], cmap=plt.cm.gray_r, interpolation='nearest')

Figure 2: The 1011th image in the Digits dataset

# The digits.target array contains labels of handwritten digits
digits.target
array([0, 1, 2, ..., 8, 9, 8])

# Shape of arrays
digits.images.shape
(1797, 8, 8)

digits.target.shape
(1797,)

# Flatten digits.images (digits.data already holds the same flattened array)
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
digits.data.shape
(1797, 64)

# Splitting the dataset into train and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size = 0.2)
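Because train_test_split shuffles the data randomly, the split above differs on every run, and so do the accuracy figures that follow. Here is a minimal sketch of a reproducible split. The seed value 42 and the use of stratify are assumptions added for illustration and are not part of the original code.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()

# Flatten each 8x8 image into a 64-element feature vector.
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# random_state=42 is an arbitrary seed chosen for illustration;
# stratify keeps the digit proportions similar in both subsets.
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, random_state=42, stratify=digits.target
)

print(x_train.shape, x_test.shape)  # (1437, 64) (360, 64)
```

With 1,797 samples and test_size = 0.2, 360 images go to the test set and 1,437 to the training set.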

In this article, we will be classifying the digits dataset using 3 algorithms.

KNeighbors Classifier
Support Vector Machine (SVM)
Logistic Regression

1. KNeighbors Classifier Implementation:

#Import the necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Splitting ratio: 0.2
x_train, x_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size = 0.2)

model = KNeighborsClassifier(n_neighbors = 5)
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
accuracy_score(y_test, y_pred)*100
99.62962962962963
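The value n_neighbors = 5 used above is also the scikit-learn default. As a rough sketch of how other values could be checked, the following compares a few candidates with 5-fold cross-validation. The candidate values and the number of folds are assumptions chosen for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# Mean cross-validated accuracy for each candidate k (candidates are illustrative).
scores_by_k = {}
for k in (1, 3, 5, 7, 9):
    model = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(model, digits.data, digits.target, cv=5)
    scores_by_k[k] = scores.mean()

for k, score in scores_by_k.items():
    print(f"k={k}: mean accuracy {score * 100:.2f}%")
```

Cross-validation averages over several splits, so it is less sensitive to one lucky or unlucky test set than a single train/test split.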

2. Support Vector Machine (SVM) Implementation:

# Import the necessary libraries
from sklearn import svm, datasets
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Splitting ratio: 0.2
x_train, x_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size = 0.2)

# Create the classifier (default hyperparameters assumed; the original settings are not shown)
svc = svm.SVC()
svc.fit(x_train, y_train)
y_pred = svc.predict(x_test)
accuracy_score(y_test, y_pred)*100
99.44444444444444
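A single accuracy number does not show which digits get confused with each other. One possible way to look closer is a confusion matrix, sketched below. The default SVC hyperparameters and the seed random_state=0 are assumptions for illustration.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Default hyperparameters, assumed here for illustration.
svc = svm.SVC()
svc.fit(x_train, y_train)
y_pred = svc.predict(x_test)

# Rows are the true digits, columns are the predicted digits.
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

Entries on the diagonal count correct predictions; any off-diagonal entry marks a pair of digits the model mixed up.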

3. Logistic Regression Implementation:

# Import the necessary libraries
from sklearn import datasets
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression

digits = datasets.load_digits()

# Splitting ratio: 0.2
x_train, x_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size = 0.2)

logisticRegr = LogisticRegression()
logisticRegr.fit(x_train, y_train)
y_pred = logisticRegr.predict(x_test)
accuracy_score(y_test, y_pred)*100
95.83333333333334
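With default settings, LogisticRegression may emit a ConvergenceWarning on this data because the solver reaches its iteration limit. A common remedy is to scale the features and raise max_iter. The sketch below shows one way to do that; the pipeline, max_iter=1000, and random_state=0 are assumptions and not part of the original code.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Standardize the pixel features, then fit logistic regression
# with a higher iteration limit than the default of 100.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(x_train, y_train)

accuracy = accuracy_score(y_test, pipeline.predict(x_test)) * 100
print(f"{accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training data only, so no information from the test set leaks into preprocessing.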

Comparison of the 3 Algorithms

Figure 3: Comparison of the three algorithms
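The three sections above each draw a fresh random split, so their accuracies are not measured on the same test images. A fairer comparison could look roughly like the sketch below, which trains all three models on one shared split. The seed random_state=0, the default SVC settings, and max_iter=5000 for logistic regression are assumptions for illustration.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Hyperparameters other than n_neighbors=5 are illustrative assumptions.
models = {
    "KNeighbors": KNeighborsClassifier(n_neighbors=5),
    "SVM": svm.SVC(),
    "Logistic Regression": LogisticRegression(max_iter=5000),
}

# Fit every model on the same training set and score it on the same test set.
results = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(x_test)) * 100

for name, accuracy in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {accuracy:.2f}%")
```

The results dictionary can be passed to a bar chart, for example with matplotlib's plt.bar, to produce a figure similar to the one above.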

Conclusion: In this article, we implemented and compared 3 algorithms on various train and test sets to recognize handwritten digits using the scikit-learn library.

I am thankful to mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a Coding Internship Experience. Thank you www.suvenconsultants.com

Written by Lumbini Inkar
