Handwritten Digit Recognition using Multilayer Neural Network

Sneha Reddy, Tanishka Vegunta

Department of Information Technology, Chaitanya Bharathi Institute of Technology, Hyderabad, India

Abstract

Recognizing handwriting is easy for humans but complex for a machine. Manual recognition is nevertheless unproductive: people who collect handwritten data should spend their time analyzing it rather than deciphering characters. Manual recognition is also inconsistent, since readings vary from person to person, and it costs considerable time and effort. Algorithms based on neural networks make this task easier and more accurate. In this paper, we discuss the recognition of handwritten digits taken from the MNIST data set and report the accuracy of our implementation, which trains a neural network using stochastic gradient descent and backpropagation.

Keywords

Digit recognition, Backpropagation, Mini batch Stochastic Gradient Descent

INTRODUCTION

Handwriting is a form of writing peculiar to a person, with variations in the size and shape of letters and in the spacing between them. There are different styles of handwriting, including cursive, block letters, calligraphy and signatures. This variety makes recognizing handwritten characters complex with traditional rule-based programming. The task becomes more natural when it is approached from a machine learning perspective using neural networks. According to Tom Mitchell, “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” [1]

A neural network consists of neurons, which are simple processing units, and directed, weighted connections between these neurons. For a neuron j, the propagation function receives the outputs of other neurons and, taking the connection weights into account, transforms them into the network input, which is then processed by the activation function [2].
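
As an illustration of the paragraph above, the propagation and activation steps can be written out under the assumption that the propagation function is the usual weighted sum. The symbols below are chosen for this sketch: I is the set of neurons connected to j, o_i is the output of neuron i, w_{i,j} is the weight of the connection from i to j, and f_act is the activation function.

```latex
\mathrm{net}_j = \sum_{i \in I} o_i \, w_{i,j},
\qquad
a_j = f_{\mathrm{act}}(\mathrm{net}_j)
```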

Mini batch gradient descent, used in this paper, combines batch gradient descent and stochastic gradient descent. It splits the data set into small batches and computes the model error, and the resulting parameter update, once per batch.
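
The batching step can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function name `make_mini_batches` and the toy data are assumptions made for the example.

```python
import random

def make_mini_batches(training_data, batch_size, rng=random):
    """Shuffle the examples in place and cut them into consecutive batches."""
    rng.shuffle(training_data)
    return [training_data[k:k + batch_size]
            for k in range(0, len(training_data), batch_size)]

# Toy stand-in for (image, label) pairs; values are for illustration only.
toy_data = [(f"image_{i}", i % 10) for i in range(25)]
batches = make_mini_batches(toy_data, batch_size=10)
batch_sizes = [len(b) for b in batches]  # [10, 10, 5]: the last batch holds the remainder
```

Reshuffling before each pass means every epoch sees different batches, which is what makes the gradient estimate stochastic.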

The backpropagation algorithm is used in this paper to adjust the weights of the neural network. For a given input, the algorithm compares the actual output with the desired output and calculates an error value. The weights are adjusted based on this error value. The error is first calculated at the output layer and then propagated backwards to the other layers.
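
A compact sketch of this procedure for a single training example is given below. It assumes sigmoid neurons and a quadratic cost, 0.5 * ||a - y||^2; the paper does not state which cost function was used, so that choice, the function names and the toy layer sizes are all assumptions of this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, weights, biases):
    """Gradients of the (assumed) quadratic cost for one example (x, y)."""
    # Forward pass: keep the weighted inputs and activations of every layer.
    activation, activations, zs = x, [x], []
    for w, b in zip(weights, biases):
        z = w @ activation + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # Error at the output layer: actual output minus desired output.
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    grad_w = [None] * len(weights)
    grad_b = [None] * len(biases)
    grad_w[-1] = delta @ activations[-2].T
    grad_b[-1] = delta
    # Propagate the error backwards through the earlier layers.
    for l in range(2, len(weights) + 1):
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        grad_w[-l] = delta @ activations[-l - 1].T
        grad_b[-l] = delta
    return grad_w, grad_b

# Toy network with layer sizes 4-3-2 (sizes chosen only for illustration).
rng = np.random.default_rng(0)
sizes = [4, 3, 2]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal((m, 1)) for m in sizes[1:]]
x = rng.random((4, 1))
y = np.array([[1.0], [0.0]])
grad_w, grad_b = backprop(x, y, weights, biases)
```

Each returned gradient has the same shape as the weight matrix or bias vector it belongs to, so it can be subtracted from that parameter directly during the update step.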

MATERIALS AND METHODS

Digit recognition is done by training a multi-layer feedforward neural network using mini batch stochastic gradient descent and the backpropagation algorithm.

The MNIST data set obtained from [3] contains a modified version of the original training set of 60,000 images. The original training set is split into a training set with 50,000 examples and a validation set with 10,000 examples. The training set is then used to train the neural network. Each image is represented as a 1-dimensional numpy array of 784 float values between 0 and 1. The labels are numbers between 0 and 9 indicating which digit the image represents [3].

Figure 1 MNIST data set example
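
To make the data layout concrete, the sketch below builds one synthetic image in the format described above (784 floats between 0 and 1) and shows two views of it. The random pixel values, the label and the helper `one_hot` are assumptions for illustration; encoding the label as a 10-element target vector is one common way to compare it with ten output neurons, not something the paper specifies.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random(784)   # synthetic stand-in for one flattened MNIST image
label = 7                 # synthetic label in the range 0..9

pixels = image.reshape(28, 28)   # 784 values form a 28 x 28 pixel grid
x = image.reshape(784, 1)        # column vector for the input layer

def one_hot(digit, num_classes=10):
    """Target vector with 1.0 at the index of the digit and 0.0 elsewhere."""
    target = np.zeros((num_classes, 1))
    target[digit] = 1.0
    return target

y = one_hot(label)
```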

An artificial neural network with sigmoid neurons is implemented, so the output of each neuron is calculated using the sigmoid function. The output of each neuron is given as

$$\sigma(w \cdot x + b) = \frac{1}{1 + e^{-(w \cdot x + b)}}$$

where w is the weight, b is the bias and x is the input.
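
A small numeric sketch of a single sigmoid neuron follows. The function names and the example weights, inputs and biases are chosen for illustration only.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs, bias):
    """Output of one sigmoid neuron: sigmoid of the weighted input plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

w = [0.5, -0.5]
x = [1.0, 1.0]
out_zero = neuron_output(w, x, 0.0)    # weighted input is 0, so the output is 0.5
out_large = neuron_output(w, x, 10.0)  # large positive input, output close to 1
```

The output always lies strictly between 0 and 1, and it changes smoothly with the weights and bias, which is what allows gradient-based training.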

Initially, the weights and biases of the neural network are initialized randomly using a Gaussian distribution. They are later adjusted by applying mini batch stochastic gradient descent and backpropagation.
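
The initialization might look like the following sketch. The paper states only that a Gaussian distribution is used, so the mean of 0 and standard deviation of 1, the function name `init_network` and the hidden layer size of 30 are assumptions; the 784 inputs and 10 outputs follow from the image size and the number of digit classes.

```python
import numpy as np

def init_network(sizes, seed=None):
    """Draw every weight and bias from a standard normal distribution."""
    rng = np.random.default_rng(seed)
    # No biases for the input layer: it only passes the pixel values on.
    biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
    # One weight matrix per pair of adjacent layers, shaped (outputs, inputs).
    weights = [rng.standard_normal((n_out, n_in))
               for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    return weights, biases

# 784 input pixels, a hypothetical hidden layer of 30 neurons, 10 output digits.
weights, biases = init_network([784, 30, 10], seed=0)
```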

In each epoch, the training data is shuffled and split into mini batches of a fixed size, and gradient descent is applied to each mini batch. The neural network is trained for a number of epochs. The labels generated for the training data are compared to the actual labels and the cost function is calculated. The gradient of the cost function is calculated using the backpropagation algorithm. This gradient is then used to update the weights and biases of the neural network: starting from the output layer and moving backwards, the biases and the connection weights are adjusted. A digit is labelled according to which output layer neuron has the highest activation.
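
The update and labelling steps can be sketched as below. This assumes the standard mini batch rule, in which the gradients summed over a batch are scaled by the learning rate divided by the batch size; the function names and the toy numbers are illustrative, not taken from the paper's code.

```python
import numpy as np

def update_parameters(params, summed_grads, learning_rate, batch_size):
    """One gradient descent step using gradients summed over a mini batch."""
    step = learning_rate / batch_size
    return [p - step * g for p, g in zip(params, summed_grads)]

def predicted_digit(output_activations):
    """Index of the output neuron with the highest activation."""
    return int(np.argmax(output_activations))

# Toy parameters and gradients for a single 2 x 3 weight matrix.
weights = [np.ones((2, 3))]
summed_grads = [np.full((2, 3), 5.0)]
new_weights = update_parameters(weights, summed_grads,
                                learning_rate=3.0, batch_size=10)
# Every entry becomes 1 - (3.0 / 10) * 5 = -0.5.

outputs = np.array([0.01, 0.02, 0.9, 0.05, 0.0, 0.1, 0.0, 0.03, 0.2, 0.0])
digit = predicted_digit(outputs)  # 2, the position of the largest activation
```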

After each epoch of training, the network is tested using the 10,000 MNIST test images. The labels generated by the neural network are compared to the class labels given in the MNIST test data, and the number of correctly generated labels is counted.
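
Counting the correct labels amounts to a pairwise comparison, sketched here on ten made-up predictions. The function names and the sample labels are assumptions for illustration.

```python
def count_correct(predicted, actual):
    """Number of positions where the generated label matches the class label."""
    return sum(int(p == a) for p, a in zip(predicted, actual))

def accuracy_percent(predicted, actual):
    """Share of correctly labelled examples, as a percentage."""
    return 100.0 * count_correct(predicted, actual) / len(actual)

predicted = [7, 2, 1, 0, 4, 1, 4, 9, 6, 9]
actual = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
correct = count_correct(predicted, actual)      # 9
accuracy = accuracy_percent(predicted, actual)  # 90.0
```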

RESULTS AND DISCUSSION

Figure 2 Results

The above results were obtained with the number of epochs set to 30, a mini batch size of 10 and a learning rate of 3.0. The accuracy is the number of correctly identified images out of the 10,000 test images in the MNIST data set. The reported results are the best of five trials.

The accuracy peaks at 95.00% at the 28th epoch. It increases rapidly over the first epochs, then levels off and stays at approximately the same value for the remaining epochs.

CONCLUSION

Neural networks are an effective technique for identifying handwritten digits. Their accuracy in handwriting recognition is quite high, and it can be raised further by optimizing certain parameters. In the current implementation using mini batch stochastic gradient descent and backpropagation, an accuracy of 95% was obtained in one of the trial runs.

ACKNOWLEDGEMENT

Thanks to our project guide Ms K. Sugamya, CBIT.

REFERENCES

[1] Machine Learning: Hands-On for Developers and Technical Professionals

[2] David Kriesel, A Brief Introduction to Neural Networks

[3] http://www.deeplearning.net/tutorial/gettingstarted.html