
Convolutional Neural Networks (CNNs) have changed the way we classify images. They are used in almost all computer vision tasks. Since 2012, CNNs have dominated the ImageNet competition, lowering the classification error rate each year. MNIST is the most studied dataset (link).

The state-of-the-art result for the MNIST dataset is an accuracy of 99.79%. In this article, we will achieve an accuracy of 99.55%. The MNIST dataset contains images of handwritten digits: 60,000 grayscale images in the training set and 10,000 grayscale images in the test set. We will use the Keras library with the TensorFlow backend to classify the images.

What is a Convolutional Neural Network?

A convolution in a CNN is nothing but an element-wise multiplication followed by a sum, i.e. the dot product of the filter and the patch of the image matrix it currently covers. In the example, the image is a 5 x 5 matrix and the filter going over it is a 3 x 3 matrix. A convolution operation takes place between the image and the filter, and the convolved feature is generated. Each filter in a CNN learns a different characteristic of an image.

Keras is a high-level neural network API, written in Python, which runs on top of either TensorFlow or Theano. TensorFlow was developed by the Google Brain team. To learn more about it, visit their official website. Keras was written to simplify the construction of neural nets, as TensorFlow's API is very verbose. Keras makes everything very easy, and you will see it in action below. If you want to explore the TensorFlow implementation of the MNIST dataset, you can find it here.

Implementation

First, we import all the necessary libraries.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline

    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation, Flatten
    from keras.optimizers import Adam
    from keras.utils import np_utils
    from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D, GlobalAveragePooling2D
    # The module names of the next three imports are missing in the source.
    # keras.layers exports the first two classes; the ImageDataGenerator
    # path is the usual Keras 2 location and is an assumption.
    from keras.layers import BatchNormalization, LeakyReLU
    from keras.preprocessing.image import ImageDataGenerator

Next, we build the model. Only part of the listing is preserved here; the gaps are marked with "# ...".

    # ...
    # Pooling
    # Repeat Steps 1,2,3 for adding more hidden layers
    # 4. After that make a fully connected network
    # This fully connected network gives ability to the CNN
    # to classify the samples
    model = Sequential()
    # ...
    model.add(BatchNormalization(axis=-1))
    # ...
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # ...
    model.add(Flatten())
    # Fully connected layer
    # ...

Keras allows us to specify the number of filters we want and the size of the filters. So, in our first layer, 32 is the number of filters and (3, 3) is the size of each filter. We also need to specify the shape of the input, which is (28, 28, 1), but we have to specify it only once.

The second layer is the Activation layer. We have used ReLU (rectified linear unit) as our activation function. The ReLU function is f(x) = max(0, x), where x is the input. It sets all negative values in the matrix x to 0 and leaves all other values unchanged. It is the most widely used activation function, since it reduces training time and helps mitigate the problem of vanishing gradients.

The MaxPooling layer down-samples its input, which lets the model make assumptions about the features and so reduces overfitting. It also reduces the number of parameters the later layers have to learn, which reduces the training time.

It is good practice to use BatchNormalization. BatchNormalization normalizes the matrix after it has been through a convolution layer, so that the scale of each dimension remains the same. It reduces the training time significantly.

After creating all the convolutional layers, we need to flatten their output so that it can act as the input to the Dense layers.

Dense layers are Keras's name for fully connected layers. These layers give the model the ability to classify the features learned by the CNN.

Dropout is a method used to reduce overfitting. It forces the model to learn multiple independent representations of the same data by randomly disabling neurons during the learning phase. In our model, dropout will randomly disable 20% of the neurons.

The second-to-last layer is the Dense layer with 10 neurons. The number of neurons in this layer should be equal to the number of classes we want to predict, as this is the output layer.
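The convolution described under "What is a Convolutional Neural Network?" can be sketched in plain NumPy. This is only an illustration of the arithmetic, not how Keras implements it. The helper name `convolve2d_valid` and the image and filter values are made up for this example, and it assumes no padding and a stride of 1. Like CNN libraries, it slides the filter without flipping it.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    image_h, image_w = image.shape
    kernel_h, kernel_w = kernel.shape
    out_h = image_h - kernel_h + 1
    out_w = image_w - kernel_w + 1
    feature_map = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kernel_h, col:col + kernel_w]
            # Element-wise multiplication followed by a sum: the dot product
            # of the flattened patch and the flattened filter.
            feature_map[row, col] = np.sum(patch * kernel)
    return feature_map


# Hypothetical 5 x 5 image and 3 x 3 filter (values chosen for illustration).
image = np.array([
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
])
kernel = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
])

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)
```

A 5 x 5 image and a 3 x 3 filter give a 3 x 3 convolved feature, because the filter fits in only three positions along each axis.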

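To make the ReLU, MaxPooling and Dropout layers described in this article more concrete, here is a minimal NumPy sketch of what each one does to a small array. The function names, the 4 x 4 input and the random seed are assumptions for illustration. The dropout shown is the "inverted" variant, which rescales the surviving values during training, and the pooling helper assumes even height and width. Keras performs these steps internally through its `Activation`, `MaxPooling2D` and `Dropout` layers.

```python
import numpy as np


def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)


def max_pool_2x2(x):
    """Down-sample a 2-D array by taking the maximum of each 2 x 2 block."""
    height, width = x.shape
    # Assumes even height and width, e.g. a 4 x 4 feature map.
    blocks = x.reshape(height // 2, 2, width // 2, 2)
    return blocks.max(axis=(1, 3))


def dropout(x, rate, rng):
    """Inverted dropout for the training phase: zero a fraction `rate` of the
    entries at random and rescale the survivors by 1 / (1 - rate)."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)


# Hypothetical 4 x 4 feature map with some negative values.
feature_map = np.array([
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.0, 0.0],
    [-0.5, -1.5, 3.0, 2.0],
    [1.0, 0.0, -4.0, 6.0],
])

activated = relu(feature_map)     # negatives become 0
pooled = max_pool_2x2(activated)  # 4 x 4 -> 2 x 2

rng = np.random.default_rng(0)
dropped = dropout(np.ones(10000), rate=0.2, rng=rng)

print(activated)
print(pooled)
print((dropped == 0).mean())  # fraction of disabled entries, close to 0.2
```

With a rate of 0.2, roughly one entry in five is zeroed on each training pass, which is what "randomly disable 20% of the neurons" means. At prediction time dropout is switched off and all neurons are used.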