Convolutional neural networks (CNNs, or ConvNets) are one of the hot topics in computer science research, and they are also on the minds of business and IT leaders because of their use in both business and scientific applications. A CNN can be seen as an advanced machine learning technique or, more precisely, a deep learning technique. If you are not familiar with machine learning (ML) and deep learning (DL), don't worry: the following sections give a glimpse of both, along with CNNs and their implementation in Python. CNNs are most widely used in applications related to natural language processing and to image and video recognition. In this article, we will explore CNNs, the architecture behind them, and an example image classifier implemented in Python, assuming you have a basic understanding of how neural networks work. We will walk through this journey together so that you can understand the working of CNNs in depth. Let's start our journey:
Evolution of Machine Learning (ML) & Deep Learning (DL)
Before digging into the concept of CNNs, we need to take a quick look at the evolution of ML and DL. Machine learning (ML), an application of artificial intelligence, is no longer an unfamiliar topic, and it has evolved tremendously over the last decade. Much of the credit goes to advances in hardware as well as algorithms. ML evolved from pattern recognition, and it gives computer systems the ability to learn from data and improve with experience without being explicitly programmed to perform specific tasks.
Deep learning, on the other hand, evolved from advances in neural networks. Neural networks are a type of machine learning model, and in the mid-1980s and early 1990s they went through rapid computational growth as well as architectural advances. Deep learning emerged from that growth; it is a particular kind of machine learning whose algorithms are inspired by the functioning of the human brain.
Elucidation of Convolutional Neural Networks (CNNs)
Convolutional neural networks (CNNs) are similar to ordinary neural networks (NNs) in that they are also made up of neurons with learnable weights and biases. Recall how an ordinary neural network works: each neuron receives one or more inputs and takes a weighted sum, which is then passed through an activation function to produce the output. If CNNs and ordinary NNs have so much in common, what really makes them different from each other?
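The neuron described above can be sketched in a few lines of plain Python. This is only an illustration: the input values, weights and bias are made-up numbers, and the sigmoid is just one possible choice of activation function.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values chosen only for illustration
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

output = neuron(inputs, weights, bias)
print(output)  # a value between 0 and 1
```

During training, the weights and the bias are the quantities the network adjusts; the structure of the computation stays the same.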
The treatment of the input data and the types of layers are what set them apart. Ordinary neural networks ignore the structure of the input data: all of the data is converted into a 1-D array before it is fed into the network. A CNN's architecture, on the other hand, is designed to take into account the 2D structure of images (or of any other 2D input, such as speech signals) while processing them, which allows it to extract properties specific to images. As for the layers, CNNs have one or more convolutional layers and pooling layers (the main building blocks of CNNs), followed by one or more fully connected layers as in standard multilayer neural networks. In other words, we can think of a CNN as a special case of a fully connected network. Interesting, isn't it?
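To make the difference in input handling concrete, here is a small sketch in plain Python. The 4×4 "image" and the helper name patch are hypothetical and exist only for this illustration.

```python
# A hypothetical 4x4 grayscale "image"; each value is a pixel intensity
image = [
    [ 0,  1,  2,  3],
    [ 4,  5,  6,  7],
    [ 8,  9, 10, 11],
    [12, 13, 14, 15],
]

# Ordinary NN: flatten everything into a 1-D list before feeding the network
flat = [pixel for row in image for pixel in row]

# CNN: look at a small 2-D patch so that neighbouring pixels stay together
def patch(img, top, left, size):
    return [row[left:left + size] for row in img[top:top + size]]

top_left_patch = patch(image, 0, 0, 2)

print(flat)            # 16 values; pixels 1 and 5 are vertical neighbours but end up 4 positions apart
print(top_left_patch)  # [[0, 1], [4, 5]]
```

In the flattened list the network has no built-in notion that pixels 1 and 5 were adjacent, whereas the 2-D patch keeps that neighbourhood intact.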
Architecture of Convolutional Neural Networks (CNNs)
A CNN architecture is simply a list of layers that transforms the 3D image volume (width, height and depth) into a 3D output volume. Each neuron in the current layer is connected to a small patch of the output of the previous layer, which is similar to overlaying a filter on the input image. A layer uses M filters so that it captures different details; these M filters are basically feature extractors that pick up features such as edges, corners and so on. Before going deeper into the architecture, I would like to explain the layers [INPUT-CONV-RELU-POOL-FC] that are used to construct CNNs:
- INPUT: This layer, as the name implies, holds the raw pixel values, i.e. the image data as it is. For example, INPUT [64×64×3] means an RGB image of width 64, height 64 and depth 3, i.e. a 3-channel image.
- CONV: Most of the computation, namely the convolutions between the filters and the various patches of the input, is done in the convolutional layer, which makes it one of the building blocks of CNNs. For example, if we decide to use 6 filters on the above-mentioned input, this may result in the volume [64×64×6].
- RELU: The rectified linear unit layer applies an activation function to the output of the previous layer, which adds non-linearity to the network.
- POOL: The pooling layer is another building block of CNNs. Its main task is down-sampling: it operates independently on every slice of the input and resizes it spatially.
- FC: The fully connected layer, here acting as the output layer, is used to compute the output class scores. The resulting output is a volume of size [1×1×L], where L is the number of class scores.
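The layers listed above can be sketched end to end in plain Python on a tiny single-channel image. This is a simplified illustration, not how a deep learning library implements them: the 6×6 image, the hand-written edge filter and the fully connected weights are all made up, only one filter is used, and the convolution applies no padding, so the feature map shrinks from 6×6 to 4×4.

```python
def conv2d(img, kernel):
    """CONV: slide the kernel over the image (no padding, stride 1)."""
    k = len(kernel)
    out_size = len(img) - k + 1
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_size)]
            for i in range(out_size)]

def relu(fmap):
    """RELU: replace negative values with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """POOL: keep the largest value in each non-overlapping size x size window."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def fully_connected(fmap, weights, biases):
    """FC: flatten the feature map and compute one score per class."""
    flat = [v for row in fmap for v in row]
    return [sum(x * w for x, w in zip(flat, ws)) + b
            for ws, b in zip(weights, biases)]

# INPUT: 6x6 image, dark on the left half and bright on the right half
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# One hand-written filter that responds to dark-to-bright vertical edges
edge_filter = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

conv_out = conv2d(image, edge_filter)   # 4x4 feature map
relu_out = relu(conv_out)               # 4x4, negatives removed
pool_out = max_pool(relu_out)           # 2x2 after down-sampling

# Two classes (L = 2) with made-up weights and biases
fc_weights = [[0.1, 0.1, 0.1, 0.1], [-0.1, -0.1, -0.1, -0.1]]
fc_biases = [0.0, 1.0]
scores = fully_connected(pool_out, fc_weights, fc_biases)

print(conv_out[0])  # [0, 27, 27, 0]: the filter fires where the edge is
print(pool_out)     # [[27, 27], [27, 27]]
print(scores)       # one score per class
```

In a real CNN the filter values and the fully connected weights are learned during training rather than written by hand, and each convolutional layer applies many filters in parallel.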
The following diagram represents the typical architecture of CNNs:
Example: An Image classifier implemented in Python
Here, we will implement an image classifier using a CNN in Python. The basic concept stays the same, so it can also be applied to natural language processing (NLP), video recognition and other use cases. For this implementation, we have the following prerequisites:
- Keras, a deep learning library: It is a high-level neural network API written in Python and capable of running on top of TensorFlow, CNTK or Theano. If you want to learn about it in detail, go to https://keras.io/. Keras must be installed before you can use it. The following command can be used to install it:
pip install Keras
In a conda environment, the command is as follows:
conda install -c conda-forge keras
- Training and testing data sets of images: We also need training and testing sets of images. I am using the images of cats and dogs from https://www.kaggle.com/c/dogs-vs-cats/data.
The following Python script implements the image classifier:
Listing 1: Python script to implement image classifier
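from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
import numpy as np

# Build the CNN: one convolution + pooling stage, then a small dense classifier
Img_classifier = Sequential()
# 32 filters of size 3x3 applied to 64x64 RGB input images
Img_classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
Img_classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Flatten the pooled feature maps into a 1-D vector for the dense layers
Img_classifier.add(Flatten())
Img_classifier.add(Dense(units = 128, activation = 'relu'))
# A single sigmoid unit gives the probability of the class with index 1
Img_classifier.add(Dense(units = 1, activation = 'sigmoid'))
Img_classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Augment the training images; only rescale the test images
training_datagen = ImageDataGenerator(rescale = 1./255, shear_range = 0.2, zoom_range = 0.2, horizontal_flip = True)
testing_datagen = ImageDataGenerator(rescale = 1./255)
training_set = training_datagen.flow_from_directory('/Users/admin/training_set', target_size = (64, 64), batch_size = 32, class_mode = 'binary')
test_set = testing_datagen.flow_from_directory('test_set', target_size = (64, 64), batch_size = 32, class_mode = 'binary')

# Train the network. fit_generator is deprecated in recent Keras/TensorFlow
# releases, where fit() accepts generators directly.
Img_classifier.fit_generator(training_set, steps_per_epoch = 8000, epochs = 25, validation_data = test_set, validation_steps = 2000)

# Classify a single image
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image) / 255.0   # same rescaling as during training
test_image = np.expand_dims(test_image, axis = 0)     # add the batch dimension
result = Img_classifier.predict(test_image)
# Assumes the dog folder maps to class 1; check training_set.class_indices
if result[0][0] >= 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'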
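To see how the layers of Listing 1 transform the input, the following sketch traces the output sizes and parameter counts by hand in plain Python. It assumes a Flatten layer sits between the pooling and dense layers (the listing imports Flatten for this purpose) and relies on the Keras defaults of no padding for Conv2D and a pooling stride equal to the pool size. The helper name conv_output is hypothetical.

```python
def conv_output(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

# Conv2D(32, (3, 3)) on a 64x64x3 input, no padding (the Keras default, 'valid')
conv_side = conv_output(64, kernel=3)
conv_params = (3 * 3 * 3 + 1) * 32          # weights + one bias per filter

# MaxPooling2D(pool_size=(2, 2)): the stride defaults to the pool size
pool_side = conv_output(conv_side, kernel=2, stride=2)

# Flatten, then Dense(128) and Dense(1)
flat_units = pool_side * pool_side * 32
dense1_params = (flat_units + 1) * 128
dense2_params = (128 + 1) * 1

total = conv_params + dense1_params + dense2_params
print(conv_side, pool_side, flat_units, total)  # 62 31 30752 3937409
```

Almost all of the parameters sit in the first dense layer, not in the convolutional layer. If Keras is installed, calling summary() on the model should report the same shapes and counts.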
I hope you have enjoyed this short journey through the working of CNNs with me. Try to extend it by building your own CNN and applying it to different applications. If you come across any difficulty while implementing CNNs in Python, or if you have any suggestions or feedback, please feel free to post them in the comment section below. I would love to hear from you.
Author Bio: Gaurav Leekha is a passionate technical content writer, primarily in the fields of Artificial Intelligence (AI), Machine Learning (ML), Deep Learning, Artificial Neural Networks (ANN), Speech Processing, and Python. He has 7+ years of experience teaching computer science. He is also pursuing a Ph.D. degree in the field of machine learning and serves as a reviewer for various national and international journals, including the International Journal of Speech Technology (IJST), Springer.