Tuesday, September 25, 2018

Classification --- Deep Learning

We can try a deep learning algorithm by using Keras on TensorFlow or CNTK. However, understanding what goes on inside a neural network is not an easy task. Visualizing how the input data is converted into a simple array that describes the classification category helps you figure out the network's structure.


Assume you want to classify a vehicle image into one of 12 categories: Accord, Altima, Camry, Corolla, Elantra, Freed, HR-V, Insight, Jazz, Prius, Roadster, S2000. When you input an RGB image of 128 * 128 pixels, the predicted outcome is a one-dimensional array of 12 elements, each between 0 and 1. The neural network is trained so that the output array approaches the target values below. This is an example of the target values when the input image is 'Jazz'.
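As a minimal sketch of such a target array, the snippet below builds a one-hot vector for 'Jazz'. The class order (alphabetical, as listed above) and the helper name `one_hot` are assumptions made for illustration.

```python
# Class order is an assumption: the alphabetical order used in the text.
CLASS_NAMES = ["Accord", "Altima", "Camry", "Corolla", "Elantra", "Freed",
               "HR-V", "Insight", "Jazz", "Prius", "Roadster", "S2000"]


def one_hot(label, class_names=CLASS_NAMES):
    """Return a list with 1.0 at the label's position and 0.0 elsewhere."""
    index = class_names.index(label)
    return [1.0 if i == index else 0.0 for i in range(len(class_names))]


# Target values for an image labeled 'Jazz'.
target = one_hot("Jazz")
print(target)
```

Only the element at the position of 'Jazz' is 1; the other 11 elements are 0.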

Let's input a 'Jazz' image below to a trained deep learning model.


First of all, prepare the images for training. The directory structure is like this. All images in the subfolders are RGB images of 128 * 128 pixels.
The code used to train the model with Keras on TensorFlow is this.
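A training script along these lines could look like the sketch below. It is not the original code: the layer sizes, the helper names `build_model` and `train`, the `data_dir` argument, and the assumption that each class has its own subfolder are all illustrative.

```python
import tensorflow as tf
from tensorflow import keras

NUM_CLASSES = 12
IMAGE_SIZE = (128, 128)


def build_model():
    """Small CNN; the layer sizes are illustrative, not the original ones."""
    return keras.Sequential([
        keras.Input(shape=IMAGE_SIZE + (3,)),
        keras.layers.Rescaling(1.0 / 255),  # scale pixel values to 0..1
        keras.layers.Conv2D(16, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
        # Softmax yields 12 values between 0 and 1 that sum to 1.
        keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def train(data_dir, epochs=10):
    """Train on images stored as data_dir/<class name>/<image files> (assumed layout)."""
    dataset = tf.keras.utils.image_dataset_from_directory(
        data_dir,
        label_mode="categorical",  # one-hot labels, one per subfolder
        image_size=IMAGE_SIZE,
        batch_size=32,
    )
    model = build_model()
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(dataset, epochs=epochs)
    return model
```

Calling `train` with the path of the image folder would return a fitted model whose output is an array of 12 values per image.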

The model that I trained can be drawn as shown below by using this code. It shows how each layer transforms the input array.




The details of the model can be shown by using model.summary().
The output looks like this.

Input 'Jazz.jpg' and predict with the model trained above.
This is the output of the prediction.
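Keras models predict on batches, so a single image needs an extra leading dimension first. The sketch below shows that step with NumPy; the helper name `to_batch` and the all-zero placeholder image are assumptions, and any pixel scaling must match whatever was used during training.

```python
import numpy as np


def to_batch(image):
    """Convert one (128, 128, 3) image into a (1, 128, 128, 3) float32 batch."""
    image = np.asarray(image, dtype="float32")
    if image.shape != (128, 128, 3):
        raise ValueError(f"expected shape (128, 128, 3), got {image.shape}")
    return np.expand_dims(image, axis=0)


# Placeholder pixels standing in for the decoded 'Jazz.jpg'.
pixels = np.zeros((128, 128, 3), dtype="uint8")
batch = to_batch(pixels)
print(batch.shape)
```

The resulting batch can then be passed to the `predict` method of the trained model.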
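To turn the 12-element output into a category name, pick the position of the largest value. The sketch below uses made-up scores, not the real output of the model, and assumes the same alphabetical class order as in the text.

```python
# Class order is an assumption: the alphabetical order used in the text.
CLASS_NAMES = ["Accord", "Altima", "Camry", "Corolla", "Elantra", "Freed",
               "HR-V", "Insight", "Jazz", "Prius", "Roadster", "S2000"]


def top_class(probabilities, class_names=CLASS_NAMES):
    """Return the name and score of the highest-scoring category."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


# Made-up scores for illustration only.
example = [0.01, 0.01, 0.02, 0.01, 0.01, 0.05,
           0.03, 0.02, 0.80, 0.02, 0.01, 0.01]
label, score = top_class(example)
print(label, score)
```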

When you want to inspect an intermediate output, the code is like this.
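One common way to get intermediate outputs in Keras is to build a second model that shares the trained layers and returns the output of every layer. The sketch below uses a small stand-in model and an all-zero placeholder image, both of which are assumptions; the original architecture is not reproduced here.

```python
import numpy as np
from tensorflow import keras

# Stand-in model for illustration; replace it with the trained model.
model = keras.Sequential([
    keras.Input(shape=(128, 128, 3)),
    keras.layers.Conv2D(8, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(12, activation="softmax"),
])

# A second model that reuses the same layers and returns every layer's output.
activation_model = keras.Model(
    inputs=model.inputs,
    outputs=[layer.output for layer in model.layers],
)

# Placeholder input: a batch holding one 128 * 128 RGB image.
batch = np.zeros((1, 128, 128, 3), dtype="float32")
activations = activation_model.predict(batch, verbose=0)

for layer, activation in zip(model.layers, activations):
    print(layer.name, activation.shape)
```

Each entry of `activations` is the array produced by one layer, so the list can be plotted layer by layer.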

The figure below shows what the outputs after each layer look like.