If you followed along in Part 1, you saw how we used Keras to create our model for tackling the German Traffic Sign Recognition Benchmark (GTSRB). In Part 2 you will see how we achieve performance close to human level. You will also see how to improve the accuracy of the model by augmenting the training data.
Now, our model is ready to train. During the training, our model will iterate over batches of the training set, each of size batch_size. For each batch, gradients will be computed and updates will be made to the weights of the network automatically. One iteration over all of the training set is referred to as an epoch. Training is usually run until the loss converges to a constant.
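To make the batch and epoch terminology concrete, here is a minimal sketch of how one epoch is cut into batches. It is not Keras's actual implementation (Keras also shuffles the samples, for instance), and `iterate_minibatches` is a hypothetical helper. The sample count 31367 is the number of training samples reported in the training log later in this post, and 32 is the batch size used here.

```python
def iterate_minibatches(num_samples, batch_size):
    """Yield (start, stop) index pairs that cover the dataset exactly once (one epoch)."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)


# One epoch over 31367 samples with batches of 32
batches = list(iterate_minibatches(31367, 32))

print(len(batches))   # number of weight updates per epoch
print(batches[0])     # index range of the first batch
print(batches[-1])    # the last batch is smaller than the rest
```

Each `(start, stop)` pair stands for one gradient computation and one weight update, so the number of pairs is the number of updates per epoch.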
We will add a couple of features to our training: learning rate scheduling and model checkpointing.
These are not necessary, but they improve the model's accuracy. Both are implemented via the callback feature of Keras. Callbacks are a set of functions that are applied at given stages of the training procedure, such as the end of an epoch. Keras provides built-in callbacks for both.
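from keras.callbacks import LearningRateScheduler, ModelCheckpoint
def lr_schedule(epoch):
    # Divide the base learning rate `lr` (defined earlier, in Part 1) by 10 every 10 epochs
    return lr * (0.1 ** int(epoch / 10))
batch_size = 32
nb_epoch = 30
# Hold out 20% of the data for validation; keep only the best checkpoint in model.h5
model.fit(X, Y,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          validation_split=0.2,
          callbacks=[LearningRateScheduler(lr_schedule),
                     ModelCheckpoint('model.h5', save_best_only=True)]
          )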
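To see what this schedule does over the 30 epochs, the following sketch evaluates the same step-decay rule outside of Keras. The starting value `base_lr = 0.01` is an assumption for illustration only, and `step_decay` is a hypothetical name for the same formula as `lr_schedule`. Keras passes the epoch index starting from 0.

```python
base_lr = 0.01  # assumed starting learning rate, for illustration only


def step_decay(epoch, base_lr=base_lr, drop=0.1, epochs_per_drop=10):
    """Multiply the base learning rate by `drop` once every `epochs_per_drop` epochs."""
    return base_lr * (drop ** (epoch // epochs_per_drop))


# Print the learning rate at the first and last epoch of each 10-epoch stage
for epoch in (0, 9, 10, 19, 20, 29):
    print("epoch {:2d}: lr = {:.0e}".format(epoch, step_decay(epoch)))
```

Under that assumption, the learning rate stays constant within each block of 10 epochs and shrinks by a factor of 10 between blocks, so a 30-epoch run uses three distinct values.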
You’ll see that the model starts training and logs the losses and accuracies:
Train on 31367 samples, validate on 7842 samples
Epoch 1/30
31367/31367 [==============================] - 30s - loss: 1.1502 - acc: 0.6723 - val_loss: 0.1262 - val_acc: 0.9616
Epoch 2/30
31367/31367 [==============================] - 32s - loss: 0.2143 - acc: 0.9359 - val_loss: 0.0653 - val_acc: 0.9809
Epoch 3/30
31367/31367 [==============================] - 31s - loss: 0.1342 - acc: 0.9604 - val_loss: 0.0590 - val_acc: 0.9825
...
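The log also shows what the checkpoint callback has to work with. With `save_best_only=True`, `ModelCheckpoint` overwrites the saved file only when the monitored quantity improves (by default it monitors `val_loss`). The toy class below sketches that logic together with the general callback pattern; `BestOnlyCheckpoint` is a hypothetical stand-in, not the Keras class, and the fourth loss value is made up to show a skipped save.

```python
class BestOnlyCheckpoint:
    """Toy stand-in for a 'save best only' checkpoint callback (hypothetical class)."""

    def __init__(self):
        self.best = float("inf")
        self.saved_epochs = []

    def on_epoch_end(self, epoch, logs):
        # "Save" only when the validation loss beats the best value seen so far
        if logs["val_loss"] < self.best:
            self.best = logs["val_loss"]
            self.saved_epochs.append(epoch)


# First three values come from the log above; the last one is made up
val_losses = [0.1262, 0.0653, 0.0590, 0.0700]

checkpoint = BestOnlyCheckpoint()
callbacks = [checkpoint]
for epoch, val_loss in enumerate(val_losses):
    # ... one epoch of training would run here ...
    for callback in callbacks:
        callback.on_epoch_end(epoch, {"val_loss": val_loss})

print(checkpoint.saved_epochs)  # epochs at which a checkpoint would be written
print(checkpoint.best)          # best validation loss seen
```

The training loop knows nothing about checkpointing; it only calls every registered callback at the end of each epoch, which is why features like these can be added without touching the model code.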
Now this might take a bit of time, especially if you are running on a CPU. If you have an Nvidia GPU, you should install CUDA; it speeds up training dramatically. For example, on my MacBook Air it takes 10 minutes per epoch, while on a machine with an Nvidia Titan X GPU it takes 30 seconds, roughly a 20x speedup. Even modest GPUs offer impressive speedups because of the inherent parallelizability of neural networks. This makes GPUs necessary for deep learning if anything big has to be done. Grab a coffee while you wait for training to complete ;).
Congratulations! You have just trained your first deep learning model.
Let’s quickly load test data and evaluate our model on it:
import os
import numpy as np
import pandas as pd
from skimage import io  # image reader, as used in Part 1
# preprocess_img and model are defined in Part 1
test = pd.read_csv('GT-final_test.csv', sep=';')
# Load test dataset
X_test = []
y_test = []
for file_name, class_id in zip(list(test['Filename']), list(test['ClassId'])):
    img_path = os.path.join('GTSRB/Final_Test/Images/', file_name)
    X_test.append(preprocess_img(io.imread(img_path)))
    y_test.append(class_id)
X_test = np.array(X_test)
y_test = np.array(y_test)
# Predict and evaluate
y_pred = model.predict_classes(X_test)
acc = np.mean(y_pred == y_test)  # fraction of correct predictions
print("Test accuracy = {}".format(acc))
This outputs the following on my system (results may vary a bit because the weights of the neural network are randomly initialized):
12630/12630 [==============================] - 2s
Test accuracy = 0.9792557403008709
97.92%! That’s great! It’s not far from average human performance (98.84%)[1].
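For readers who want to see what the accuracy figure means in terms of individual images, here is a small dependency-free sketch. The `accuracy` helper is hypothetical and its first example uses made-up labels. The second line relies only on the reported numbers: since accuracy on 12630 images must be a whole number of correct predictions divided by 12630, the printed value corresponds to 12368 correct predictions and 262 mistakes.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Tiny made-up example: 3 of 4 predictions match
print(accuracy([1, 2, 3, 4], [1, 2, 3, 0]))

# The reported test accuracy expressed as correct / total
print(12368 / 12630)
```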
A lot of things can be done to squeeze out extra performance from the neural net. I’ll implement one such improvement in the next section.
You might think 40000 images is a lot of images. Think about it again. Our model has 1358155 parameters (try model.count_params() or model.summary()). That’s roughly 35X the number of training images.
If we can generate new images for training from the existing images, that will be a great way to increase the size of the dataset. This can be done by slightly translating, rotating, shearing, and zooming the existing images.
Rather than generating and saving such images to hard disk, we will generate them on the fly during training. This can be done directly using built-in functionality of Keras.
from keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
# Hold out 20% of the training data for validation
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)
# Random shifts, zooms, shears and rotations, applied on the fly
datagen = ImageDataGenerator(featurewise_center=False,
                             featurewise_std_normalization=False,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.2,
                             shear_range=0.1,
                             rotation_range=10.)
datagen.fit(X_train)
# Reinitialize model and compile
model = cnn_model()
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
# Train again
nb_epoch = 30
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch,
                    validation_data=(X_val, Y_val),
                    callbacks=[LearningRateScheduler(lr_schedule),
                               ModelCheckpoint('model.h5', save_best_only=True)]
                    )
With this model, I get 98.29% accuracy on the test set.
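To illustrate what generating images on the fly means, here is a deliberately simplified sketch in plain Python. It is not how `ImageDataGenerator` is implemented: it handles only whole-pixel translations on nested lists, pads with a constant, and the names `shift_image` and `augmented_batches` are hypothetical. The point is only that each batch is produced when requested and nothing is written to disk.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate a 2D image (list of rows) by dx columns and dy rows, padding with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def augmented_batches(images, labels, batch_size, max_shift=1, seed=None):
    """Endlessly yield batches of randomly shifted copies of the input images."""
    rng = random.Random(seed)
    indices = list(range(len(images)))
    while True:
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            xs = [shift_image(images[i],
                              rng.randint(-max_shift, max_shift),
                              rng.randint(-max_shift, max_shift))
                  for i in batch]
            ys = [labels[i] for i in batch]
            yield xs, ys


# Two tiny 3x3 "images" with made-up pixel values and labels
images = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
          [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
labels = [0, 1]

gen = augmented_batches(images, labels, batch_size=2, seed=0)
xs, ys = next(gen)
print(len(xs), sorted(ys))
```

Because the generator never ends, the training loop has to be told how many samples make up one epoch, which is the role of `samples_per_epoch` in the call above.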
Frankly, I haven’t done much parameter tuning. I’ll make a small list of things which can be tried to improve the model:
This is but a model for beginners. For state-of-the-art solutions of the problem, you can have a look at this, where the authors achieve 99.61% accuracy with a specialized layer called Spatial Transformer layer.
In this two-part post, you have learned how to use convolutional networks to solve a computer vision problem. We used the Keras deep learning framework to implement CNNs in Python. We have achieved performance close to human-level performance. We also have seen a way to improve the accuracy of the model using augmentation of the training data.
References:
Sasank Chilamkurthy works at Qure.ai. His work involves deep learning on medical images obtained from radiology and pathology. He completed his undergraduate degree in Mumbai at the Indian Institute of Technology, Bombay. He can be found on GitHub.