Recently, Packt signed up to offer print and ebook bundling through BitLit so that our readers can easily access their books in any format. BitLit is an innovative app that allows readers to bundle their books retroactively. Instead of relying on receipts, BitLit uses computer vision to identify print books by their covers and a reader by their signature. All you need to bundle a book with BitLit is a pen, your smartphone, and the book.

Packt is really excited to have partnered with BitLit to offer bundling to our readers. We’ve asked BitLit’s Head of R&D, Sancho McCann, to give our readers a deeper dive into how BitLit uses pre-existing research on deep neural networks.

Deep neural networks: Bridging theory and practice

What do Netflix recommendations, Google’s cat video detector, and Stanford’s image-to-text system all have in common? A lot of training data, and deep neural networks.

This won’t be a tutorial on how deep neural networks work. There are already excellent resources for that (this one by Andrej Karpathy, for example). But even with a full understanding of how deep neural nets work, and even if you can implement one yourself, bridging the gap between a prototype implementation and a production-ready system may seem daunting. The code needs to be robust, flexible, and optimized for the latest GPUs. Fortunately, this work has already been done for you. This post describes how to take advantage of that pre-existing work.

Software

There is a plethora of deep neural network libraries available: Caffe, CUDA-Convnet, Theano, and others. At BitLit, we have selected Caffe. Its codebase is actively developed and maintained, it has an active community of developers and users, and it offers a large library of layer types that makes it easy to customize your network’s architecture. It has also already been adapted to take advantage of NVIDIA’s cuDNN, if you happen to have it installed.
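
For example, here is a minimal sketch of a small network defined through pycaffe’s NetSpec interface. The data source (‘train_lmdb’), the layer sizes, and the file names are hypothetical placeholders, but the pattern is the same for real architectures:

    # A minimal sketch of a small convolutional network written with
    # pycaffe's NetSpec interface. The LMDB path and layer sizes are
    # hypothetical placeholders.
    import caffe
    from caffe import layers as L, params as P

    def small_convnet(lmdb_path, batch_size):
        n = caffe.NetSpec()
        # ntop=2: the Data layer produces two blobs, images and labels.
        n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB,
                                 source=lmdb_path, ntop=2)
        n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20,
                                weight_filler=dict(type='xavier'))
        n.relu1 = L.ReLU(n.conv1, in_place=True)
        n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2,
                            pool=P.Pooling.MAX)
        n.fc1 = L.InnerProduct(n.pool1, num_output=500,
                               weight_filler=dict(type='xavier'))
        n.loss = L.SoftmaxWithLoss(n.fc1, n.label)
        return n.to_proto()

    with open('train.prototxt', 'w') as f:
        f.write(str(small_convnet('train_lmdb', batch_size=64)))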

cuDNN is “a GPU-accelerated library of primitives for deep neural networks”. This library provides optimized versions of core neural network operations (convolution, rectified linear units, pooling), tuned to the latest NVIDIA architectures. NVIDIA’s benchmarking shows that Caffe accelerated by cuDNN is 1.2-1.3x faster than the baseline version of Caffe.
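
cuDNN support in Caffe is a compile-time option (uncomment USE_CUDNN := 1 in Makefile.config before building). To measure what it buys you on your own network, Caffe ships a benchmarking tool (caffe time), or you can time passes directly from Python. Below is a minimal sketch of the latter; it assumes the hypothetical train.prototxt generated above, its LMDB data source, and an available GPU:

    # Time forward/backward passes through a network to compare builds
    # (e.g., with and without cuDNN). Assumes 'train.prototxt' from the
    # sketch above and that its LMDB data source exists.
    import time
    import caffe

    caffe.set_mode_gpu()
    caffe.set_device(0)  # first GPU

    net = caffe.Net('train.prototxt', caffe.TRAIN)

    net.forward()   # warm-up: GPU initialization shouldn't be counted
    net.backward()

    start = time.time()
    for _ in range(50):
        net.forward()
        net.backward()
    elapsed_ms = (time.time() - start) / 50 * 1000
    print('mean forward+backward: %.1f ms' % elapsed_ms)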

In summary, the tight integration of NVIDIA GPUs, CUDA, cuDNN, and Caffe, combined with the active community of Caffe users and developers, is why we have selected this stack for our deep neural network systems.

Hardware

As noted by Krizhevsky et al. in 2012, “All of our experiments suggest that our results can be improved simply by waiting for faster GPUs… ” This is still true today. We use both Amazon’s GPU instances and our own local GPU server.

When we need to run many experiments in parallel, we turn to Amazon. This need arises when performing model selection: to decide how many layers to use, how wide each layer should be, and so on, we run many experiments in parallel and keep the architecture that produces the best results. Then, to fully train (or later, retrain) the selected model to convergence, we use our local, faster GPU server.
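
A sweep like this is easy to script. The sketch below varies a single hyperparameter (the width of one layer) and trains each candidate to its solver’s max_iter; make_net and make_solver are hypothetical helpers in the spirit of the NetSpec example above, and ‘accuracy’ is assumed to be an output blob of the test network. On Amazon, each candidate would typically get its own instance rather than running sequentially:

    # A sketch of model selection: train one network per candidate width
    # and keep the best. make_net/make_solver are hypothetical helpers
    # that write prototxt files and return their paths.
    import caffe

    caffe.set_mode_gpu()

    results = {}
    for width in [128, 256, 512, 1024]:
        net_path = make_net(width)           # hypothetical: writes net prototxt
        solver_path = make_solver(net_path)  # hypothetical: writes solver prototxt
        solver = caffe.SGDSolver(solver_path)
        solver.solve()                       # trains until the solver's max_iter
        # 'accuracy' is assumed to be a top blob of the test-phase network.
        acc = solver.test_nets[0].forward()['accuracy']
        results[width] = float(acc)

    best = max(results, key=results.get)
    print('best width: %d (test accuracy %.3f)' % (best, results[best]))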

[Figure: test error of the candidate architectures; selecting the best model via experimentation.]

Amazon’s cheapest GPU offering is their g2.2xlarge instance. It contains an NVIDIA Kepler GK104 GPU (1536 CUDA cores). Our local server, with an NVIDIA Tesla K40 (2880 CUDA cores), trains about 2x as quickly as the g2.2xlarge instance. NVIDIA’s latest offering, the K80, is again almost twice as fast, benchmarked on Caffe. If you’re just getting started, it certainly makes sense to learn and experiment on an Amazon AWS instance before committing to purchasing a GPU that costs several thousand dollars. The spot price for Amazon’s g2.2xlarge instance generally hovers around 8 cents per hour.
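
If you go the spot route, requests can be scripted. Here is a minimal sketch using boto; the region and AMI ID are hypothetical placeholders, and in practice you would bake an image with CUDA, cuDNN, and Caffe preinstalled. Bidding a little above the typical spot price reduces the chance of being interrupted mid-experiment:

    # A minimal sketch of bidding on a g2.2xlarge spot instance with boto.
    # The region and AMI ID are hypothetical placeholders.
    import boto.ec2

    conn = boto.ec2.connect_to_region('us-east-1')
    requests = conn.request_spot_instances(
        price='0.10',             # maximum bid, in USD per hour
        image_id='ami-00000000',  # hypothetical AMI with the Caffe stack baked in
        instance_type='g2.2xlarge',
        count=1)
    print('spot request submitted: %s' % requests[0].id)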

If you are an academic research institution, you may be eligible for NVIDIA’s Academic Hardware Donation program. They provide free top-end GPUs to labs that are just getting started in this field.

It’s not that hard!

To conclude, it is not difficult to integrate a robust, optimized deep neural network into a production environment. Caffe is well supported by a large community of developers and users. NVIDIA recognizes that this is an important market and is making a concerted effort to make its hardware and libraries a good fit for these problems. And Amazon’s GPU instances are inexpensive enough to allow quick experimentation.

About the Author

Sancho McCann (@sanchom) is the Head of Research and Development at BitLit Media Inc. He has a Ph.D. in Computer Vision from the University of British Columbia.
