5 min read

Tools to perform Deep Learning tasks are in abundance. You have programming languages that are adapted for the job or those specifically created to get the job done. Then, you have several frameworks and libraries which allow data scientists to design systems that sift through tonnes of data and learn from it. But a major challenge for all tools lies in tackling two primary issues:

  1. The size of the data
  2. The speed of computation

Now, with petabytes and exabytes of data, it’s become way more taxing for researchers to handle. Take image processing for example. ImageNet itself is such a massive dataset consisting of trillions of images from several distinct classes that tackling this scale is a serious lip-biting affair. The speed at which researchers are able to get actionable insights from the data is also an important factor.

Powerful hardware like multi-core GPUs, rumbling with raw power and begging to be tamed, have waltzed into the mosh pit of big data. You may try to humble these mean machines with old school machine learning stacks like R, SciPy or NumPy, but in vain. So, the deep learning community developed several powerful libraries to solve this problem, and they succeeded to an extent. But two major problems still existed – the frameworks failed to solve the problems of efficiency and flexibility together. This is where a one-of-a-kind, powerful, and flexible library like MXNet rises up to the challenge and makes developers’ lives a lot easier.

What is MXNet?

MXNet sits happy at over 10k stars on Github and has recently been inducted into the Apache Software Foundation. It focuses on accelerating the development and deployment of Deep Neural Networks at scale. This means exploiting the full potential of multi-core GPUs to process tonnes of data at blazing fast speeds. We’ll take a look at some of MXNet’s most interesting features over the next few minutes.

Why is MXNET so good?


MXNet is backed by a C++ backend which allows it to be extremely fast on even a single machine. It allows for automatically parallelizing computation across devices as well as synchronising the computation when multithreading is introduced. Moreover, the support for linear scaling means that not only the number of GPUs can be scaled, but MXNet also supports heavily distributed computing by scaling the number of machines as well.

Moreover, MXNet has a graph optimisation layer that sits on top of a dynamic dependency scheduler, which enhances memory efficiency and speed.

Extremely portable

MXNet is extremely portable, especially as it can be programmed in umpteen languages such as C++, R, Python, Julia, JavaScript, Go, and more. It’s widely supported across most operating systems like Linux, Windows, iOS as well as Android making it multi-platform, including low level platforms. Moreover, it works well in cloud environments, like AWS – one of the reasons AWS has officially adopted MXNet as its deep learning framework of choice. You can now run MXNet from the Deep Learning AMI.

Great for data visualization

MXNet uses not only the mx.viz.plot_network method for visualising neural networks but it also has in-built support for Graphviz, to visualise neural nets as a computation graph.

Check Joseph Paul Cohen’s blog for a great side-by-side visualisation of CNN architectures in MXNet. Alternatively, you could strip the TensorBoard off TensorFlow and use it with MXNet. Jump here for more info on that.


MXNet supports both imperative and declarative/symbolic styles of programming, allowing you to blend both styles for increased efficiency. Libraries like Numpy and Torch support plain imperative programming, while TensorFlow, Theano, and Caffe support plain declarative programming. You can get a closer look at what these styles actually are here.

MXNet is the only framework so far that mixes both styles to maximise efficiency and productivity.

In-built profiler

MXNet comes packaged with an in-built profiler that lets you profile execution times, layer-by-layer, in the network. Now, while you’ll be interested in using your own general profiling tools like gprof and nvprof to profile at kernel, function, or instruction level, the in-built profiler is specifically tuned to provide detailed information at a symbol or operator level.


While MXNet has a host of attractive features that explain why it has earned public admiration, it has its share of limitations just like any other popular tool. One of the biggest issues encountered with MXNet is that it tends to give varied results when compile settings are modified. For example, a model would work well with cuDNN3 but wouldn’t with cuDNN4. To overcome issues like this, you might have to spend some time on forums. Moreover, it is a daunting task to write your own operators or layers in C++, to achieve efficiency. Although, with v0.9, the official documentation mentions that it has become easier. Finally, the documentation is introductory and is not organised well enough to create custom operators or to perform other advanced tasks.

So, should I use MXNet?

MXNet is the new kid on the block that supports modern deep learning models like CNNs and LSTMs. It boasts of immense speed, scalability, and flexibility to solve your deep learning problems and consumes as little as 4 Gigs of memory when running deep networks with almost a thousand layers. The core library including its dependencies can mash into a single C++ source file, which can be compiled on both Android and iOS, as well as a browser with the JavaScript extensions. But like all other libraries, it has it’s own hurdles, which aren’t that life threatening enough, to prevent one from getting the job done, and a good one at that!

Is that enough to get you excited and start using MXNet? Go get working then! And don’t forget to tell us your experiences of working with MXNet.


Please enter your comment!
Please enter your name here