NVIDIA leads the AI hardware race. But which of its GPUs should you use for deep learning?

For readers who are new to deep learning and wondering what a GPU is, let’s start there. To keep it simple, think of deep learning as nothing more than a set of calculations - complex calculations, yes, but calculations nonetheless. To run these calculations, you need hardware. Ordinarily, you might just use a normal processor like the CPU inside your laptop. However, a CPU isn’t powerful enough to process data at the speed deep learning demands. A GPU is. While a conventional CPU has only a few complex cores, a GPU can have thousands of simple cores that work in parallel. With a GPU, training a deep learning model can take hours instead of days.

Although it’s clear that GPUs have significant advantages over CPUs, there is a whole range of GPUs available, each with its own strengths and weaknesses. Selecting one is ultimately a matter of knowing what your needs are. Let’s dig deeper and find out how to go about shopping for GPUs…

What should you look for before choosing a GPU?

There are a few specifications to consider before picking a GPU.

Memory bandwidth: This determines how quickly a GPU can move data between its memory and its cores. It is the most important performance metric, because with higher memory bandwidth more data can be processed in the same amount of time.

Number of cores: This indicates how fast a GPU can process data. A large number of CUDA cores can handle large datasets well. CUDA cores are parallel processors similar to the cores in a CPU, but they number in the thousands and are not suited to the complex calculations that a CPU core can perform.

Memory size: For computer vision projects, it is crucial to get as much memory as you can afford. With natural language processing, memory size does not play such an important role.
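To get a rough feel for how memory size and memory bandwidth translate into numbers, here is a back-of-the-envelope sketch. It assumes values are stored as 32-bit floats, and the batch and image sizes are hypothetical. The bandwidth figures are the ones quoted for the GTX 1050 Ti, GTX 1080 Ti and RTX 2080 Ti in this article. Real training uses considerably more memory than this, because activations, gradients and model weights also live on the GPU, so treat the result as a lower bound rather than a sizing guide.

```python
BYTES_PER_FLOAT32 = 4  # assumption: every value is a 32-bit float


def batch_bytes(batch_size, height, width, channels):
    """Bytes needed to hold one batch of images as float32 values."""
    return batch_size * height * width * channels * BYTES_PER_FLOAT32


def transfer_ms(num_bytes, bandwidth_gb_per_s):
    """Best-case time in milliseconds to read num_bytes once at the given bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3


# Hypothetical workload: 64 RGB images of 224 x 224 pixels
image_batch = batch_bytes(64, 224, 224, 3)
print(f"Input batch alone: {image_batch / 1e6:.1f} MB")

for bandwidth in (112, 484, 616):  # GB/s, as quoted in this article
    print(f"{bandwidth} GB/s: {transfer_ms(image_batch, bandwidth):.3f} ms to read the batch once")
```

The same two functions can be reused with your own batch and image sizes to compare cards before buying.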

Our pick of GPU devices to choose from

The go-to choice here is NVIDIA; its standard libraries make it simple to set things up. Other graphics cards are not very friendly in terms of the libraries supported for deep learning. The NVIDIA CUDA Deep Neural Network library (cuDNN) also has a good development community.

“Is NVIDIA Unstoppable In AI?” -Forbes

“Nvidia beats forecasts as sales of graphics chips for AI keep booming” -SiliconANGLE

AMD GPUs are powerful too but lack library support to get things running smoothly. It would be really nice to see some AMD libraries being developed to break the monopoly and give more options to the consumers.

NVIDIA RTX 2080 Ti: The RTX line of GPUs is due to be released in September 2018. The RTX 2080 Ti is expected to be twice as fast as the 1080 Ti. The price listed on the NVIDIA website for the Founders Edition is $1,199.
- VRAM: 11 GB
- Memory bandwidth: 616 GB/s
- Processing power: 4352 cores @ 1545 MHz

NVIDIA RTX 2080: This is more cost efficient than the 2080 Ti, at a listed price of $799 on the NVIDIA website for the Founders Edition.
- VRAM: 8 GB
- Memory bandwidth: 448 GB/s
- Processing power: 2944 cores @ 1710 MHz

NVIDIA RTX 2070: This is cheaper than both the 2080 Ti and the 2080, at a listed price of $599 on the NVIDIA website. Note that other versions of the RTX cards will likely be around $100 cheaper than the Founders Edition.
- VRAM: 8 GB
- Memory bandwidth: 448 GB/s
- Processing power: 2304 cores @ 1620 MHz

NVIDIA GTX 1080 Ti: Priced at $650 on Amazon. This is a higher-end option but offers great value for money, and it can also do well in Kaggle competitions. If you need more memory but cannot afford the RTX 2080 Ti, go for this.
- VRAM: 11 GB
- Memory bandwidth: 484 GB/s
- Processing power: 3584 cores @ 1582 MHz

NVIDIA GTX 1080: Priced at $584 on Amazon. This is a mid-to-high-end option, only slightly behind the 1080 Ti.
- VRAM: 8 GB
- Memory bandwidth: 320 GB/s
- Processing power: 2560 cores @ 1733 MHz

NVIDIA GTX 1070 Ti: Priced at around $450 on Amazon. This is slightly less performant than the GTX 1080 but over $100 cheaper.
- VRAM: 8 GB
- Memory bandwidth: 256 GB/s
- Processing power: 2432 cores @ 1683 MHz

NVIDIA GTX 1070: Priced at $380 on Amazon, this is currently the bestseller because of crypto miners. It is somewhat slower than the 1080 GPUs but cheaper.
- VRAM: 8 GB
- Memory bandwidth: 256 GB/s
- Processing power: 1920 cores @ 1683 MHz

NVIDIA GTX 1060 6GB: Priced at around $290 on Amazon. It is pretty cheap, but the 6 GB of VRAM limits you. It should be good for NLP, but you’ll find the performance lacking in computer vision.
- VRAM: 6 GB
- Memory bandwidth: 216 GB/s
- Processing power: 1280 cores @ 1708 MHz

NVIDIA GTX 1050 Ti: Priced at around $200 on Amazon. This is the cheapest workable option, and a good way to get started and explore if you’re new to deep learning.
- VRAM: 4 GB
- Memory bandwidth: 112 GB/s
- Processing power: 768 cores @ 1392 MHz

NVIDIA Titan Xp: The Titan Xp is also an option, but it gives only marginally better performance while being almost twice as expensive as the GTX 1080 Ti. It has 12 GB of memory, 547.7 GB/s of bandwidth and 3840 cores @ 1582 MHz. On a side note, NVIDIA Quadro GPUs are pretty expensive and don’t really help with deep learning; they are more useful for CAD and heavy graphics production tasks.

The graph below does a pretty good job of visualizing how all the GPUs above compare:

[Figure: processing power of the GPUs listed above]

Source: Slav Ivanov’s blog. Processing power is calculated as CUDA cores times the clock frequency.
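The “cores times clock frequency” figure is easy to reproduce. The sketch below applies it to a subset of the cards in this article, using the core counts and clock speeds listed above. It is a crude proxy rather than a benchmark: it ignores memory bandwidth, architecture differences and everything else that affects real training speed. The function name and the choice of unit are illustrative.

```python
# (CUDA cores, clock in MHz) as listed in this article
gpus = {
    "RTX 2080 Ti": (4352, 1545),
    "RTX 2080": (2944, 1710),
    "RTX 2070": (2304, 1620),
    "GTX 1080 Ti": (3584, 1582),
    "GTX 1080": (2560, 1733),
    "GTX 1070": (1920, 1683),
    "GTX 1060 6GB": (1280, 1708),
    "GTX 1050 Ti": (768, 1392),
}


def processing_power(cores, clock_mhz):
    """Cores times clock frequency, in billions of core-cycles per second."""
    return cores * clock_mhz * 1e6 / 1e9


# Rank the cards from highest to lowest by this metric
ranked = sorted(gpus.items(), key=lambda item: processing_power(*item[1]), reverse=True)

for name, (cores, clock) in ranked:
    print(f"{name:<13} {processing_power(cores, clock):8.1f}")
```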

Does the number of GPUs matter?

Yes, it does. But how many do you really need? What’s going to suit the scale of your project without breaking your budget? Two GPUs can get through more work than one, but they are only really worth it if you need the extra power. There are two ways to approach multi-GPU deep learning. You can train several different models at once, one on each GPU, or you can distribute the training of a single model across multiple GPUs, which is known as “multi-GPU training”. The latter approach is supported by TensorFlow, CNTK, and PyTorch.

Both of these approaches have advantages. Ultimately, it depends on how many projects you’re working on and, again, what your needs are.
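To make the second option more concrete, here is a minimal sketch of one common way to split a single model’s training across GPUs: each GPU gets a slice of the batch, computes a gradient on its slice, and the gradients are averaged before the weights are updated. Everything here is simulated in plain Python with a toy one-parameter model; no real GPUs are involved, and the function names are hypothetical. In practice, the frameworks mentioned above handle this for you.

```python
def gradient(w, xs, ys):
    """Gradient of the mean squared error of the model y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def split(items, num_parts):
    """Split a list into num_parts equally sized shards (length must divide evenly)."""
    size = len(items) // num_parts
    return [items[i * size:(i + 1) * size] for i in range(num_parts)]


def data_parallel_step(w, xs, ys, num_gpus, lr=0.01):
    """One training step: each simulated GPU handles one shard, then gradients are averaged."""
    shard_grads = [
        gradient(w, x_shard, y_shard)  # on real hardware these would run at the same time
        for x_shard, y_shard in zip(split(xs, num_gpus), split(ys, num_gpus))
    ]
    avg_grad = sum(shard_grads) / num_gpus
    return w - lr * avg_grad


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated by y = 2x

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, xs, ys, num_gpus=2)

print(f"Learned weight: {w:.3f}")
```

With equally sized shards, the averaged gradient is the same as the gradient over the full batch, so the model learns the same thing as it would on one GPU. The benefit is that each device only has to process part of the data.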

Another important point to bear in mind is that if you’re using multiple GPUs, the processor and hard disk need to be fast enough to feed data continuously - otherwise the multi-GPU approach is pointless.
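A quick way to reason about this bottleneck is to compare how fast the CPU and disk can deliver samples with how fast the GPUs can consume them. The sketch below uses made-up throughput numbers and assumes GPU capacity scales perfectly with the number of cards, which is optimistic. The function names are hypothetical.

```python
def effective_throughput(loader_samples_per_s, gpu_samples_per_s, num_gpus):
    """Samples per second actually trained: the slower of the input pipeline and the GPUs."""
    return min(loader_samples_per_s, gpu_samples_per_s * num_gpus)


def gpu_utilization(loader_samples_per_s, gpu_samples_per_s, num_gpus):
    """Fraction of the total GPU capacity that is kept busy."""
    capacity = gpu_samples_per_s * num_gpus
    return effective_throughput(loader_samples_per_s, gpu_samples_per_s, num_gpus) / capacity


# Hypothetical numbers: disk and CPU deliver 1,500 samples/s, each GPU can consume 1,000
for n in (1, 2, 4):
    rate = effective_throughput(1500, 1000, n)
    busy = gpu_utilization(1500, 1000, n)
    print(f"{n} GPU(s): {rate} samples/s, GPUs {busy:.0%} busy")
```

Once the input pipeline is the limiting factor, adding more GPUs does not raise throughput; it only lowers how busy each card is.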

[Figure]

Source: NVIDIA website

It boils down to your needs and budget; GPUs aren’t exactly cheap.

Other heavy devices

Beyond individual GPUs, there are also much larger machines built for deep learning. These include the specialized supercomputer from NVIDIA, the DGX-2, and Tensor Processing Units (TPUs) from Google.

The NVIDIA DGX-2

If you thought GPUs were expensive, let me introduce you to the NVIDIA DGX-2, the successor to the NVIDIA DGX-1. It’s a highly specialized machine; consider it a supercomputer that has been specially designed to tackle deep learning.

The price of the DGX-2 is (*gasp*) $399,000.

Wait, what? I could buy some new hot wheels for that. What you get instead is two Intel Xeon Platinum 8168 processors (2.7 GHz, 24 cores each), 16 NVIDIA GPUs, 1.5 terabytes of RAM, and nearly 32 terabytes of SSD storage! The performance here is 2 petaFLOPS.
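To put those numbers in perspective, here is a little arithmetic using only the figures quoted in this article. Dividing the whole-system price by the number of GPUs is a simplification, since the price also covers the CPUs, memory, storage and interconnect, so read the per-GPU figure as a rough share rather than a component price.

```python
# Figures quoted in this article for the DGX-2
price_usd = 399_000
num_gpus = 16
total_petaflops = 2

# Quoted performance shared evenly across the GPUs
teraflops_per_gpu = total_petaflops * 1000 / num_gpus

# Whole-system price divided by GPU count (includes CPUs, RAM and storage)
price_per_gpu_share = price_usd / num_gpus

# How many GTX 1080 Ti cards the same money would buy at the Amazon price quoted above
gtx_1080_ti_price = 650
equivalent_1080_ti_cards = price_usd // gtx_1080_ti_price

print(f"Performance per GPU: {teraflops_per_gpu:.0f} teraFLOPS")
print(f"System price per GPU: ${price_per_gpu_share:,.2f}")
print(f"GTX 1080 Ti cards for the same money: {equivalent_1080_ti_cards}")
```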

Let’s be real: many of us probably won’t be able to afford it. However, NVIDIA does have leasing options, should you choose to try it. Practically speaking, this kind of beast finds its use in research work. In fact, the first DGX-1 was gifted to OpenAI by NVIDIA to promote AI research.

Visit the NVIDIA website for more on these monster machines. There are also personal solutions available like the NVIDIA DGX Workstation.

TPUs

Now that you’ve caught your breath after reading about AI dream machines, let’s look at TPUs. Unlike the DGX machines, TPUs run in the cloud.

A TPU is an application-specific integrated circuit (ASIC) that Google has designed specifically for machine learning and deep learning. Here’s the key stat: Cloud TPUs can provide up to 11.5 petaflops of performance in a single pod.

If you want to learn more, go to Google’s website.

When choosing GPUs you need to weigh up your options

The GTX 1080 Ti is the card most commonly used by researchers and in Kaggle competitions, as it gives good value for money. Go for this if you are sure about what you want to do with deep learning. The GTX 1080 and GTX 1070 Ti are cheaper and have less computing power, making them more budget-friendly options if you cannot afford the 1080 Ti.

The GTX 1070 saves you some more money but is slower. The GTX 1060 6GB and GTX 1050 Ti are good choices if you’re just starting off in the world of deep learning and don’t want to burn a hole in your pocket.

If you must have the absolute best GPU irrespective of the cost then the RTX 2080 Ti is your choice. It offers twice the performance for almost twice the cost of a 1080 Ti.
