Stable release of CUDA 10.0 out, with Turing support, tools and library changes


CUDA 10.0 was released in mid-September, bringing updates to the compiler, tools, and libraries. Support has also been added for the Turing architecture (compute_75 and sm_75).
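For instance (a minimal sketch; the kernel and file name are illustrative), the new Turing targets can be selected with nvcc's standard -gencode flags:

```cuda
// saxpy.cu -- illustrative kernel; compile for Turing with, e.g.:
//   nvcc -gencode arch=compute_75,code=sm_75 \
//        -gencode arch=compute_75,code=compute_75 saxpy.cu -o saxpy
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}
```

Embedding the compute_75 PTX alongside the sm_75 binary keeps the executable forward-compatible with future architectures via JIT compilation.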

Compiler changes in CUDA 10.0

The paths of some compiler components have changed. The CUDA C and CUDA C++ compiler, nvcc, is located in the bin/ directory. nvcc is built on top of the NVVM optimizer, which in turn is built on the LLVM compiler infrastructure. If you want to target NVVM directly, use the Compiler SDK available in the nvvm/ directory.

The following files are compiler-internal and can change without prior notice:

  • Any files in include/crt and bin/crt
  • Files like include/common_functions.h, include/device_double_functions.h, include/device_functions.h, include/host_config.h, include/host_defines.h, and include/math_functions.h
  • nvvm/bin/cicc
  • bin/cudafe++, bin/bin2c, and bin/fatbinary

These compilers are supported as host compilers in nvcc:

  • Clang 6.0
  • Microsoft Visual Studio 2017 (RTW, Update 8 and later)
  • Xcode 9.4
  • XLC 16.1.x
  • ICC 18
  • PGI 18.x (with -std=c++14 mode)

Note that, starting with CUDA 10.0, nvcc supports all updates of Visual Studio 2017, both earlier and newer ones.

CUDA 10.0 adds a new libNVVM API function, nvvmLazyAddModuleToProgram. It adds a module, such as libdevice, to a program lazily, which is more efficient when the program only uses a small number of functions from that module.
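A sketch of how the new call might fit into a libNVVM compile, assuming the Compiler SDK headers and that the IR and libdevice buffers were loaded elsewhere (error handling omitted):

```cuda
#include <nvvm.h>

// Compile NVVM IR, linking libdevice lazily so only the math
// functions the program actually calls are pulled in.
nvvmResult compile(const char *ir_buf, size_t ir_size,
                   const char *libdevice_buf, size_t libdevice_size) {
    nvvmProgram prog;
    nvvmCreateProgram(&prog);
    // Add the user's NVVM IR module eagerly
    nvvmAddModuleToProgram(prog, ir_buf, ir_size, "user");
    // Add libdevice lazily (new in CUDA 10.0)
    nvvmLazyAddModuleToProgram(prog, libdevice_buf, libdevice_size, "libdevice");
    nvvmResult res = nvvmCompileProgram(prog, 0, NULL);  // PTX on success
    nvvmDestroyProgram(&prog);
    return res;
}
```

On success, the generated PTX can be retrieved with nvvmGetCompiledResult.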

The --extensible-whole-program (or -ewp) option has been added to nvcc. This option can be used to do whole-program optimizations. With it, you can use CUDA dynamic parallelism features without having to use separate compilation.
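For instance (an illustrative sketch; the file name is hypothetical), a kernel using dynamic parallelism can now be compiled in whole-program mode:

```cuda
// dynpar.cu -- compile with, e.g.:
//   nvcc --extensible-whole-program -arch=sm_70 dynpar.cu -o dynpar
#include <cstdio>

__global__ void child() {
    printf("child thread %d\n", threadIdx.x);
}

__global__ void parent() {
    // Device-side launch (dynamic parallelism); with -ewp this no
    // longer forces separate compilation (-rdc=true)
    child<<<1, 4>>>();
}

int main() {
    parent<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```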

Warp matrix functions (wmma), introduced in PTX ISA version 6.0, are now fully supported.
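In CUDA C++ the warp matrix functions are exposed through <mma.h>. A minimal sketch, in which a single warp computes one 16x16x16 half-precision tile (requires sm_70 or later):

```cuda
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp multiplies a 16x16 half tile A by a 16x16 half tile B,
// accumulating into a 16x16 float tile C. Launch with 32 threads.
__global__ void wmma_16x16x16(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);      // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);  // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // C += A * B
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}
```

All 32 threads of the warp cooperate in each wmma call, so the kernel is launched as `wmma_16x16x16<<<1, 32>>>(dA, dB, dC);`.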

Tool changes

The following tools are available in the bin/ directory, except for Nsight Visual Studio Edition (VSE), which is installed as a plug-in to Microsoft Visual Studio:

  • IDEs like nsight (Linux, Mac), Nsight VSE (Windows)
  • Debuggers like cuda-memcheck, cuda-gdb (Linux), Nsight VSE (Windows)
  • Profilers like nvprof, nvvp, Nsight VSE (Windows)
  • Utilities like cuobjdump, nvdisasm, gpu-library-advisor

CUDA 10.0 now includes Nsight Compute, a set of developer tools for profiling and debugging; it is supported on Windows, Linux, and Mac. nvprof now supports the OpenMP tools interface, and NVIDIA Tools Extension API (NVTX) v3 is now supported by the profiler.
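As a sketch of how NVTX annotations look in application code (the function name is illustrative; link with -lnvToolsExt):

```cuda
#include <nvToolsExt.h>  // NVTX header shipped with the CUDA toolkit

// Annotate a phase of work so nvprof/nvvp can display it as a named
// range on the timeline, attributed to this CPU thread.
void train_step() {
    nvtxRangePushA("train_step");  // begin a named range
    /* ... GPU or CPU work for this step ... */
    nvtxRangePop();                // end the range
}
```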

Changes have also been made to the libraries cuFFT, cuBLAS, NVIDIA Performance Primitives (NPP), and cuSOLVER, which are now optimized for the Turing architecture, and there is a new library, nvJPEG, for GPU-accelerated hybrid JPEG decoding.

For a complete list of changes, visit the NVIDIA website.
