4 min read
The official announcement states,
“PyTorch 1.0 accelerates the workflow involved in taking breakthrough research in artificial intelligence to production deployment. With deeper cloud service support from Amazon, Google, and Microsoft, and tighter integration with technology providers ARM, Intel, IBM, NVIDIA, and Qualcomm, developers can more easily take advantage of PyTorch’s ecosystem of compatible software, hardware, and developer tools. The more software and hardware that is compatible with PyTorch 1.0, the easier it will be for AI developers to quickly build, train, and deploy state-of-the-art deep learning models.”
PyTorch is an open-source Python-based deep learning framework which provides powerful GPU acceleration. PyTorch is known for advanced indexing and functions, imperative style, integration support and API simplicity. This is one of the key reasons why developers prefer PyTorch for research and hackability. On the downside, it has struggled with adoption in production environments. The pyTorch team acknowledged this in their roadmap and have worked on improving this aspect significantly in pyTorch 1.0 not just in terms of improving the library but also by enriching its ecosystems but partnering with key software and hardware vendors.
“One of its biggest downsides has been production-support. What we mean by production-support is the countless things one has to do to models to run them efficiently at massive scale:
- exporting to C++-only runtimes for use in larger projects
- optimizing mobile systems on iPhone, Android, Qualcomm and other systems
- using more efficient data layouts and performing kernel fusion to do faster inference (saving 10% of speed or memory at scale is a big win)
- quantized inference (such as 8-bit inference)”, stated the pyTorch team in their roadmap post.
Below are some key highlights of this major milestone for PyTorch.
The JIT is a set of compiler tools for bridging the gap between research in PyTorch and production. It includes a language called Torch Script ( a subset of Python), and two ways (Tracing mode and Script mode) in which the existing code can be made compatible with the JIT.
torch.distributed new “C10D” library
The torch.distributed package and torch.nn.parallel.DistributedDataParallel module are backed by the new “C10D” library. The main highlights of the new library are:
- C10D is performance driven and operates entirely asynchronously for all backends: Gloo, NCCL, and MPI.
- Significant Distributed Data Parallel performance improvements especially for slower network like ethernet-based hosts
- Adds async support for all distributed collective operations in the torch.distributed package.
- Adds send and recv support in the Gloo backend
C++ Frontend [API Unstable]
The C++ frontend is a pure C++ interface to the PyTorch backend that follows the API and architecture of the established Python frontend. It is intended to enable research in high performance, low latency and bare metal C++ applications. It provides equivalents to torch.nn, torch.optim, torch.data and other components of the Python frontend.
The C++ frontend is marked as “API Unstable” as part of PyTorch 1.0. This means it is ready to be used for building research applications, but still has some open construction sites that will stabilize over the next month or two. In other words, it is not ready for use in production, yet.
N-dimensional empty tensors, a collection of new operators inspired from numpy and scipy and new distributions such as Weibull, Negative binomial and multivariate log gamma distributions have been introduced. There have also been a lot of breaking changes, bug fixes, and other improvements made to pyTorch 1.0.