Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more

Yesterday, at the PyTorch Developer Conference, Facebook announced the release of PyTorch 1.3. This release comes with three experimental features: named tensors, 8-bit model quantization, and PyTorch Mobile. Along with these exciting features, Facebook also announced the general availability of Google Cloud TPU support and a newly launched integration with Alibaba Cloud.

Key updates in PyTorch 1.3

Named Tensors for more readable and maintainable code

Though tensors are the building blocks of modern machine learning, researchers have argued that they are “broken.” Tensors have their own share of shortcomings: they expose private dimensions, broadcast based on absolute position, and keep the type information in the documentation.

PyTorch 1.3 tries to solve this problem by introducing experimental support for named tensors, an idea proposed by Sasha Rush, an Associate Professor at Cornell Tech. He has built a library called NamedTensor, which serves as a “thin-wrapper” around the Torch tensor.

This update introduces a few changes to the API. Dimension access and reduction now use a ‘dim’ argument instead of an index. Constructing and adding dimensions requires a “name” argument. Functions now broadcast based on set operations, not through heuristic ordering rules.
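As a rough illustration of name-based dimension access, here is a minimal sketch using the experimental named tensor support in PyTorch itself. It assumes PyTorch 1.3 or later is installed; the tensor shape and the N/C/H/W dimension names are made up for the example. Note that the keyword PyTorch uses on factory functions is `names`, and that the feature may print a warning because it is experimental.

```python
import torch

# Hypothetical batch: 2 images, 3 channels, 4x4 pixels.
# The names follow the common N/C/H/W convention (an assumption for this sketch).
imgs = torch.zeros(2, 3, 4, 4, names=("N", "C", "H", "W"))
print(imgs.names)

# Look up a dimension size by name rather than by position.
num_channels = imgs.size("C")

# Reduce over the channel dimension by name instead of passing index 1.
summed = imgs.sum("C")
print(summed.names, tuple(summed.shape))

# Reorder dimensions by name, without counting positions.
channels_last = imgs.align_to("N", "H", "W", "C")
print(channels_last.names)
```

Because every operation refers to "C" rather than to position 1, the code keeps working if the layout of the tensor changes.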

8-bit model quantization for mobile-optimized AI

Quantization in deep learning is the method of approximating a neural network that uses 32-bit floating-point numbers with a neural network that uses a lower-precision numerical format. It is used to reduce the bandwidth and compute requirements of deep learning models. This is essential for on-device applications, which have limited memory and a limited compute budget.

PyTorch 1.3 brings experimental support for 8-bit model quantization with the eager mode Python API for efficient deployment on servers and edge devices. This feature includes techniques like post-training quantization, dynamic quantization, and quantization-aware training. Moving from 32 bits to 8 bits can result in two to four times faster computations with one-quarter the memory usage.
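To make the idea of approximating 32-bit values with 8-bit ones concrete, here is a small sketch of affine (scale and zero-point) quantization in plain Python. It illustrates the general technique only and is not PyTorch's implementation; the function names and the sample weights are hypothetical.

```python
def choose_qparams(values, qmin=0, qmax=255):
    """Pick a scale and zero point that map the value range onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers in [qmin, qmax], clamping anything out of range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    """Map the integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical weights from a small layer.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = choose_qparams(weights)
quantized = quantize(weights, scale, zero_point)
restored = dequantize(quantized, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage cost: 4 bytes per float32 value versus 1 byte per 8-bit value.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print(quantized, restored, max_error)
print(float32_bytes, int8_bytes)
```

Each weight is stored in one byte instead of four, which is where the one-quarter memory figure comes from, and the price is a rounding error bounded by the scale.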

PyTorch Mobile for more efficient on-device machine learning

Running machine learning models directly on edge devices is of great importance as it reduces latency. This is why PyTorch 1.3 introduces PyTorch Mobile that enables “an end-to-end workflow from Python to deployment on iOS and Android.”

The current release is experimental. In future releases, we can expect PyTorch Mobile to come with build-level optimization, selective compilation, support for QNNPACK quantized kernel libraries and ARM CPUs, further performance improvements, and more.

Model interpretability and privacy tools in PyTorch 1.3

Captum and Captum Insights

Captum is an easy-to-use model interpretability library for PyTorch. It is backed by state-of-the-art interpretability algorithms such as Integrated Gradients, DeepLIFT, and Conductance to help developers improve and troubleshoot their models. Developers can identify different features that contribute to a model’s output and improve its design.
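For readers unfamiliar with feature attribution, the following is a minimal plain-Python sketch of the idea behind Integrated Gradients: average the model's gradient along a straight path from a baseline input to the real input, then scale by the input difference. It does not use Captum's API; the toy model, its hand-written gradient, and all function names are hypothetical.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th slice of [0, 1]
        # Point on the straight line between the baseline and the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        grad_sum = [total + g for total, g in zip(grad_sum, grad)]
    # Scale the average gradient by how far each feature moved from the baseline.
    return [(xi - b) * total / steps for xi, b, total in zip(x, baseline, grad_sum)]


# Hypothetical toy model: f(x) = x0^2 + 3 * x1, with its gradient written by hand.
def model(x):
    return x[0] ** 2 + 3 * x[1]


def model_grad(x):
    return [2 * x[0], 3.0]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(model_grad, x, baseline)

print(attributions)
print(sum(attributions), model(x) - model(baseline))
```

The attributions add up (approximately) to the difference between the model's output at the input and at the baseline, which is what lets each value be read as that feature's share of the prediction.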

Facebook has also shipped an early version of Captum Insights, an interpretability visualization widget built on top of Captum. It works across images, text, and other features to help users understand feature attribution.

Check out Facebook’s announcement to learn more about Captum.

CrypTen

Machine learning via cloud-based platforms poses various security and privacy challenges. Facebook writes, “In particular, users of these platforms may not want or be able to share unencrypted data, which prevents them from taking full advantage of ML tools.” PyTorch 1.3 comes with CrypTen, a framework for privacy-preserving machine learning. It aims to make secure computing techniques accessible to machine learning practitioners.

You can find more about CrypTen on GitHub.

Libraries for multimodal AI systems

Detectron2: An object detection library implemented in PyTorch. It features support for the latest models and tasks and increased flexibility to aid computer vision research. There are also improvements in maintainability and scalability to support production use cases.

Fairseq gets speech extensions: With this release, Fairseq, a framework for sequence-to-sequence applications such as language translation, includes support for end-to-end learning for speech and audio recognition tasks.

The release of PyTorch 1.3 started a discussion on Hacker News, and naturally many developers compared it with TensorFlow 2.0.

Here’s what a user commented, “This is a common trend for being second in the market when we see Pytorch and TensorFlow 2.0, TF 2.0 was created to compete directly with Pytorch pythonic implementation (Keras based, Eager execution).”

They further added, “Facebook at least on PyTorch has been delivering a quality product. Although for us running production pipelines TF is still ahead in many areas (GPU, TPU implementation, TensorRT, TFX and other pipeline tools) I can see Pytorch catching up on the next couple of years which by my prediction many companies will be running serious and advanced workflows and we may be able to see a winner there.”

The named tensors implementation is being well received by the PyTorch community:

https://twitter.com/leopd/status/1182342855886376965

https://twitter.com/rasbt/status/1182647527906140161

These were some of the updates in PyTorch 1.3. Check out the official announcement by Facebook to learn more.
