
Last week, the team at Facebook AI Research announced that they are open sourcing the PyText NLP framework. PyText, a deep-learning-based NLP modeling framework, is built on PyTorch. Facebook is open sourcing some of the conversational AI technology that powers the Portal video chat display and M suggestions on Facebook Messenger.

How is PyText useful for Facebook

The PyText framework is used for tasks like document classification, semantic parsing, sequence tagging, and multitask modeling. It fits into both research and production workflows and emphasizes robustness and low latency to meet Facebook’s real-time NLP needs.

PyText is also responsible for models powering more than a billion daily predictions at Facebook. The framework addresses the conflicting requirements of enabling rapid experimentation and serving models at scale by providing simple interfaces and abstractions for model components. It uses PyTorch’s capability to export models for inference through the optimized Caffe2 execution engine.

Features of PyText

  • PyText features production-ready models for various NLP/NLU tasks such as text classifiers, sequence taggers, etc.
  • PyText comes with distributed-training support, built on the new C10d backend in PyTorch 1.0.
  • It comes with training support and also features extensible components that help in creating new models and tasks.
  • The framework’s modularity makes it possible to create new pipelines from scratch and to modify existing workflows.
  • It comes with a simplified workflow for faster experimentation.
  • It gives access to a rich set of prebuilt model architectures for text processing and vocabulary management.
  • It serves as an end-to-end platform for developers.
  • Its modular structure helps engineers to incorporate individual components into existing systems.
  • It adds support for string tensors to work efficiently with text in both training and inference.
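
The modularity described in the list above can be pictured with a small sketch. This is not PyText's API: the names Pipeline, KeywordClassifier, and whitespace_tokenizer are hypothetical and chosen only for illustration. It shows, under those assumptions, how a pipeline built from independent components lets one part be swapped without touching the others.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# All names below are hypothetical; this is an illustration, not PyText code.
Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


@dataclass
class KeywordClassifier:
    """Toy model: returns the label whose keyword set overlaps the tokens most."""
    keywords: Dict[str, Set[str]]
    default: str = "unknown"

    def predict(self, tokens: List[str]) -> str:
        token_set = set(tokens)
        scores = {label: len(words & token_set) for label, words in self.keywords.items()}
        best = max(scores, key=scores.get)
        # Fall back to the default label when no keyword matched at all.
        return best if scores[best] > 0 else self.default


@dataclass
class Pipeline:
    """Composes a tokenizer and a model; each part can be replaced independently."""
    tokenizer: Tokenizer
    model: KeywordClassifier

    def run(self, text: str) -> str:
        return self.model.predict(self.tokenizer(text))


pipeline = Pipeline(
    tokenizer=whitespace_tokenizer,
    model=KeywordClassifier(
        keywords={"set_alarm": {"alarm", "wake"}, "play_music": {"play", "song"}}
    ),
)
print(pipeline.run("Play my favorite song"))
```

Because the tokenizer is just a field on the pipeline, a different tokenizer (or a different model object with a predict method) could be dropped in without rewriting the rest, which is the kind of reuse the feature list points to.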

PyText for NLP development

PyText improves the workflow for NLP and supports distributed training for speeding up NLP experiments that require multiple runs.

Easily portable

The PyText models can be easily shared across different organizations in the AI community.

Prebuilt models

With models focused on NLP tasks such as text classification, word tagging, semantic parsing, and language modeling, this framework makes it easy to use prebuilt models on new data.

Contextual models

To improve conversational understanding in various NLP tasks, PyText uses contextual information, such as an earlier part of a conversation thread. There are two contextual models in PyText: a SeqNN model for intent labeling tasks and a Contextual Intent Slot model for joint training on both tasks.
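
As a rough illustration of what "using an earlier part of a conversation thread" can mean, the sketch below prepends recent turns to the current utterance before it would be handed to a model. This is one simple, generic approach and is not taken from PyText; the function name build_contextual_input, the max_turns parameter, and the " [SEP] " separator are all assumptions made for this example.

```python
from typing import List


def build_contextual_input(
    history: List[str],
    current: str,
    max_turns: int = 2,
    sep: str = " [SEP] ",
) -> str:
    """Prefix the current utterance with the most recent conversation turns.

    Hypothetical helper for illustration; not part of PyText.
    """
    # Guard against max_turns == 0, since history[-0:] would return everything.
    recent = history[-max_turns:] if max_turns > 0 else []
    return sep.join(recent + [current])


thread = ["I want to book a flight", "Where to?", "Make it Boston"]
model_input = build_contextual_input(thread[:-1], thread[-1])
print(model_input)
```

Taken alone, "Make it Boston" says little about intent; joined with the earlier turns, the same utterance carries enough information for a flight-booking intent to be plausible.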

PyText exports models to Caffe2

PyText uses PyTorch 1.0’s capability to export models for inference through the optimized Caffe2 execution engine. Native PyTorch models require a Python runtime, which does not scale well because of the multithreading limitations of Python’s Global Interpreter Lock. Exporting to Caffe2 provides an efficient multithreaded C++ backend for serving huge volumes of traffic.

PyText’s capabilities to test new state-of-the-art models will be improved further in the next release. Since putting sophisticated NLP models on mobile devices is a big challenge, the team at Facebook AI Research will work towards building an end-to-end workflow for on-device models.

The team plans to add support for multilingual modeling and other modeling capabilities. They also plan to make models easier to debug and might add further optimizations for distributed training.

“PyText has been a collaborative effort across Facebook AI, including researchers and engineers focused on NLP and conversational AI, and we look forward to working together to enhance its capabilities,” said the Facebook AI Research team.

Users are excited about this news and want to explore more.

To learn more, check out the release notes on GitHub.
