Last week, the team at Facebook AI Research announced that it is open sourcing the PyText NLP framework. PyText, a deep-learning-based NLP modeling framework, is built on PyTorch. Facebook is open sourcing some of the conversational AI technology that powers the Portal video chat display and M suggestions on Facebook Messenger.
We are open-sourcing PyText, a framework for natural language processing. PyText is built on #PyTorch and makes it faster and easier to build deep learning models for NLP. https://t.co/FaUaV7qsPi pic.twitter.com/LRbM6iVxRw
— Facebook Engineering (@fb_engineering) December 14, 2018
How is PyText useful for Facebook
The PyText framework is used for tasks such as document classification, semantic parsing, sequence tagging, and multitask modeling. It fits into both research and production workflows and emphasizes robustness and low latency to meet Facebook’s real-time NLP needs.
PyText also powers models that make more than a billion daily predictions at Facebook. The framework addresses the conflicting requirements of enabling rapid experimentation and serving models at scale by providing simple interfaces and abstractions for model components. It uses PyTorch’s ability to export models for inference through the optimized Caffe2 execution engine.
Features of PyText
- PyText features production-ready models for various NLP/NLU tasks such as text classifiers, sequence taggers, etc.
- PyText comes with distributed-training support, built on the new C10d backend in PyTorch 1.0.
- It comes with training support and extensible components that help in creating new models and tasks.
- The framework’s modularity makes it possible to create new pipelines from scratch and to modify existing workflows.
- It comes with a simplified workflow for faster experimentation.
- It gives access to a rich set of prebuilt model architectures for text processing and vocabulary management.
- It serves as an end-to-end platform for developers.
- Its modular structure helps engineers to incorporate individual components into existing systems.
- It adds support for string tensors to work efficiently with text in both training and inference.
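The modularity described in the list above can be pictured with a small sketch. This is a toy illustration only: `Pipeline`, `whitespace_tokenizer`, `char_tokenizer`, and `keyword_classifier` are hypothetical names invented here, not PyText classes, and the "classifier" is a keyword lookup standing in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical component types, invented for this sketch (not PyText classes).
Tokenizer = Callable[[str], List[str]]
Classifier = Callable[[List[str]], str]


def whitespace_tokenizer(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def char_tokenizer(text: str) -> List[str]:
    """Split the text into individual non-space characters."""
    return [ch for ch in text.lower() if not ch.isspace()]


def keyword_classifier(tokens: List[str]) -> str:
    """Toy stand-in for a trained model: label by keyword lookup."""
    return "greeting" if "hello" in tokens else "other"


@dataclass
class Pipeline:
    """Chains interchangeable components; each can be replaced independently."""
    tokenizer: Tokenizer
    classifier: Classifier

    def predict(self, text: str) -> str:
        return self.classifier(self.tokenizer(text))


pipeline = Pipeline(whitespace_tokenizer, keyword_classifier)
print(pipeline.predict("Hello there"))  # prints "greeting"
```

Swapping `whitespace_tokenizer` for `char_tokenizer` changes the pipeline's behavior without touching the classifier, which is the kind of component-level reuse the feature list refers to.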
PyText for NLP development
PyText improves the workflow for NLP and supports distributed training for speeding up NLP experiments that require multiple runs.
The PyText models can be easily shared across different organizations in the AI community.
With models focused on NLP tasks such as text classification, word tagging, semantic parsing, and language modeling, the framework makes it easy to apply pre-built models to new data.
To improve conversational understanding, PyText can use contextual information, such as an earlier part of a conversation thread. PyText includes two contextual models: a SeqNN model for intent labeling tasks and a Contextual Intent Slot model for joint training on intent and slot tasks.
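One simple way to picture how earlier turns of a conversation might feed into a model is to prepend their tokens to the current utterance. The sketch below assumes that approach purely for illustration; it is not how SeqNN or the Contextual Intent Slot model is implemented, and `build_contextual_input`, `max_turns`, and the `<sep>` marker are hypothetical names.

```python
from typing import List


def build_contextual_input(history: List[str], current: str, max_turns: int = 2) -> List[str]:
    """Prefix the current utterance's tokens with tokens from recent turns.

    A separator token marks turn boundaries so a downstream model can tell
    context apart from the utterance being labeled.
    """
    # Guard against max_turns == 0, because history[-0:] is the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    tokens: List[str] = []
    for turn in recent:
        tokens.extend(turn.lower().split())
        tokens.append("<sep>")
    tokens.extend(current.lower().split())
    return tokens


history = ["Book a table for two", "Which restaurant"]
print(build_contextual_input(history, "The Italian one"))
```

On its own, "The Italian one" gives an intent model little to work with; with the preceding turns attached, the input also carries the booking request it refers to.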
PyText exports models to Caffe2
PyText uses PyTorch 1.0’s capability to export models for inference through the optimized Caffe2 execution engine. Native PyTorch models require a Python runtime, which does not scale well because of the multithreading limitations of Python’s Global Interpreter Lock. Exporting to Caffe2 provides a multithreaded C++ backend that can serve huge volumes of traffic efficiently.
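The general idea behind exporting, separating a model's learned parameters from the Python code that trained it, can be sketched with the standard library alone. This is an analogy under stated assumptions, not the PyTorch-to-Caffe2 export path: `ToyModel`, `export`, and `serve` are hypothetical names, and JSON stands in for a real serialized model format.

```python
import json
from typing import Dict, List


class ToyModel:
    """Hypothetical training-side model holding per-label word weights."""

    def __init__(self, weights: Dict[str, Dict[str, float]]):
        self.weights = weights

    def export(self) -> str:
        """Serialize the parameters to a language-neutral JSON string."""
        return json.dumps({"version": 1, "weights": self.weights}, sort_keys=True)


def serve(artifact: str, tokens: List[str]) -> str:
    """Run inference from the exported artifact only, without using ToyModel."""
    weights = json.loads(artifact)["weights"]
    scores = {
        label: sum(word_weights.get(tok, 0.0) for tok in tokens)
        for label, word_weights in weights.items()
    }
    return max(scores, key=scores.get)


model = ToyModel({
    "weather": {"rain": 1.0, "sunny": 1.0},
    "music": {"play": 1.0, "song": 1.0},
})
artifact = model.export()
print(serve(artifact, ["play", "a", "song"]))  # prints "music"
```

Because `serve` reads only the serialized artifact, the same file could in principle be loaded by a runtime written in another language, which is the property that lets an exported model be served outside the Python interpreter.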
PyText’s capabilities to test new state-of-the-art models will be improved further in the next release. Because putting sophisticated NLP models on mobile devices is a big challenge, the team at Facebook AI Research will work towards building an end-to-end workflow for on-device models.
The team plans to add support for multilingual modeling and other modeling capabilities. They also plan to make models easier to debug, and might add further optimizations for distributed training.
“PyText has been a collaborative effort across Facebook AI, including researchers and engineers focused on NLP and conversational AI, and we look forward to working together to enhance its capabilities,” said the Facebook AI research team.
Users are excited about this news and want to explore more.
This is cool, excited to work with it!👍
— Muhammad Irfan Kurniawan (@ezylryb_) December 15, 2018
PyText from FAIR for taking #NLProc solutions from idea to production using @PyTorch. The area of OSS NLP frameworks (esp. around PyTorch) is so rich right now. What an exciting time to do NLP research! https://t.co/dYZczRA1uZ pic.twitter.com/YFby9s6siy
— Delip Rao (@deliprao) December 14, 2018
For more details, check out the release notes on GitHub.