Data

Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages

2 min read

Last week, Hugging Face, startup specializing in natural language processing, released a landmark update to their popular Transformers library, offering unprecedented compatibility between two major deep learning frameworks, PyTorch and TensorFlow 2.0.

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.

Transformers 2.0 embraces the ‘best of both worlds’, combining PyTorch’s ease of use with TensorFlow’s production-grade ecosystem. The new library makes it easier for scientists and practitioners to select different frameworks for the training, evaluation and production phases of developing the same language model.

“This is a lot deeper than what people usually think when they talk about compatibility,” said Thomas Wolf, who leads Hugging Face’s data science team. “It’s not only about being able to use the library separately in PyTorch and TensorFlow. We’re talking about being able to seamlessly move from one framework to the other dynamically during the life of the model.”

“It’s the number one feature that companies asked for since the launch of the library last year,” said Clement Delangue, CEO of Hugging Face.

Notable features in Transformers 2.0

  • 8 architectures with over 30 pretrained models, in more than 100 languages
  • Load a model and pre-process a dataset in less than 10 lines of code
  • Train a state-of-the-art language model in a single line with the tf.keras fit function
  • Share pretrained models, reducing compute costs and carbon footprint
  • Deep interoperability between TensorFlow 2.0 and PyTorch models
  • Move a single model between TF2.0/PyTorch frameworks at will
  • Seamlessly pick the right framework for training, evaluation, production
  • As powerful and concise as Keras

About Hugging Face Transformers

With half a million installs since January 2019, Transformers is the most popular open-source NLP library. More than 1,000 companies including Bing, Apple or Stitchfix are using it in production for text classification, question-answering, intent detection, text generation or conversational. Hugging Face, the creators of Transformers, have raised US$5M so far from investors in companies like Betaworks, Salesforce, Amazon and Apple.

On Hacker News, users are appreciating the company and how Transformers has become the most important library in NLP.

Other interesting news in data

Baidu open sources ERNIE 2.0, continual pre-training NLP model that outperforms BERT and XLNet on 16 NLP tasks

Dr Joshua Eckroth on performing Sentiment Analysis on social media platforms using CoreNLP

Facebook open-sources PyText, PyTorch based NLP modeling framework

Fatema Patrawala

Being a Senior Content Marketing Editor at Packt Publishing, I handle vast array of content in the tech space ranging from Data science, Web development, Programming, Cloud & Networking, IoT, Security and Game development. With prior experience and understanding of Marketing I aspire to grow leaps and bounds in the Content & Digital Marketing field. On the personal front I am an ambivert and love to read inspiring articles and books on life and in general.

Share
Published by
Fatema Patrawala

Recent Posts

Top life hacks for prepping for your IT certification exam

I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…

3 years ago

Learn Transformers for Natural Language Processing with Denis Rothman

Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…

3 years ago

Learning Essential Linux Commands for Navigating the Shell Effectively

Once we learn how to deploy an Ubuntu server, how to manage users, and how…

3 years ago

Clean Coding in Python with Mariano Anaya

Key-takeaways:   Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…

3 years ago

Exploring Forms in Angular – types, benefits and differences   

While developing a web application, or setting dynamic pages and meta tags we need to deal with…

3 years ago

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

Software architecture is one of the most discussed topics in the software industry today, and…

3 years ago