Key takeaways
- The transformer architecture has proved to be revolutionary in outperforming the classical RNN and CNN models in use today.
- Artificial intelligence is simply a recent form of automation, just like all other automation.
- AI consultants will always be necessary to implement AI.
- Understand transformers from a cognitive science perspective with the book Transformers for Natural Language Processing.
The transformer architecture is both revolutionary and disruptive, making it the hottest algorithm in AI. It is a game-changer for Natural Language Understanding (NLU), a subset of Natural Language Processing (NLP), which has become one of the pillars of artificial intelligence in a global digital economy. Transformers can outperform the classical RNN and CNN models in use today.
We interviewed artificial intelligence expert Denis Rothman about transformers, their advancement in artificial intelligence and NLP, and his recent book Transformers for Natural Language Processing.
What’s the significance of AI language understanding in the tech world today and what role do transformers play in it?
Artificial intelligence-driven language understanding is expanding exponentially. It has become the pillar of language modeling, chatbots, personal assistants, question answering, text summarizing, speech-to-text, sentiment analysis, machine translation, and more.
The Transformer, introduced by Google, provides novel approaches to language understanding through a novel self-attention architecture. OpenAI offers transformer technology, and Facebook’s AI Research department provides high-quality datasets. Overall, the Internet giants have made transformers available to all, as you will discover in my book.
The transformer architecture is both revolutionary and disruptive. The Transformer and subsequent transformer architectures and models are revolutionary because they changed the way we think of NLP and artificial intelligence itself. The architecture of the Transformer is not an evolution. It breaks with the past, leaving RNNs and CNNs behind. It takes us closer to seamless machine intelligence that will match human intelligence in the years to come.
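To make the self-attention idea mentioned above more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not code from the book: the function names and the toy vectors are made up, and the raw token vectors are used directly as queries, keys, and values, whereas a real transformer first applies learned projection matrices and uses several attention heads.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy 3-token sequence with 2-dimensional vectors (made-up numbers).
# Assumption: queries, keys, and values are the same raw vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` shows how much one token attends to every token in the sequence, including itself. Every token is compared with every other token in one pass, which is the break from the step-by-step processing of an RNN.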
What should deep learning & NLP practitioners keep in mind while starting their career with transformers?
The world of artificial intelligence is undergoing an exponential evolution in NLP due to the amount of data available. As this evolution expands to all domains, new abilities are required. NLP will not just be about downloading a model and getting to work in terms of software. You will have to analyze the quality of what a transformer model produces to fine-tune it.
In turn, to analyze NLP properly, a minimum knowledge of linguistics will become mandatory. Linguistics will enable you to understand the building blocks and structure of a language. Grammar will increase your ability to analyze the output of a transformer. Otherwise, your team will have to hire a linguist, which will increase the project’s cost and threaten the team’s return on investment (ROI).
What are some future advancements that you anticipate in transformers and NLP?
Transformers have wiped RNNs off the map at this point. They represent the industrialization of artificial intelligence, taking AI from hype to an industrial level. Unlike traditional deep learning models, transformers contain optimized layers for GPUs and CPUs.
In the future, creating NLP models will require machine architecture awareness. Machine performance will be the key to more efficient models. Not everybody can purchase or rent a supercomputer to train a model. Learning how to design tailored transformer models based on optimized datasets will become mandatory to face competition.
What are some of the popular myths around transformers prevalent in the tech market?
Many people believe that transformers can perform all NLP tasks with a model such as GPT-3. Nothing could be further from the truth. Google, Microsoft, Facebook, and Amazon, for example, need data for their everyday business and powerful NLP transformer models to analyze the billions of words coming in every day. However, the tasks are limited to their marketing usage.
If you need to implement a transformer in a specific area, you will have to build datasets. You will also have to build pipelines with classical algorithms and queries to process the data and the inputs, and to manage the outputs. In real life, that means that artificial intelligence is only one component in a long chain of classical algorithms and processes.
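The point about pipelines can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: `clean_text`, `placeholder_model`, `format_output`, and `run_pipeline` are made-up names, and the "model" is a trivial keyword rule standing in for a real transformer call.

```python
import re


def clean_text(raw):
    """Classical preprocessing: collapse whitespace and trim the ends."""
    return re.sub(r"\s+", " ", raw).strip()


def placeholder_model(text):
    """Stand-in for a transformer; a real project would call a model here."""
    positive_words = {"good", "great", "excellent"}
    hits = sum(
        1 for word in text.lower().split()
        if word.strip(".,!?") in positive_words
    )
    return {"label": "positive" if hits else "neutral", "hits": hits}


def format_output(text, prediction):
    """Classical postprocessing: shape the result for downstream systems."""
    return f"{prediction['label'].upper()}: {text}"


def run_pipeline(raw_inputs):
    """Chain the classical steps around the single model step."""
    results = []
    for raw in raw_inputs:
        text = clean_text(raw)
        if not text:  # classical rule: skip empty records
            continue
        results.append(format_output(text, placeholder_model(text)))
    return results


report = run_pipeline(
    ["  The   service was great! ", "", "Delivery took three days."]
)
```

Only one of the four functions would involve AI in a real project. The cleaning, filtering, and formatting around it are ordinary software.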
How was your experience building one of the very first word2matrix embedding solutions?
In the early 1980s, I managed a company with many students who wanted to learn a language. I had a choice. Increase the number of teachers or automate vast portions of the process. I decided to go for automation. Any intelligent system requires calculations. I found that converting words and word pieces into numbers was far more efficient than directly analyzing the words.
I thus created a word2vector system, patented it in 1982, wrote a textbook, and implemented it in our company. Students began to take specific courses independently in our lab without a teacher. I then went further in the next few years, writing one of the first cognitive NLP chatbots, which was successfully implemented for a large number of students.
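The general idea of converting words into numbers before analyzing them can be shown with a very small example. This is a generic sketch, not the patented system described above, whose details are not given in this interview; the function names and sample sentences are invented for illustration.

```python
def build_vocabulary(sentences):
    """Give each distinct lowercase word an integer ID, in order of first use."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab, unknown_id=-1):
    """Replace each word with its ID; unseen words get unknown_id."""
    return [vocab.get(word, unknown_id) for word in sentence.lower().split()]


# Made-up sample sentences.
lessons = ["the cat sleeps", "the dog sleeps"]
vocab = build_vocabulary(lessons)
encoded = encode("the dog runs", vocab)
```

Once text is a list of integers, comparing, counting, and looking up words become simple numeric operations. Modern models go one step further and map each ID to a learned vector.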
Being the author of three cutting-edge AI solutions, what is your take on the shrinkage of job opportunities due to AI?
Automation began centuries ago with water mills, windmills, textile machines, locomotives, and more recently, motorized personal vehicles in the early 20th century. Tractors replaced millions of jobs in the fields.
Services are no exception. In the 1950s, hundreds of thousands of tellers, actual humans, worked in banks around the world. Today everybody goes to an ATM. ATM stands for Automated Teller Machine. “Automated teller” says it all. A person performing a service was automated.
Software has been the automation of human tasks from the beginning, from accounting to stock market management and thousands of other tasks.
Artificial intelligence is simply a recent form of automation, just like all other automation. AI cannot replace traditional mathematics in physics. The calculation of differential equations driving rockets and satellites requires classical software precision, not artificial intelligence.
AI is only a component of automation, like when cars replaced horses and all of the jobs that went with horse-driven transportation.
AI will not replace everything because AI is useless in many fields. AI consultants will always be necessary to implement AI.
Why has Python become the most suitable language for natural language processing?
It’s important not to confuse the concepts of “most used” and “most suitable.” Python is a great, intuitive language for learning AI and NLP. But it’s not a prerequisite. Python is easy to use and run, making it the shortest path, at this point, to learning AI.
But do not be mistaken. C++ skills will also be required in large real-life projects, for example.
My advice: learn AI with Python at full speed. Do some implementations with Python. But learn other languages as well, such as C++ and Java. Real-life pipelines require classical processes and algorithms, not only AI. In some projects, C++ will boost performance, for example.
Tell us about your book, Transformers for Natural Language Processing. What trajectory does your book follow to help its readers master transformers?
Reading my book on transformers will help you save weeks and maybe months of effort trying to understand how they work by watching videos and reading blogs.
The reader will begin by learning the original Transformer in depth. Once the transformer’s building blocks are mastered, the reader will learn how to train and fine-tune a transformer. The reader will then build and run the main transformer models such as BERT, RoBERTa, GPT-2, T5, and more. The models will be applied to document summarization, question answering, semantic analysis, and a wide range of other NLP tasks. The book also contains a method to analyze fake news with transformers.
The book also goes beyond the architecture of transformers and into the world of usage. You will learn how to build, train, fine-tune, and implement transformers.