
It takes time and effort to figure out what is fake news and what isn’t. Like children, we have to work our way through something we perceive as fake news.

This article is an excerpt from the book Transformers for Natural Language Processing by Denis Rothman, a comprehensive guide for deep learning and NLP practitioners, data analysts, and data scientists who want an introduction to AI language understanding to handle the growing number of language-driven functions.

In this article, we will focus on the logic of fake news. We will run a BERT model on SRL tasks and visualize the results on AllenNLP.org. Now, let’s go through some presidential Tweets on COVID-19.

Our goal is certainly not to judge anybody or anything. Fake news involves both opinion and facts. News often depends on the perception of facts by local culture. We will provide ideas and tools to help others gather more information on a topic and find their way in the jungle of information we receive every day.

Semantic Role Labeling (SRL)  

SRL is an excellent educational tool for all of us. We tend to read Tweets passively and listen to what others say about them. Breaking messages down with SRL is a good way to develop the analytical skills needed to distinguish fake from accurate information on social media.

I recommend using SRL transformers for educational purposes in class. A young student can enter a Tweet and analyze each verb and its arguments. It could help younger generations become active readers on social media.

We will first analyze a relatively undivided Tweet and then a conflictual Tweet.

Analyzing the undivided Tweet 

Let’s analyze the latest Tweet found on July 4 while writing the book, Transformers for Natural Language Processing. I took the name of the person who is referred to as a “Black American” out and paraphrased some of the former President’s text:  

“X is a great American, is hospitalized with coronavirus, and has requested prayer. Would you join me in praying for him today, as well as all those who are suffering from COVID-19?”   

Let’s go to AllenNLP.org, open the SRL demo at https://demo.allennlp.org/semantic-role-labeling, run the sentence, and look at the result. The verb “hospitalized” shows that the author is staying close to the facts:

Figure: SRL arguments of the verb “hospitalized”  

The message is simple: “X” + “hospitalized” + “coronavirus.”  
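To make the “X + hospitalized + coronavirus” reading concrete, here is a minimal sketch of how per-word SRL tags can be collapsed into arguments. It assumes the BIO tagging scheme that SRL models commonly emit (one tag per word, such as B-ARG1, I-ARG1, or O). The function name `group_srl_tags` is ours, and the tags below are written by hand for illustration; they are not the actual output of the AllenNLP demo.

```python
def group_srl_tags(words, tags):
    """Collapse per-word BIO tags into (role, phrase) pairs."""
    spans = []
    current_role, current_words = None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current_role is not None:
                spans.append((current_role, " ".join(current_words)))
            current_role, current_words = tag[2:], [word]
        elif tag.startswith("I-") and current_role == tag[2:]:
            # An I- tag with the same role continues the open span.
            current_words.append(word)
        else:
            # "O" (or a stray I- tag, which is dropped) closes the open span.
            if current_role is not None:
                spans.append((current_role, " ".join(current_words)))
            current_role, current_words = None, []
    if current_role is not None:
        spans.append((current_role, " ".join(current_words)))
    return spans


# Hand-written tags for illustration only; NOT actual model output.
words = ["X", "is", "hospitalized", "with", "coronavirus"]
tags = ["B-ARG1", "O", "B-V", "B-ARG2", "I-ARG2"]
roles = group_srl_tags(words, tags)
```

Each pair in `roles` holds a role label and the words attached to it, which is the same information the demo displays as colored spans under each verb.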

The verb “requested” shows that the message is becoming political: 

Figure: SRL arguments of the verb “requested”

We don’t know whether the person asked the former President to pray or whether the former President decided to make himself the center of the request.

A good exercise would be to display an HTML page and ask the users what they think. For example, the users could be asked to look at the results of the SRL task and answer the two following questions:  

“Was former President Trump asked to pray, or did he divert a request made to others for political reasons?”

“Is the fact that former President Trump states that he was indirectly asked to pray for X fake news or not?” 

You can think about it and decide for yourself!  
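The HTML page exercise suggested above could be sketched as follows. This is only one possible layout, using Python’s standard library: the function name `build_quiz_page`, the page structure, and the hand-written roles are our assumptions, and the form only displays the questions without submitting answers anywhere.

```python
from html import escape


def build_quiz_page(tweet, roles, questions):
    """Return a standalone HTML page showing SRL roles and open questions."""
    role_items = "\n".join(
        f"      <li><strong>{escape(role)}</strong>: {escape(text)}</li>"
        for role, text in roles
    )
    question_items = "\n".join(
        f"    <p><label>{escape(question)}<br>"
        f'<textarea name="q{n}" rows="3" cols="60"></textarea></label></p>'
        for n, question in enumerate(questions, start=1)
    )
    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>SRL exercise</title></head>
<body>
  <blockquote>{escape(tweet)}</blockquote>
  <ul>
{role_items}
  </ul>
  <form>
{question_items}
  </form>
</body>
</html>"""


page = build_quiz_page(
    "X is a great American, is hospitalized with coronavirus, "
    "and has requested prayer.",
    # Hand-written roles for illustration only; NOT actual model output.
    [("ARG0", "X"), ("V", "requested"), ("ARG1", "prayer")],
    [
        "Was former President Trump asked to pray, or did he divert "
        "a request made to others for political reasons?",
        "Is the fact that former President Trump states that he was "
        "indirectly asked to pray for X fake news or not?",
    ],
)
```

Writing `page` to a file and opening it in a browser gives the users the Tweet, the SRL arguments, and a text box for each question. All text is escaped before it is placed in the page, so a Tweet containing angle brackets cannot break the markup.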

Analyzing the Banned Tweet

Let’s have a look at a Tweet that was banned from Twitter. I took the names out, paraphrased it, and toned it down. Still, when we run it on AllenNLP.org and visualize the results, we get some surprising SRL outputs.

Here is the toned-down and paraphrased Tweet:  

These thugs are dishonoring the memory of X.  

When the looting starts, actions must be taken.  

Although I suppressed the main part of the original Tweet, we can see that the SRL task shows the bad associations made in the Tweet:  

Figure: SRL arguments of the verb “dishonoring”  

An educational approach to this would be to explain that we should not associate the arguments “thugs” and “memory” and “looting.” They do not fit together at all.  

An important exercise would be to ask a user why the SRL arguments do not fit together.  

I recommend many such exercises so that the transformer model users develop SRL skills to have a critical view of any topic presented to them.  

Critical thinking is the best way to stop the propagation of the fake news pandemic!  

We have gone through rational approaches to fake news with transformers, heuristics, and instructive websites. However, in the end, a lot of the heat in fake news debates boils down to emotional and irrational reactions.  

In a world of opinion, you will never find an entirely objective transformer model that detects fake news since opposing sides never agree on what the truth is in the first place! One side will agree with the transformer model’s output. Another will say that the model is biased and built by enemies of their opinion!  

The best approach is to listen to others and try to keep the heat down!      

Looking for the silver bullet  

Looking for a silver bullet transformer model can be time-consuming or rewarding, depending on how much time and money you want to spend on continually changing models.  

For example, a new approach to transformers can be found through disentanglement. Disentanglement in AI allows you to separate the features of a representation to make the training process more flexible. Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen designed DeBERTa, a disentangled version of a transformer, and described the model in an interesting article:  

DeBERTa: Decoding-enhanced BERT with Disentangled Attention, https://arxiv.org/abs/2006.03654

The two main ideas implemented in DeBERTa are:  

  • Disentangle the content and position in the transformer model to train the two vectors separately. 
  • Use an absolute position in the decoder to predict masked tokens in the pretraining process.
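To give a feel for the first idea, here is a minimal pure-Python sketch of a disentangled attention score as we read it in the DeBERTa paper: each token has a content vector, each clipped relative distance has a position vector, and the score sums a content-to-content, a content-to-position, and a position-to-content term, scaled by the square root of 3d. The function and variable names are ours, the vectors are toy values, and the sketch leaves out the softmax, the value projection, and the decoder step described in the second bullet.

```python
import math


def relative_bucket(i, j, k):
    """Map the signed distance i - j to an index in [0, 2k), clipping at k."""
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def disentangled_score(i, j, q_c, k_c, q_r, k_r, k):
    """Unnormalized attention score of token i attending to token j.

    q_c, k_c: content query/key vectors, one row per token.
    q_r, k_r: relative-position query/key vectors, one row per bucket (2k rows).
    """
    d = len(q_c[i])
    content_to_content = dot(q_c[i], k_c[j])
    content_to_position = dot(q_c[i], k_r[relative_bucket(i, j, k)])
    position_to_content = dot(k_c[j], q_r[relative_bucket(j, i, k)])
    total = content_to_content + content_to_position + position_to_content
    return total / math.sqrt(3 * d)


# Toy values for illustration: two tokens, 2-dimensional vectors, k = 2.
k = 2
q_c = [[1.0, 0.0], [0.0, 1.0]]
k_c = [[1.0, 0.0], [0.0, 1.0]]
q_r = [[0.0, 0.0] for _ in range(2 * k)]
k_r = [[0.0, 0.0] for _ in range(2 * k)]
score = disentangled_score(0, 0, q_c, k_c, q_r, k_r, k)
```

With all position vectors set to zero, as in this toy setup, only the content term contributes. Giving `k_r` or `q_r` nonzero rows shows how the relative position of two tokens changes the score without touching their content vectors.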

The authors provide the code on GitHub: https://github.com/microsoft/DeBERTa

DeBERTa exceeded the human baseline on the SuperGLUE leaderboard in December 2020 using 1.5B parameters.

Should you stop everything you are doing on transformers and rush to this model, integrate your data, train the model, test it, and implement it?  

It is very probable that by the end of 2021, another model will beat this one and so on. Should you change models all of the time in production? That will be your decision.  

You can also choose to design better training methods.  

Looking for reliable training methods  

Looking for reliable training methods with smaller models, such as PET (Pattern-Exploiting Training), designed by Timo Schick, can also be a solution.

Why? Being in a good position on the SuperGLUE leaderboard does not mean that the model will provide high-quality decisions in medical, legal, and other critical areas of sequence prediction.

Looking for customized training solutions for a specific topic could be more productive than trying all the best transformers on the SuperGLUE leaderboard.  
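As a rough illustration of the PET idea mentioned above, the sketch below shows only its first ingredient as we understand it: a pattern turns the input into a cloze phrase with a masked slot, and a verbalizer maps each label to a word that could fill that slot. The pattern wording, the label words, and the `toy_score` function are hypothetical; in practice, a masked language model would supply the scores, and PET adds further training steps that are not shown here.

```python
def pattern(text, mask_token="[MASK]"):
    """Wrap the input in a cloze phrase with one masked slot (hypothetical wording)."""
    return f"{text} It was {mask_token}."


# Verbalizer: label -> word expected at the mask (hypothetical choices).
VERBALIZER = {"positive": "great", "negative": "terrible"}


def classify(text, score_word):
    """Pick the label whose verbalizer word scores highest at the mask.

    score_word(cloze, word) stands in for a masked language model's score.
    """
    cloze = pattern(text)
    return max(VERBALIZER, key=lambda label: score_word(cloze, VERBALIZER[label]))


def toy_score(cloze, word):
    # Stand-in for a masked LM: favors "great" when the text mentions "love".
    favors_great = "love" in cloze
    return 1.0 if (word == "great") == favors_great else 0.0


label = classify("I love this film.", toy_score)
```

The point of the design is that the task is rephrased in a form a pretrained language model already handles, which is one reason a smaller model with a well-chosen pattern can be worth testing on a specific topic.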

Take your time to think about implementing transformers to find the best approach for your project.  

We will now conclude the article.

Fake news begins deep inside our emotional history as humans. When an event occurs, emotions take over to help us react quickly to a situation. We are hardwired to react strongly when we are threatened.  

We went through raging conflicts over COVID-19, former President Trump, and climate change. In each case, we saw that emotional reactions are the fastest ones to build up into conflicts.  

We then designed a roadmap to take the emotional perception of fake news to a rational level. We showed that it is possible to find key information in Tweets, Facebook messages, and other media. We chose news that some perceive as real and others as fake to create a rationale for teachers, parents, friends, co-workers, or just people talking.

About the Author

Denis Rothman graduated from Sorbonne University and Paris-Diderot University, patenting one of the very first word2matrix embedding solutions. Denis Rothman is the author of three cutting-edge AI solutions: one of the first AI cognitive chatbots more than 30 years ago; a profit-orientated AI resource optimizing system; and an AI APS (Advanced Planning and Scheduling) solution based on cognitive patterns used worldwide in aerospace, rail, energy, apparel, and many other fields. Designed initially as a cognitive AI bot for IBM, it then went on to become a robust APS solution used to this day. 

Via: Expert Insight
Source: Transformers for Natural Language Processing