OpenAI’s new versatile AI model, GPT-2, can write convincing fake news from just a few words

OpenAI researchers yesterday demonstrated a new AI model called GPT-2 that can generate coherent paragraphs of text without any task-specific training. In other words, give it the first line of a story and it will write the rest. Apart from generating articles, it can also perform rudimentary reading comprehension, summarization, machine translation, and question answering.

GPT-2 is an unsupervised language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. According to the OpenAI team, GPT-2 was trained simply to predict the next word in 40GB of internet text. The team states that it outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use those domain-specific training datasets.
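To make the "predict the next word" objective concrete, here is a minimal sketch in Python. It is a toy word-pair (bigram) counter, not GPT-2's architecture or OpenAI's code: the corpus, the function names, and the greedy "pick the most frequent follower" rule are all assumptions made for illustration. The idea it shows is the same at a much smaller scale: learn from raw text which word tends to come next, then extend a prompt one word at a time.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen with one."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_words=10):
    """Extend the prompt by repeatedly predicting the next word from the last one."""
    words = prompt.split()
    for _ in range(max_words):
        next_word = predict_next(counts, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


# Hypothetical toy corpus; GPT-2 was trained on 40GB of web text instead.
corpus = "the model reads text and the model predicts the next word"
counts = train_bigram(corpus)
print(generate(counts, "the", max_words=3))
```

A real language model conditions on all of the previous words rather than just the last one, and outputs a probability for every word in its vocabulary, but the train-then-extend loop has the same overall shape.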

For language-related tasks such as question answering, reading comprehension, and summarization, GPT-2 can learn these tasks directly from the raw text and doesn’t require any task-specific training data. The OpenAI team states that the GPT-2 model is ‘chameleon-like’ and easily adapts to the style and content of the input text.
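Because the model only ever continues text, a task is specified through the prompt itself rather than through extra training. The sketch below shows two prompt formats along the lines described in the GPT-2 research paper: appending "TL;DR:" to an article to induce a summary, and listing "english sentence = french sentence" pairs to induce translation. The helper names and the sample sentences are hypothetical, and the code only builds the prompt strings; it does not call any model.

```python
def summarization_prompt(article):
    """Append a 'TL;DR:' cue so that the natural continuation is a summary."""
    return article.strip() + "\nTL;DR:"


def translation_prompt(examples, sentence):
    """List example pairs as 'english = french', then leave the last one open."""
    lines = [f"{english} = {french}" for english, french in examples]
    lines.append(f"{sentence} =")
    return "\n".join(lines)


# Hypothetical inputs, for illustration only.
print(summarization_prompt("OpenAI demonstrated a new language model called GPT-2."))
print(translation_prompt([("good morning", "bonjour")], "thank you"))
```

In both cases the text the model generates after the prompt is read off as the answer, which is why no labeled task data is needed.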

However, the team has observed certain failures in the model, such as repetitive text, world modeling failures, and unnatural topic switching. How many attempts it takes to get a good sample depends on how familiar the model is with the context. For instance, when the model is prompted with topics that are ‘highly represented in data’, like Miley Cyrus or Lord of the Rings, it is able to generate reasonable samples about 50% of the time. On the other hand, the model performs poorly on highly technical or complex content.

The OpenAI team envisions GPT-2 being used to develop AI writing assistants, advanced dialogue agents, unsupervised translation between languages, and enhanced speech recognition systems. It has also outlined potential misuses: GPT-2 could be used to generate misleading news articles and to automate the large-scale production of fake and phishing content on social media.

Due to concerns about this kind of misuse of language-generating models, OpenAI has decided to release only a ‘small’ version of GPT-2, along with its sampling code and research paper, for researchers to experiment with. The dataset, training code, and GPT-2 model weights have been excluded from the release.

The OpenAI team states that this release strategy will give it and the wider AI community time to discuss the implications of such systems in more depth. It also wants governments to take initiatives to monitor the societal impact of AI technologies and to track the progress of capabilities in these systems. “If pursued, these efforts could yield a better evidence base for decisions by AI labs and governments regarding publication decisions and AI policy more broadly”, states the OpenAI team.

Public reaction to the news has been largely positive. However, not everyone is comfortable with OpenAI’s release strategy; some feel that the move signals a shift towards ‘closed AI’ and propagates the ‘fear of AI’.

For more information, check out the official OpenAI GPT-2 blog post.

Natasha Mathur

Tech writer at the Packt Hub. Dreamer, book nerd, lover of scented candles, karaoke, and Gilmore Girls.
