Yesterday, OpenAI researchers demonstrated a new AI model called GPT-2, which is capable of generating coherent paragraphs of text without any task-specific training. In other words, give it the first line of a story, and it will write the rest. Apart from generating articles, it can also perform rudimentary reading comprehension, summarization, machine translation, and question answering.
GPT-2 is an unsupervised language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. “GPT-2 is simply trained to predict the next word in 40GB of internet text,” says the OpenAI team. The team states that it outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets.
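To make “predict the next word” concrete, here is a toy, self-contained sketch of the autoregressive language-modeling objective. The bigram count table below is only a stand-in for illustration; GPT-2 itself is a large transformer, and none of the names here come from OpenAI's code.

```python
# Toy illustration of next-word prediction, the objective GPT-2 is trained on.
# A bigram count table stands in for the model; this only shows the shape of
# the training signal, not GPT-2's transformer architecture.
from collections import Counter, defaultdict
import math

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each context word.
follow = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow[prev][nxt] += 1

def next_word_probs(prev):
    """Empirical distribution over the word that follows `prev`."""
    counts = follow[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_probs("the"))   # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
# The training loss is the negative log-probability of the true next word:
print(-math.log(next_word_probs("the")["cat"]))  # ~1.386
```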
For language tasks such as question answering, reading comprehension, and summarization, GPT-2 learns directly from raw text and doesn’t require any task-specific training data. The OpenAI team describes the model as ‘chameleon-like’: it easily adapts to the style and content of the input text.
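One way this zero-shot behaviour works in practice is by framing the task inside the prompt itself; for summarization, the GPT-2 paper appends “TL;DR:” to a passage and lets the model continue. The sketch below assumes the Hugging Face transformers re-implementation rather than OpenAI's released code:

```python
# Zero-shot summarization by prompt framing: append "TL;DR:" and let the
# model continue. The Hugging Face `transformers` pipeline is an assumption
# here, not OpenAI's original sampling code.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

article = (
    "A new language model was trained on millions of web pages. "
    "It can write articles, answer questions, and translate text "
    "without any task-specific training."
)
result = generator(article + "\nTL;DR:", max_new_tokens=30,
                   do_sample=True, top_k=40)
print(result[0]["generated_text"])
```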
However, the team has observed certain failure modes, such as repetitive text, world-modeling failures, and unnatural topic switching. Whether the model produces a good sample depends on how familiar it is with the prompt’s context. For instance, when prompted with topics that are ‘highly represented in the data’, like Miley Cyrus or The Lord of the Rings, the model generates reasonable samples about 50% of the time. On the other hand, it performs poorly on highly technical or complex content.
The OpenAI team envisions GPT-2 being used in the development of AI writing assistants, more capable dialogue agents, unsupervised translation between languages, and enhanced speech recognition systems. It has also flagged potential misuses: GPT-2 could be used to generate misleading news articles and to automate the large-scale production of fake and phishing content on social media.
Due to these concerns about the misuse of language-generating models, OpenAI has decided to release only a ‘small’ version of GPT-2, along with its sampling code and a research paper, for researchers to experiment with. The dataset, training code, and full GPT-2 model weights have been excluded from the release.
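For readers who want to experiment, the sketch below shows one way to prompt the released small model to continue a story. It assumes the Hugging Face transformers re-implementation rather than OpenAI's original TensorFlow sampling code, and the prompt string is an arbitrary example, so treat it as a sketch rather than official usage.

```python
# Minimal sketch: prompt the small GPT-2 checkpoint to continue a story.
# Assumes the Hugging Face `transformers` re-implementation; "gpt2" is the
# small public checkpoint, not the withheld 1.5B-parameter model.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Once upon a time, in a land far away,"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation token by token; top_k and temperature trade
# coherence against variety.
output = model.generate(
    **inputs,
    max_length=60,
    do_sample=True,
    top_k=40,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```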
The OpenAI team states that this release strategy will give it and the wider AI community time to discuss the implications of such systems more deeply. It also wants governments to take initiatives to monitor the societal impact of AI technologies and to track the progress of capabilities in these systems. “If pursued, these efforts could yield a better evidence base for decisions by AI labs and governments regarding publication decisions and AI policy more broadly,” states the OpenAI team.
Public reaction to the news has been largely positive. However, not everyone is happy with OpenAI’s release strategy; some feel the move signals a shift towards ‘closed AI’ and propagates a ‘fear of AI’:
I'm impressed with the paper but I'm more impressed with their decision not to publish the code, pretrained model, or dataset. Do we really have to sacrifice transparency in the name of safety? Does OpenAI's decision signal progress towards Closed AI? https://t.co/ZiHPj7elO5
— Chip Huyen (@chipro) February 14, 2019
OpenAI's decision to not release the full GPT-2 is interesting, and a cool experiment in and of itself. If the only result is that the broader research community starts discussing malicious use of AI and whether this model was worth this level of caution, it would be worth it! https://t.co/zIxEdySI8p
— Eric Jang (@ericjang11) February 15, 2019
Wow, that’s pretty cool. How long until we get the first AI-authored fiction novel?
— Simon Merton (@SimonRMerton) February 14, 2019
What you are doing is opposite of open. It is unfortunate that you hype up +propagate fear + thwart reproducibility+scientific endeavor. There is active research from other groups in unsupervised language models. You hype it up like it has never been done before. @jackclarkSF
— Anima Anandkumar (@AnimaAnandkumar) February 15, 2019
It’s a win-win-win for OpenAI. (1) It’s good work. (2) Not releasing the model artificially (and unnecessarily) inflates this perspective while simultaneously prohibiting replication. (3) Get to talk about saving the world by ramping up the fear of AI.
— Mark O. Riedl (@mark_riedl) February 14, 2019
For more information, check out the official OpenAI GPT-2 blog post.
Read Next
OpenAI charter puts safety, standards, and transparency first
OpenAI launches Spinning Up, a learning resource for potential deep learning practitioners
OpenAI builds reinforcement learning based system giving robots human like dexterity