DeepMind introduces NarrativeQA: A real-world dataset for testing the limits of Reading Comprehension

DeepMind introduces NarrativeQA, a data repository setup for understanding complex narratives.

Reading comprehension (RC)—in contrast to information retrieval—requires integrating information and reasoning about events, entities, and their relations across a full document. The question answering technique is traditionally used to assess the abilities of RC, both in AI agents and in children who are learning to read.

However, DeepMind surveyed that the existing RC datasets such as MCTest, Children’s Book Test(CBT), CNN/Daily Mail, NewsQA, SearchQA, and so on and found out certain limitations which include, presence of small datasets, unnatural data, requirement of a single sentence of information to answer the questions, and so on. Hence, these RC datasets are unable to test an important integrative aspect of machine’s Reading Comprehension.

In order to encourage deeper comprehension of language, DeepMind presents a brand new dataset and a set of tasks, known as the NarrativeQA.

This dataset includes fictional stories, which are 1,567 complete stories from books and movie scripts, with human written questions and answers based solely on human-generated abstract summaries.

The dataset is divided into three parts:

non-overlapping training
validation and
testing

There are 46,765 pairs of answers to questions written by humans and includes mostly the more complicated variety of questions such as "when / where / who / why". This dataset permits the training of neural network-based models over word embeddings and provide decent lexical coverage and diversity.Thus, this dataset would test and reward agents that approach human level of competency.

Having given a quantitative and qualitative analysis of the difficulty of the more complex tasks, DeepMind suggests research directions that may help bridge the gap between existing models and human performance. DeepMind also hopes that this dataset will serve not only as a challenge for the machine reading community, but also as a driver for the development of a new class of neural models which will take a significant step beyond the level of complexity which existing datasets and tasks permit.

To have a detailed understanding on the working of NarrativeQA dataset, you can have a look at the research paper here.