BabyAI: A research platform for grounded language learning with a human in the loop, by Yoshua Bengio et al.


Last week, researchers from the University of Montreal, University of Oregon, and IIT Bombay published a paper titled ‘BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop’, which introduces the BabyAI platform. The platform provides a heuristic expert agent that simulates a human teacher, and it supports grounded language learning with humans in the loop.

The BabyAI platform uses a synthetic language, called Baby Language, to give instructions to the agent. For the environment, the researchers implemented a 2D gridworld called MiniGrid, chosen because it is simple and easy to work with.

The BabyAI platform includes a verifier that checks whether an agent performing a sequence of actions in a given environment has achieved its goal.
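The verifier idea can be illustrated with a toy sketch: execute the action sequence against the environment state, then evaluate a goal predicate on the resulting state. Note this is a hypothetical illustration, not the BabyAI implementation; all names (`run_and_verify`, `step`, `goal_reached`) are made up.

```python
# Toy sketch of a verifier: apply each action to the environment state,
# then check whether the goal predicate holds on the final state.
# Hypothetical names -- this is NOT the BabyAI codebase.

def run_and_verify(env_state, actions, step, goal_reached):
    """Run the action sequence, then test the goal predicate."""
    for action in actions:
        env_state = step(env_state, action)
    return goal_reached(env_state)

# Minimal example: the goal is for the agent to reach cell (2, 2).
moves = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def step(pos, action):
    dx, dy = moves[action]
    return (pos[0] + dx, pos[1] + dy)

print(run_and_verify((0, 0), ["right", "right", "down", "down"],
                     step, lambda pos: pos == (2, 2)))  # True
```

Keeping the verifier separate from the agent means success can be checked mechanically for any policy, whether it was trained by imitation or reinforcement learning.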

Why was the BabyAI platform introduced?

It’s difficult for humans to train an intelligent agent to understand natural language instructions. No matter how advanced AI technology becomes, human users will always want to customize their intelligent helpers to better understand their desires and needs.

The main obstacle to language learning with a human in the loop is the amount of data required. Deep learning methods, used in the context of imitation learning or reinforcement learning paradigms, can be effective, but even they require enormous amounts of data, in the form of millions of reward function queries or thousands of demonstrations.

The BabyAI platform targets data efficiency: the researchers measure the minimum number of samples required to solve several levels with imitation and reinforcement learning baselines. The platform and pretrained models will also be available online, giving the community a common baseline against which to improve the data efficiency of grounded language learning.

The Baby Language

Baby Language is a combinatorially rich subset of English, designed to be easily understood by humans. In this language, the agent can be instructed to go to objects, pick them up, open doors, and put objects next to other ones. The language can also express combinations of several such tasks, for example “put a red ball next to the green box after you open the door”.

In order to keep the instructions readable by humans, the researchers have kept the language simple. The language still exhibits interesting combinatorial properties, however, and contains 2.48 × 10¹⁹ possible instructions. There are a couple of structural restrictions on this language:

  1. The ‘and’ connector can only appear inside the ‘then’ and ‘after’ forms.
  2. An instruction can contain at most one ‘then’ or ‘after’ word.
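Taking the two restrictions above literally, they can be checked over a flat word sequence. The sketch below is a hypothetical checker for illustration only; it is not the BabyAI grammar, and `satisfies_restrictions` is a made-up name.

```python
# Hypothetical checker for the two structural restrictions quoted above.
# It treats an instruction as a flat word sequence -- a sketch, not the
# actual Baby Language grammar.

def satisfies_restrictions(instruction):
    words = instruction.lower().split()
    # Restriction 2: at most one 'then' or 'after' per instruction.
    if words.count("then") + words.count("after") > 1:
        return False
    # Restriction 1: 'and' may only appear inside a 'then'/'after' form,
    # so an instruction using 'and' must also contain 'then' or 'after'.
    if "and" in words and "then" not in words and "after" not in words:
        return False
    return True

print(satisfies_restrictions(
    "open the door then go to the key and pick up the ball"))  # True
print(satisfies_restrictions(
    "go to the key and pick up the ball"))  # False (restriction 1)
print(satisfies_restrictions(
    "open the door then go to the key after you lift the box"))  # False (restriction 2)
```

Restrictions like these keep every instruction unambiguous for human readers while preserving the language’s combinatorial size.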

MiniGrid: The environment that supports the BabyAI platform

Since data-efficiency studies are very expensive, requiring multiple runs with different amounts of data, the researchers aimed to keep the design of the environment minimalistic. They implemented MiniGrid, an open-source, partially observable 2D gridworld environment that is fast and lightweight. It is available online and supports integration with OpenAI Gym.
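Partial observability here means the agent sees only a small window of cells around itself rather than the whole grid. The toy sketch below illustrates that idea; it is not the MiniGrid API, and the function name, cell symbols, and view size are all made up.

```python
# Toy illustration of a partially observable gridworld in the spirit of
# MiniGrid (NOT the MiniGrid API; names and conventions are invented).
# The agent observes only a square window of cells centred on itself.

def local_view(grid, agent_pos, radius=1):
    """Return the (2*radius+1) x (2*radius+1) window around the agent;
    cells outside the grid boundary are reported as walls ('#')."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = agent_pos
    view = []
    for r in range(r0 - radius, r0 + radius + 1):
        row = []
        for c in range(c0 - radius, c0 + radius + 1):
            if 0 <= r < rows and 0 <= c < cols:
                row.append(grid[r][c])
            else:
                row.append("#")
        view.append(row)
    return view

grid = [
    [".", ".", "K"],   # 'K': a key
    [".", ".", "."],
    ["D", ".", "."],   # 'D': a door
]

# An agent in the top-left corner sees walls beyond the boundary but
# cannot see the key or the door.
print(local_view(grid, (0, 0)))
```

Because each observation is a tiny fixed-size window rather than the full grid, episodes are cheap to simulate, which is what makes large-scale data-efficiency studies affordable.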

Experiments conducted and the results

The researchers assess the difficulty of the BabyAI levels by training an imitation learning baseline for each level. They also estimate how much data is required to solve some of the simpler levels, and study to what extent the data demands can be reduced with basic curriculum learning and interactive teaching methods.

The results suggest that current imitation learning and reinforcement learning methods both scale and generalise poorly when learning tasks with a compositional structure. Thousands of demonstrations are needed to learn tasks that seem trivial by human standards. Methods like curriculum learning and interactive learning provide measurable improvements in data efficiency, but involving an actual human in the loop would require an improvement of at least three orders of magnitude.

Future Scope

Future research is directed towards finding strategies to improve the data efficiency of language learning. Tackling this challenge may require new models and teaching methods. Approaches that involve an explicit notion of modularity and subroutines, such as Neural Module Networks or Neural Programmer-Interpreters, also seem to be a promising direction.

To learn more about the BabyAI platform, check out the research paper by Yoshua Bengio et al.
