Two days ago, researchers from Facebook AI Research published a paper titled “CraftAssist: A Framework for Dialogue-enabled Interactive Agents”. The authors of this research are Facebook AI research engineers Jonathan Gray and Kavya Srinet, Facebook AI research scientist C. Lawrence Zitnick and Arthur Szlam and Yacine Jernite, Haonan Yu, Zhuoyuan Chen, Demi Guo and Siddharth Goyal.
The paper describes the implementation of an assistant bot called CraftAssist which appears and interacts like another player, in the open sandbox game of Minecraft. The framework enables players to interact with the bot via in-game chat through various implemented tools and platforms. The players can also record these interactions through an in-game chat. The main aim of the bot is to be a useful and entertaining assistant to all the tasks listed and evaluated by the human players.
Image Source: CraftAssist paper
For motivating the wider AI research community to use the CraftAssist platform in their own experiments, Facebook researchers have open-sourced the framework, the baseline assistant, data and the models. The released data includes the functions which was used to build the 2,586 houses in Minecraft, the labeling data of the walls, roofs, etc. of the houses, human rephrasing of fixed commands, and the conversion of natural language commands to bot interpretable logical forms. The technology that allows the recording of human and bot interaction on a Minecraft server has also been released so that researcher will be able to independently collect data.
Why is the Minecraft protocol used?
Minecraft is a popular multiplayer volumetric pixel (voxel) 3D game based on building and crafting which allows multiplayer servers and players to collaborate and build, survive or compete with each other. It operates through a client and server architecture. The CraftAssist bot acts as a client and communicates with the Minecraft server using the Minecraft network protocol.
The Minecraft protocol allows the bot to connect to any Minecraft server without the need for installing server-side mods. This lets the bot to easily join a multiplayer server along with human players or other bots. It also lets the bot to join an alternative server which implements the server-side component of the Minecraft network protocol. The CraftAssist bot uses a 3rd-party open source Cuberite server. It is a fast and extensible game server used for Minecraft.
Read More: Introducing Minecraft Earth, Minecraft’s AR-based game for Android and iOS users
How does the CraftAssist function?
The block diagram below demonstrates how the bot interacts with incoming in-game chats and reaches the desired target.
Image Source: CraftAssist paper
- Firstly, the incoming text is transformed into a logical form called the action dictionary. The action dictionary is then translated by a dialogue object which interacts with the memory module of the bot. This produces an action or a chat response to the user.
- The bot’s memory uses a relational database which is structured to recognize the relation between stored items of information. The major advantage of this type of memory is the easy to convert semantic parser, which is converted into a fully specified tasks.
- The bot responds to higher-level actions, called Tasks. Tasks are an interruptible process which follows a clear objective of step by step actions. It can adjust to long pauses between steps and can also push other Tasks onto a stack, like the way functions can call other functions in a standard programming language. Move, Build and Destroy are few of the many basic Tasks assigned to the bot.
- The The Dialogue Manager checks for illegal or profane words, then queries the semantic parser. The semantic parser takes the chat as input and produces an action dictionary.
- The action dictionary indicates that the text is a command given by a human and then specifies the high-level action to be performed by the bot.
- Once the task is created and pushed onto the Task stack, it is the responsibility of the command task ‘Move’ to compare the bot’s current location to the target location. This will make the bot to undertake a sequence of low-level step movements to reach the target.
- The core of the bot’s understanding of natural language depends on a neural semantic parser called the Text-toAction-Dictionary (TTAD) model. This model receives the incoming command/chat and then classifies it into an action dictionary which is interpreted by the Dialogue Object.
The CraftAssist framework thus enables the bots in Minecraft to interact and play with players by understanding human interactions, using the implemented tools. The researchers hope that since the dataset of CraftAssist is now open-sourced, more developers will be empowered to contribute to this framework by assisting or training the bots, which might lead to the bots learning from human dialogue interactions, in the future.
Developers have found the CraftAssist framework interesting.
That's really cool!
— Djamé (@zehavoc) July 18, 2019
A user on Hacker News comments, “Wow, this is some amazing stuff! Congratulations!”
Check out the paper CraftAssist: A Framework for Dialogue-enabled Interactive Agents
for more details.