





















































With AI Habitat, the team aims to retain the simulation-related benefits demonstrated by past projects, such as faster experimentation and reinforcement-learning-based training, and to bring them to a widely compatible, realistic platform.
AI Habitat features a stack of three modular layers, each of which can be configured or even swapped out to work with different kinds of agents, evaluation protocols, training techniques, and environments. The simulation engine, known as Habitat-Sim, forms the base of the stack and includes built-in support for existing 3D environment data sets such as Gibson and Matterport3D. Habitat-Sim also abstracts away the details of specific data sets, so the same agent code can be applied across simulations.
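To make this concrete, here is a minimal sketch of driving Habitat-Sim through its open-source Python API, loading a scene and stepping an agent. The scene path is a placeholder, and exact configuration field and class names (e.g., CameraSensorSpec vs. SensorSpec) vary across habitat_sim releases:

```python
import habitat_sim

# Configure the simulator backend with a scene from a supported data set
# (the path below is a placeholder).
backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_id = "data/scene_datasets/example/example_scene.glb"

# Attach a single RGB camera sensor to the agent.
rgb_spec = habitat_sim.CameraSensorSpec()
rgb_spec.uuid = "rgb"
rgb_spec.sensor_type = habitat_sim.SensorType.COLOR

agent_cfg = habitat_sim.agent.AgentConfiguration(sensor_specifications=[rgb_spec])
sim = habitat_sim.Simulator(habitat_sim.Configuration(backend_cfg, [agent_cfg]))

# Step one of the default navigation actions and read the rendered frame.
observations = sim.step("move_forward")
rgb_frame = observations["rgb"]
sim.close()
```

Because the engine only sees a scene ID and sensor specs, swapping Gibson for Matterport3D (or, later, Replica) is a matter of pointing at a different asset.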
Habitat-API, the second layer in AI Habitat’s software stack, is a high-level library that defines tasks such as visual navigation and embodied question answering. The API incorporates additional data and configurations, and it simplifies and standardizes the training and evaluation of embodied agents.
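For illustration, Habitat-API exposes a familiar episode loop. This sketch follows the pattern from the project’s documentation, with a random action standing in for a trained policy:

```python
import habitat

# Load a task definition (point-goal navigation here) from a YAML config
# shipped with Habitat-API.
config = habitat.get_config("configs/tasks/pointnav.yaml")
env = habitat.Env(config=config)

observations = env.reset()
while not env.episode_over:
    # A trained policy would choose actions from observations; a random
    # sample from the action space stands in for it here.
    observations = env.step(env.action_space.sample())

print(env.get_metrics())  # task metrics, e.g. SPL for navigation
env.close()
```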
The third and final layer of the platform is where users specify training and evaluation parameters, such as how difficulty should ramp across multiple runs and which metrics to focus on.
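Concretely, these knobs live in the same configuration system. A sketch of overriding them from Python might look as follows; the specific keys shown here are illustrative and depend on the task config:

```python
import habitat

config = habitat.get_config("configs/tasks/pointnav.yaml")

# Habitat configs are yacs nodes: unfreeze, override, refreeze.
config.defrost()
config.TASK.MEASUREMENTS = ["SPL"]           # which metrics to report
config.ENVIRONMENT.MAX_EPISODE_STEPS = 500   # cap on episode length
config.freeze()

env = habitat.Env(config=config)
```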
According to the researchers, the future of AI Habitat and embodied AI research lies in simulated environments that are indistinguishable from real life.
For Replica, Facebook Reality Labs (FRL) researchers created a data set of scans of 18 scenes that range in size from an office conference room to a two-floor house. The team annotated the environments with semantic labels, such as “window” and “stairs,” down to individual objects, such as “book” and “plant.” To create the data set, FRL researchers used proprietary camera technology and a spatial AI technique based on simultaneous localization and mapping (SLAM). Replica captures fine details from the raw video, reconstructing dense 3D meshes with high-resolution, high-dynamic-range textures.
The data used to generate Replica was scrubbed of personal details, such as family photos, that could identify an individual. The researchers also had to manually fill in the small holes that scanning inevitably misses, and they used a 3D paint tool to apply annotations directly onto the meshes.
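Those semantic annotations are directly consumable in simulation. As a hedged sketch, Habitat-Sim can render per-pixel semantic IDs for a Replica scene alongside (or instead of) RGB; the mesh path below is a placeholder, and sensor class names vary across habitat_sim versions:

```python
import habitat_sim

# Load a Replica scene (placeholder path) with a semantic sensor, which
# renders per-pixel object IDs that map back to labels such as
# "window," "stairs," "book," or "plant."
backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_id = "data/replica/apartment_0/mesh.ply"

sem_spec = habitat_sim.CameraSensorSpec()
sem_spec.uuid = "semantic"
sem_spec.sensor_type = habitat_sim.SensorType.SEMANTIC

agent_cfg = habitat_sim.agent.AgentConfiguration(sensor_specifications=[sem_spec])
sim = habitat_sim.Simulator(habitat_sim.Configuration(backend_cfg, [agent_cfg]))

observations = sim.step("turn_left")
semantic_ids = observations["semantic"]  # 2D array of per-pixel object IDs
sim.close()
```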
The blog reads, “Running Replica’s assets on the AI Habitat platform reveals how versatile active environments are to the research community, not just for embodied AI but also for running experiments related to CV and ML.”
The researchers held the Habitat Challenge in April and May this year, a competition focused on evaluating goal-directed visual navigation. The aim was to demonstrate the utility of AI Habitat’s modular approach and its emphasis on 3D photo-realism.
Unlike traditional benchmarks, where participants upload predictions for a given task, this challenge required participants to upload their code. The code was then run on new environments that the agents had never seen before.
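The open-source Habitat-API hints at the shape of such a submission: participants implement an agent interface, and the platform evaluates it on held-out episodes. A minimal sketch using the library’s Agent and Benchmark classes, with a random policy standing in for a real entry:

```python
import random
import habitat

class DemoAgent(habitat.Agent):
    """Illustrative submission: picks a navigation action at random.
    The expected action format (index vs. name) depends on the API version."""

    def reset(self):
        pass  # called at the start of every episode

    def act(self, observations):
        # A competitive entry would run a learned policy here.
        return random.choice(["MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT", "STOP"])

# Benchmark runs the agent on episodes defined by the task config and
# aggregates the task's metrics (e.g., success rate, SPL).
benchmark = habitat.Benchmark("configs/tasks/pointnav.yaml")
metrics = benchmark.evaluate(DemoAgent(), num_episodes=10)
print(metrics)
```

Because only code is submitted, the evaluation harness, not the participant, decides which unseen scenes and episodes the agent faces.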
The top-performing teams were Team Arnold (a group of researchers from CMU) and Team Mid-Level Vision (a group of researchers from Berkeley and Stanford).
The blog further reads, “Though AI Habitat and Replica are already powerful open resources, these releases are part of a larger commitment to research that’s grounded in physical environments. This is work that we’re pursuing through advanced simulations, as well as with robots that learn almost entirely through unsimulated, physical training. Traditional AI training methods have a head start on embodied techniques that’s measured in years, if not decades.”
To know more about this news, check out Facebook AI’s blog post.