Google-Landmarks, a novel dataset for instance-level image recognition

Image retrieval and image recognition are fundamental problems in machine learning and computer vision. Image classification has progressed remarkably over the past few years. One obstacle to further progress, however, is the scarcity of large annotated datasets. Google aims to address this with Google-Landmarks, a worldwide dataset for recognizing human-made and natural landmarks.

This dataset was built to support research on fine-grained and instance-level recognition. One example is identifying specific landmarks in images (Eiffel Tower, Mount Fuji, Taj Mahal, etc.); landmarks account for a large portion of what people like to photograph. Landmark recognition predicts landmark labels directly from image pixels, which can help people better understand and organize their photo collections. The Google-Landmarks dataset contains more than 2 million images depicting 30 thousand unique landmarks from across the world, a number of classes almost 30x larger than in commonly used datasets.

Geographic distribution of landmarks in the Google-Landmarks dataset

To advance research in this area, Google has also open-sourced Deep Local Features (DELF), an attentive local feature descriptor that is useful for large-scale instance-level image recognition. DELF detects and describes semantic local features that can be geometrically verified between images showing the same object instance. It is also optimized for landmark recognition.
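
To make the idea of geometric verification concrete, here is a minimal Python sketch. It is not DELF's code. It assumes putative keypoint matches between two images are already available as coordinate pairs, and it fits a simple 2D similarity transform with a RANSAC-style loop as a stand-in for whatever model a real pipeline would use. The function name `count_inliers` and its parameters are hypothetical.

```python
import random


def count_inliers(matches, n_iters=200, tol=3.0, seed=0):
    """RANSAC-style geometric check over putative keypoint matches.

    matches: list of ((x1, y1), (x2, y2)) pairs, one keypoint location
    per image. Returns the largest number of matches that agree with a
    single 2D similarity transform (rotation, uniform scale, translation).
    """
    if len(matches) < 2:
        return 0  # two correspondences are needed to define the transform

    # Treat points as complex numbers, so a similarity is w = a * z + b.
    src = [complex(*p) for p, _ in matches]
    dst = [complex(*q) for _, q in matches]

    rng = random.Random(seed)
    best = 0
    for _ in range(n_iters):
        # Hypothesize a transform from two randomly chosen matches.
        i, j = rng.sample(range(len(matches)), 2)
        if src[i] == src[j]:
            continue  # degenerate sample, cannot define a transform
        a = (dst[i] - dst[j]) / (src[i] - src[j])
        b = dst[i] - a * src[i]
        # Count matches whose mapped location lands within tol pixels.
        inliers = sum(abs(a * z + b - w) < tol for z, w in zip(src, dst))
        best = max(best, inliers)
    return best
```

A high inlier count suggests that both images show the same object instance; what counts as high enough is left to the application.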

Google-Landmarks is being released as part of the Landmark Recognition and Landmark Retrieval Kaggle challenges. The recognition challenge calls for developers to build models that recognize the correct landmark (if any) in a dataset of challenging test images. In the retrieval challenge, developers are given query images and, for each query, are expected to retrieve all database images containing the same landmarks (if any). Participants are encouraged to compete in both challenges, since the test set is the same for both. Participants may also use the training data from the recognition challenge to train models that could be useful for the retrieval challenge. However, there are no landmarks in common between the training/index sets of the two challenges.
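
The retrieval task can be sketched as ranking index images by descriptor similarity. The sketch below is an illustration, not a challenge baseline. It assumes every image has already been summarized as a fixed-length vector (how that vector is computed is out of scope here), and the names `cosine` and `retrieve` and the `min_score` cutoff are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, index, min_score=0.8):
    """Rank index images against a query descriptor, best match first.

    index: dict mapping image id -> descriptor vector.
    Images scoring below min_score are dropped, so the result can be
    empty, which covers the "if any" case in the task description.
    """
    scored = [(cosine(query, vec), image_id) for image_id, vec in index.items()]
    scored.sort(reverse=True)
    return [image_id for score, image_id in scored if score >= min_score]
```

The cutoff trades recall for precision: lowering `min_score` returns more candidates per query, at the cost of more false matches.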

These challenges are the focal point of the CVPR’18 Landmarks workshop. More details on the challenges and the dataset can be found on the Google Research blog.