Data

Google launches a Dataset Search Engine for finding Datasets on the Internet

2 min read

Google has launched Dataset Search, a search engine for finding datasets on the internet. This search engine will be a companion of sorts to Google Scholar, the company’s popular search engine for academic studies and reports. Google Dataset Search will allow users to search through datasets across thousands of repositories on the Web whether it be on a publisher’s site, a digital library, or an author’s personal web page.

Google’s Dataset Search scrapes government databases, public sources, digital libraries, and personal websites to track down the datasets. It also supports multiple languages and will add support for even more soon. The initial release of Dataset Search will cover the environmental and social sciences, government data, and datasets from news organizations like ProPublica. It may soon expand to include more sources.

Google has developed certain guidelines for dataset providers to describe their data in way that Google can better understand the content of their pages. Anybody who publishes data structured using schema.org markup or similar equivalents described by the W3C, will be traversed by this search engine. Google also mentioned that Data Search will improve as long as data publishers are willing to provide good metadata. If publishers use the open standards to describe their data, more users will find the data that they are looking for.

Natasha Noy, a research scientist at Google AI who helped create Dataset Search, says that “the aim is to unify the tens of thousands of different repositories for datasets online. We want to make that data discoverable, but keep it where it is.

Ed Kearns, Chief Data Officer at NOAA, is a strong supporter of this project and helped NOAA make many of their datasets searchable in this tool. “This type of search has long been the dream for many researchers in the open data and science communities” he said.

Try out Google’s new Dataset Search here.

Read Next

25 Datasets for Deep Learning in IoT.

Datasets and deep learning methodologies to extend image-based applications to videos.

Google-Landmarks, novel dataset for instance-level image recognition.

Sugandha Lahoti

Content Marketing Editor at Packt Hub. I blog about new and upcoming tech trends ranging from Data science, Web development, Programming, Cloud & Networking, IoT, Security and Game development.

Share
Published by
Sugandha Lahoti
Tags: AI News

Recent Posts

Top life hacks for prepping for your IT certification exam

I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…

3 years ago

Learn Transformers for Natural Language Processing with Denis Rothman

Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…

3 years ago

Learning Essential Linux Commands for Navigating the Shell Effectively

Once we learn how to deploy an Ubuntu server, how to manage users, and how…

3 years ago

Clean Coding in Python with Mariano Anaya

Key-takeaways:   Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…

3 years ago

Exploring Forms in Angular – types, benefits and differences   

While developing a web application, or setting dynamic pages and meta tags we need to deal with…

3 years ago

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

Software architecture is one of the most discussed topics in the software industry today, and…

3 years ago