Google recently unveiled its dataset search engine. In a blog post, the company explained that the new service targets users who need to quickly find and use datasets, reducing the hassle of sifting through thousands of cluttered search results. Google has previously offered many other specialised search services, including engines for patents, scholarly articles, and images. The introduction of the dataset engine reflects a broader trend towards openly accessible datasets, sought by several industries including the artificial intelligence community.
The search engine itself builds upon the existing Google platform, indexing millions of datasets across websites. With data being the oil of the 21st century, it is no surprise that Google, which has itself thrived on models of data utilisation, has chosen to encourage greater accessibility for publicly available data. Much like its previous services, it has provided a series of guidelines for site administrators who want to make their datasets visible to Google's crawlers. With datasets offered across multiple websites, including Data.World and GitHub, which allow third-party users to contribute their own data to the public, this is another step towards greater data accessibility.
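Google's guidelines for publishers centre on structured metadata: a page describing a dataset embeds a schema.org `Dataset` description, typically as JSON-LD, which the crawler can index. A minimal sketch follows; the dataset name, URLs, and organisation below are illustrative placeholders, not a real dataset:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example City Air Quality Readings",
  "description": "Hourly PM2.5 readings from hypothetical sensors, 2015-2018.",
  "url": "https://example.org/datasets/air-quality",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "creator": {
    "@type": "Organization",
    "name": "Example Open Data Initiative"
  },
  "distribution": {
    "@type": "DataDownload",
    "encodingFormat": "text/csv",
    "contentUrl": "https://example.org/datasets/air-quality.csv"
  }
}
</script>
```

Fields such as `license` and `distribution` are optional in the schema, but richer markup generally makes a dataset easier to surface in search results.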
It would appear that the dataset search engine is a byproduct of some of Google's value-added services for developers. Google already offers Application Programming Interfaces (APIs) that let external applications query its services directly. This includes BigQuery, which gives developers the opportunity to run such queries and present the results within their own applications. To offer services that can compete with the likes of Amazon Web Services and Microsoft Azure, it is likely that dataset search will eventually become a specialised option in this portfolio.
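As a sketch of how such in-application queries work, the snippet below uses the `google-cloud-bigquery` Python client against one of Google's public datasets (`bigquery-public-data.usa_names.usa_1910_2013`, a real public table). Running the query itself requires a Google Cloud project, credentials, and network access, so the client call here is illustrative:

```python
def top_names_sql(limit=5):
    """Build a SQL query against BigQuery's public USA names dataset."""
    return (
        "SELECT name, SUM(number) AS total "
        "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
        "GROUP BY name ORDER BY total DESC "
        f"LIMIT {int(limit)}"
    )


def print_top_names(limit=5):
    # Imported lazily so the query builder above works even without the
    # client library installed (pip install google-cloud-bigquery).
    from google.cloud import bigquery

    # Client() picks up the project and credentials from the environment,
    # e.g. via GOOGLE_APPLICATION_CREDENTIALS.
    client = bigquery.Client()
    for row in client.query(top_names_sql(limit)).result():
        print(row.name, row.total)
```

An application would call `print_top_names()` (or consume the rows directly) and render the results in its own interface, which is the pattern the article alludes to.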