By Ben Schaap
Google has announced Dataset Search, a search engine that helps users discover datasets from different sources. In a press statement, Google says the service works much like its existing Google Scholar service for academic publications. To surface the right kind of content and make searching easier, Google reports that it has developed guidelines for data providers.
Google says: "We developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem."
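To make the idea of describing a dataset with schema.org more concrete, here is a minimal sketch in Python that builds a JSON-LD description of the schema.org `Dataset` type and wraps it in a script tag for embedding in a web page. It covers the items named in the quote: creator, publication date, collection method and terms of use. The helper functions, the dataset name, the organization and the example.org URL are all hypothetical and chosen only for illustration; the property names (`creator`, `datePublished`, `measurementTechnique`, `license`) come from the schema.org vocabulary, but this is not a statement of what Google requires.

```python
import json


def build_dataset_jsonld(name, description, url, creator,
                         date_published, license_url, collection_method):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper for illustration; not part of any Google tooling.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # How the data was collected
        "measurementTechnique": collection_method,
        # Terms for using the data
        "license": license_url,
    }


def to_script_tag(doc):
    """Wrap a JSON-LD dictionary in a script tag for embedding in HTML."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# All values below are made up for the example.
example = build_dataset_jsonld(
    name="Example crop yield survey",
    description="Hypothetical yearly crop yield figures per region.",
    url="https://example.org/datasets/crop-yield",
    creator="Example Agricultural Institute",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="Field survey",
)

print(to_script_tag(example))
```

A data provider would place the printed script tag in the HTML of the page that presents the dataset, so that crawlers reading schema.org markup can pick up the description.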
This initiative is still in beta, but it will no doubt be useful to anyone working with data. It is encouraging that Google is adopting good practices within the service, for example by applying an open standard for describing the data. The development is worth following closely to see how Google's Dataset Search will fit with GODAN’s vision of A Global Data Ecosystem for Agriculture and Food and with the Farm Data Train.