
Google Scholar - finding datasets made easy

If you have ever been in charge of building a proof of concept or running a visualization contest, you know there is an ocean of websites dedicated to hosting interesting datasets. There are so many options nowadays that it is overwhelming, and it is hard to know where to start. Another issue with this proliferation is that data for a specific topic is spread across multiple websites, at different granularities, over different time frames, or all of the above. What Google aims to provide with Dataset Search is a single place where you can search for a specific topic and see all of the associated datasets across the various websites.

To appear in the search results, Google suggests that data providers follow its published guidelines and describe their datasets using the open schema.org standard, so that a robust dataset ecosystem thrives and proves valuable for scientists, data journalists, and geeks.
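To make the schema.org standard mentioned above a little more concrete, here is a minimal Python sketch of how a data provider might describe a dataset as JSON-LD and embed it in a page. The property names (name, description, url, license, distribution) come from the schema.org Dataset type; the function names, the example.org URLs, and the sample dataset are hypothetical and only there for illustration. Which properties Google actually requires is spelled out in its guidelines, so treat this as a starting point rather than a complete recipe.

```python
import json


def build_dataset_jsonld(name, description, url, license_url, download_url):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; only a handful of the
    properties defined by schema.org/Dataset are included here.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": "text/csv",
                "contentUrl": download_url,
            }
        ],
    }
    return json.dumps(record, indent=2)


def wrap_in_script_tag(jsonld):
    """Wrap JSON-LD in the script tag used to embed it in an HTML page."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Made-up example values; example.org is a placeholder domain.
example_jsonld = build_dataset_jsonld(
    name="City park tree inventory",
    description="Location and species of trees in public parks.",
    url="https://example.org/datasets/park-trees",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    download_url="https://example.org/datasets/park-trees.csv",
)
example_html = wrap_in_script_tag(example_jsonld)
print(example_html)
```

The printed snippet would go into the HTML of the page that hosts the dataset, which is where a crawler can pick up the description.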

I spent some time working with Dataset Search and loved how easily I could find datasets on local government, the environment, and the social sciences through a UI we are all familiar with. The datasets were mostly free (some were pay to play), and in many cases multiple sources were listed in case one of the links was no longer valid. I think this project has a lot of room to run, and I look forward to an open standard where companies are willing to share their data for the greater good.

 

Take a look! 

https://scholar.google.com/
