Most Popular

1500 questions
5
votes
1 answer

Open data for chemical substances, structures and products?

I'm looking for a large open dataset covering chemical compounds, substances, structures and products. A database of chemical reactions, formulas and similar would be beneficial as well. Some academic and industrial databases (not open data): The…
kenorb
  • 431
  • 4
  • 10
5
votes
2 answers

Searching for a Mergers and Acquisitions (M&A) panel data set.

I am looking for sources of available datasets on M&A. My preference would be for a clean panel dataset that might have previously been used for the purpose of studying the effect of M&A on either firm profits, or their R&D activity. I know that…
Seb
  • 225
  • 2
  • 5
5
votes
2 answers

Open replacement for cfbstats.com NCAA football CSVs

http://www.cfbstats.com/ used to publish awesome, comprehensive statistics for college football as CSVs. It looks like all their links (including from the blog post announcing opening their data) now point to…
Chris
  • 255
  • 2
  • 7
5
votes
1 answer

National Scale (contiguous US) weather data set for 1980 - 2010

Does anyone know of any national weather datasets that cover the contiguous US for the period 1980 to 2011? I need gridded values, preferably with a grid size (resolution) of 1 km x 1 km up to (maybe) 10 km x 10 km. I don't want point sites nor do…
traggatmot
  • 236
  • 1
  • 6
5
votes
3 answers

Data sets for evaluating identity resolution

We are looking for a data set or data sets to test record linkage/identity resolution. The target problem is matching up customers to people on various watch lists. We want to test the basic stuff that everyone does like edit distance, Jaccard…
ahoffer
  • 161
  • 6
5
votes
4 answers

Tagged addresses

Does anyone know of a corpus of US addresses where the parts of the addresses have been tagged? Something like 123 E. Main
fgregg
  • 5,108
  • 16
  • 37
5
votes
3 answers

Standards for documenting gaps in data?

Often, we have catalogs of events or features that are based on a time series of data that has gaps. I was wondering whether there are any standards for describing those gaps and what type of gap each might be, e.g. there will never be data for…
Joe
  • 4,445
  • 1
  • 18
  • 40
5
votes
1 answer

Product Reviews

I'm looking for free datasets of product reviews. At minimum I need the following values for each record: Rating (e.g. 2/5 stars, or 40%); Product Name / Description (e.g. iPhone 5s); Reviewer (e.g. Wired magazine). Any additional values would be a…
ninjaPixel
  • 151
  • 5
5
votes
1 answer

IRS Codes in machine readable format

Are there any sources of machine-readable IRS codes (e.g. state, country etc.)? For example, on the 1099-B instructions, the 1f code on this form. However, I would like to avoid having to scrape the PDF document for this information. To be clear, this…
Ryan Gates
  • 393
  • 3
  • 12
5
votes
1 answer

How can I retrieve a list of companies that deal in a specific field?

I would like to find a list of companies that operate in a specific industry sector. I don't know how to retrieve this data. There is probably a list at the Chambers of Commerce, but I know that it could be very expensive. In my specific…
G M
  • 173
  • 4
5
votes
1 answer

Where can I get historic prices for a commodity?

Is there any way to get the price of a commodity over the last 100 years? (Specifically, I'm looking for the price of toilet paper; I want to chart it and try to estimate the year when a piece of toilet paper will be worth more than $1.)
5
votes
1 answer

Standardized test question databases

Are there open databases of questions from standardized tests (e.g. SAT, GMAT, GRE, etc)?
Filipe Ferminiano
  • 377
  • 1
  • 3
  • 7
5
votes
2 answers

Seeking Water Quality Data for Lake Ontario that includes Dissolved Oxygen, Nitrogen, Phosphorus?

I was wondering what sources are available, or if anyone has GIS water quality data for the Great Lakes and specifically Lake Ontario? We are interested in mapping Phosphorus, Organic Matter, Nitrogen and Dissolved Oxygen concentrations in Lake…
5
votes
1 answer

Scraper for OpenStreetMap: all South American schools to a MySQL database

I have seen the Python scraper described here, and it is very interesting. I am not a programmer, so some of the techniques and ideas behind scrapers are too complicated for me. But it seems to be the tool of choice: I need a scraper that…
zero
  • 361
  • 1
  • 11
5
votes
1 answer

How should I categorize municipal legislation?

I'm working on a site to help Chicagoans understand what their city council is doing. As part of that effort, I would like to roll up the many different types of legislation into a handful of comprehensible bins. I know that cities often delegate…
fgregg
  • 5,108
  • 16
  • 37