Most Popular

1500 questions
9
votes
5 answers

Does any city have POIA (Public Online Information Act) laws in place?

A few years ago, the Sunlight Foundation advocated for the Public Online Information Act (POIA). Do you know of any city that has adopted such a law or statute? In the age of the Internet, government is transparent only when public information is…
noneck
  • 275
  • 1
  • 4
9
votes
2 answers

Anyone have a good way of comparing two large and unstructured lists (~2k entries each) for commonalities between them?

Let's say, two governments have separate lists of reports, and I'm trying to figure out the most common elements between them. Think of one government administrator reporting # of sick days per year and another government administrator reporting…
Ian Kalin
  • 427
  • 3
  • 8
9
votes
0 answers

Protected health information in different countries

I'm looking for a dataset that would gather the different categories of protected health information (PHIs) in different countries. I am only interested in electronic health records. For example, in the United States, PHIs are defined in Health…
Franck Dernoncourt
  • 7,780
  • 9
  • 39
  • 86
9
votes
6 answers

Publicly Available UK Datasets

With no disrespect to our fellow data scientists over "the pond", one major problem from the rest of the world's point of view with questions like Publicly Available Datasets is the American focus of the answers. They are great answers, and great…
Marcus D
  • 1,119
  • 1
  • 9
  • 26
9
votes
6 answers

Open Address Data for Restaurants

I am looking for any open data set that lists restaurants (or places in general) with their addresses. If a place closes, I am hoping this database will reflect that quickly.
9
votes
4 answers

What are the data quality measures for open data?

How does a consumer know they are getting good data? Are there standard frameworks for grading the quality of an open data set? Should there be metrics published around accuracy, completeness, timeliness or validity of the data? Should there be a…
Tom Zellers
  • 151
  • 1
9
votes
2 answers

Shapes of ZIP codes - polygons for each ZIP code

I'm looking for an open data set that contains polygons with the shape of each ZIP code in the US. GeoJSON format would be ideal. I've done a bunch of searching, and I've been able to find shape data for counties, but not for ZIP codes. Searching…
D.W.
  • 193
  • 1
  • 6
9
votes
3 answers

Is there a list of all utilities that offer the Green Button Download and Green Button Connect?

I see the Green Button data standard on greenbuttondata.org. But I am trying to figure out which geographic regions have the standard ALREADY deployed.
Ian Kalin
  • 427
  • 3
  • 8
9
votes
2 answers

Disease Symptom Dataset?

I am currently working on a disease diagnosis system; it is a prototype based on one of my dissertation papers, S-Approximation: A New Approach to Algebraic Approximation and S-approximation Spaces: A Three-way Decision Approach. Up to now, I have…
Ali Shakiba
  • 191
  • 1
  • 4
9
votes
1 answer

Football (Soccer) Player x,y,t data

I am looking for something that has the (x,y) coordinates of all players on the pitch over time. I found this: https://heim.ifi.uio.no/paalh/publications/files/mmsys2014-dataset.pdf however, they only track the movements of the home team. I found…
anonymous
  • 91
  • 1
  • 3
9
votes
1 answer

Cloud providers performance dataset

I need to run some machine learning algorithms on a data set that contains cloud providers' performance. I need some of the following information: availability, input/output per second, max restore time, processing time, latency with internal…
nazbouy
9
votes
2 answers

Movie Script Database

I'm looking for a database of movie scripts to use to train a ChatterBot application. I saw this article that mentioned that a database of movie scripts was used to train the program that generated a selection of…
Gunther
  • 193
  • 1
  • 5
9
votes
1 answer

License of NOAA hourly temperature data

I would like to use historic hourly temperature data (Integrated Surface Global Hourly Data, DSI-3505) available from NOAA. Using NOAA's Climate Data Online mapping tool, I select the weather stations in the region I'm interested in (Belgium, in my…
Brecht Machiels
  • 241
  • 1
  • 4
9
votes
2 answers

Machine-readable way to determine if plant species is native to Australia

I'm doing some work visualising tree inventories and thought it would be nice to show whether the species is native to Australia or not. I haven't so far found any database that would answer the question. Wikipedia has a category "Flora of…
Steve Bennett
  • 850
  • 5
  • 12
9
votes
1 answer

Legality of using data against terms

Perhaps a bit off-topic, but this is the only relevant SE forum. I sometimes come across RSS feeds with terms and conditions attached, such as one example permitting use for app-development (that would complement their revenues) but expressly not…
geotheory
  • 459
  • 3
  • 9