Most Popular

1500 questions
6
votes
2 answers

German news/text data set

I am looking for (and would possibly buy) a dataset of German news data (i.e., from the largest daily newspapers) that spans at least the last ten years up to the present. Any tips?
Sverker
6
votes
1 answer

Looking for open source LGBT datasets

I have been looking for any general datasets about LGBT Americans. The file types I am looking for are CSV or TSV. Although my inquiry will hopefully yield some spatial indices at the county or state level, because of the apparent lack of initial…
6
votes
2 answers

Open datasets of lottery winning numbers

I have seen only paid feeds for lottery games. It would also be great to get a dataset of virtual betting games.
SpanishBoy
6
votes
2 answers

Average income by age group, for all countries?

Is there any data source that lists average income, disposable income, or salary by age group? I would like to get something like the following.
Country | 25-34 | 35-44 | 45-54 | 55 and over
Belgium | 30,000 | 33,000 | 46,000 | 52,000
Italy | 23,000…
Blaszard
6
votes
1 answer

Kidney transplant dataset

I'm looking for raw data on kidney transplants, or any organ transplants, in order to perform a survival analysis with random effects for attributes such as hospital size/location.
6
votes
1 answer

Elevation data sources

I need to know, for example, the heights of bridges/fly-overs above the land surface (just to know that there is an elevated structure). How and where can I get this data (preferably for the EU)? Are there any open-source databases or 3D maps with this data?
user750066
6
votes
3 answers

List of United States cities

I'm looking to get some data from this page: http://www.greatschools.org/california/san-francisco/schools/?gradeLevels=e&page=2 To do that, I need to create links in this format: http://www.greatschools.org/'state'/'city' So first I need to have a…
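A minimal sketch of how links in the 'state'/'city' format could be built once a list of states and cities is available. The slug rule (lowercase, words joined by hyphens) is an assumption inferred from the single example URL in the question, and the helper names `slugify` and `city_url` are hypothetical.

```python
import re

# Base address taken from the example URL in the question.
BASE_URL = "http://www.greatschools.org"


def slugify(name: str) -> str:
    """Lowercase a place name and join its words with hyphens.

    Assumption: the site uses this slug style, as suggested by
    'california/san-francisco' in the example URL.
    """
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def city_url(state: str, city: str) -> str:
    """Build a link of the form BASE_URL/'state'/'city'."""
    return f"{BASE_URL}/{slugify(state)}/{slugify(city)}"


print(city_url("California", "San Francisco"))
```

Names containing punctuation or diacritics may need different handling, so generated links are worth spot-checking against the live site.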
6
votes
2 answers

Open Web Crawling Dumps

I am currently using the Common Crawl dump of web crawl data. Are there any other good providers of free web crawling data that I can blend with my Common Crawl data?
6
votes
1 answer

MIMIC-III Elixhauser comorbidity table

How can I obtain a table containing the 30 Elixhauser comorbidities for patients in the MIMIC-III database? Has anybody already run the code from the MIMIC-III forum?
6
votes
2 answers

US Residential Mailing Addresses Databases

I can almost guarantee that what I'm looking for isn't openly available anywhere, but I'll try anyway. Simply put, I'm looking for a list of residential addresses for a given county or city/municipality in the USA, for personal use. I can't find anything…
Jared Eitnier
6
votes
2 answers

Downloadable word embeddings

I am looking for downloadable word embeddings (a.k.a. word vectors, distributed word representations). I'm aware of word2vec, GloVe, and SENNA, as well as the retrofitting tool. What else is available?
Franck Dernoncourt
6
votes
4 answers

Open Source MRI Image Dataset

I'm working on a voxel-based modelling application and one of the features that I've implemented is a method to do a 3D mesh reconstruction from a series of 2D image slices (similar to an MRI). I've got a basic brain scan image set that I've been…
andyopayne
6
votes
2 answers

New York City weather data

I am looking for daily weather data for New York City. I have searched the NOAA website but could find weather data only for New York State. It doesn't have many records for New York City boroughs other than Manhattan. Please suggest some…
maven25
6
votes
2 answers

Do some Kaggle contest organizers remove the data sets after the end of the contest?

I wonder whether Kaggle contest organizers sometimes remove the data sets after the end of the contest, or is that made impossible by Kaggle's policies?
Franck Dernoncourt
6
votes
1 answer

Why are some PubMed IDs missing?

PubMed IDs (a.k.a. PMIDs) seem to have been assigned sequentially: https://www.ncbi.nlm.nih.gov/pubmed/1 https://www.ncbi.nlm.nih.gov/pubmed/2 https://www.ncbi.nlm.nih.gov/pubmed/3 https://www.ncbi.nlm.nih.gov/pubmed/4 ... However, some…
Franck Dernoncourt