Most Popular
1500 questions
7
votes
1 answer
Corpus of human-scored machine translations?
Most of the parallel corpora (Opus, EuroParl, OpenSubtitles) have only human translations.
src_txt | human_trans_of_src_txt_into_trg_lng
(We assume all the translations are good.)
Is there a corpus of machine translations, annotated with human eval…
Adam Bittlingmayer
- 318
- 1
- 13
7
votes
1 answer
Dataset of number of car accidents per cause in the United States
I am looking for a dataset that lists the number of car accidents broken down by cause in the United States. By cause I mean the direct cause of the accident, such as failure to stop at a red light, or changing lanes without checking for the blind…
Franck Dernoncourt
- 7,780
- 9
- 39
- 86
7
votes
3 answers
Where can I get a sample dataset for A/B split testing?
I'm working on A/B split testing now. Where can I get a sample dataset for A/B split testing?
Preethi
- 73
- 1
- 3
7
votes
2 answers
Required: an audio-format baby crying data set
For my undergraduate research project, I'm trying to train my system with infant child crying frequencies and predict the reason for a new baby crying sound. I'm following this journal and they are using 78 different crying sounds manually…
Biruntha Gnaneswaran
- 71
- 1
- 4
7
votes
2 answers
MIMIC-III severity score
Is there any severity score in MIMIC-III beyond SOFA? I've found many itemid values with an APACHE IV label, but I could not retrieve any corresponding information from chartevents.
alexandreliborio
- 367
- 2
- 5
7
votes
1 answer
Corpus of documents with important sentences marked
I'm looking to create a sentence extraction program, that is, a program that aims to get the most important sentences from a body of text. The first step for me is to try to evaluate what characteristics important sentences share, and how they are…
John Madden
- 201
- 1
- 7
7
votes
2 answers
Lake Victoria bathymetric data
I am trying to find bathymetry for Lake Victoria (or portions).
Any GIS format and almost any resolution will do.
As a last resort a hydrographic chart will suffice.
https://gis.stackexchange.com/questions/116738/lake-victoria-bathymetric-data
If you do not know- just GIS
- 294
- 1
- 9
7
votes
1 answer
List of political blogs
I'm analyzing political texts and I need the writings of random political authors.
So far I found a dataset of four political blogs normalized for text analysis. But that's not enough.
I thought about RSS of political blogs. Feedly and Wordpress…
Anton Tarasenko
- 3,641
- 4
- 20
- 34
7
votes
1 answer
Initial public offerings
All pre-IPO companies file SEC Form S-1. It mentions underwriters, offering price, executives, and financials. Example for Etsy.
This information is available via SEC and elsewhere, but not structured.
Have you seen a structured dataset of these…
Anton Tarasenko
- 3,641
- 4
- 20
- 34
7
votes
1 answer
How do I find out how much doctors get reimbursed by Medicare for certain procedures?
Financial incentives are often an important driver of people's actions. I want to better understand how different medical procedures get reimbursed in the US by Medicare. Unfortunately, I couldn't find the price list via Google myself.
Is there a…
Christian
- 661
- 3
- 9
7
votes
4 answers
Where can I find a cost of living index by zip code?
I am particularly interested in housing costs; specifically 1-bedroom apartment rentals.
For clarity, I mean a database that's free (as in beer) to query with an arbitrarily large number of zip codes.
arschie
- 129
- 1
- 1
- 3
7
votes
3 answers
Geodata to make a map of the UK with counties outlined
I will be using AngularJS and D3.js, but that's not really important.
This blog post by the developer of D3.js, which is a fantastic JavaScript charting library, shows how he made a map of the UK from open source data and delimited each of the…
Mawg says reinstate Monica
- 793
- 1
- 4
- 16
7
votes
0 answers
User profiles from professional social network
I'm doing a project in which I need data about users in a professional social network. I need information related to the experience, education, or skills of the users. So far I've tried LinkedIn but the API does not allow getting this kind of…
davidivad
- 171
- 3
7
votes
2 answers
Macro Indicators of Economic Data by ZIP Codes or Cities in the US
I am looking for a dataset (preferably tabular) or a website's API that contains economic data for zip codes, counties or cities in the US. This dataset has to go back at least to 2003 (this is not strictly required, but the data must be recent).…
wacax
- 1,042
- 1
- 8
- 23
7
votes
5 answers
Where can I find data on the winner of the presidential popular vote by U.S. county, for as many elections as possible?
I'm trying to find data on the breakdown of the popular vote in each U.S. county, for each presidential election going back as far as possible. I realize that counties change over time, but are these data available somewhere in machine-readable…
Michael A
- 529
- 3
- 18