Most Popular
1500 questions
13
votes
3 answers
What does OpenRefine offer that other data-parsing tools don't?
I see OpenRefine mentioned a lot here, but I don't see it doing much that R and others can't. What capabilities does it offer that I'm not seeing in the promo page that R or other data packages cannot?
Ari B. Friedman
- 295
- 2
- 8
13
votes
2 answers
List of abbreviations and acronyms
I am searching for a list of abbreviations and acronyms that can be downloaded as an SQL table, JSON, or something similar. I do not want an API, because it might not be fast enough or might have a limit, like this one: http://www.abbreviations.com/abbr_api.php
Of course the…
Wikunia
- 335
- 2
- 9
13
votes
1 answer
Where can I find massive, high-dimensional survival datasets?
I am working on developing some high-dimensional survival analysis methods with R, but I do not know where to find such high-dimensional survival datasets.
Could anyone tell me where to find such datasets, for example the data used in:
"Predicting…
floodking
- 231
- 1
- 2
13
votes
1 answer
Searching for Open Data Dataset That is No Longer Online
Many questions posted here ask for a specific dataset that is no longer online, has died from link rot, or both. What is the best way to get around this? Or rather, can these datasets be recovered?
albert
- 11,885
- 4
- 30
- 57
13
votes
2 answers
Dump of WikiLeaks
Does a dump or scrape of WikiLeaks exist? I'm thinking of an equivalent to Wikipedia's database download: http://en.wikipedia.org/wiki/Wikipedia:Database_download
So far, I haven't found direct access to its publicly released data. It seems…
szxk
- 810
- 6
- 13
13
votes
7 answers
Is there a free downloadable administrative division database of Germany?
Is there a downloadable and freely available database of the administrative units of Germany (states, cities, and, if available, streets with zip codes)?
In many countries such databases are provided freely by central statistical offices, but for example,…
user139
13
votes
5 answers
Cost of living dataset
I'm trying to compare what the equivalent salary would be between two cities based upon the cost of living in each city.
I want to be able to build something like CNN's cost of living…
greenJavaDev
- 233
- 1
- 2
- 5
13
votes
4 answers
Dataset of sentences translated into many languages
I'm looking for a dataset of human translated sentences.
The ideal dataset would look like this:
1, en, The weather is nice today.
1, de, Das Wetter ist heute schön.
1, es, El clima es agradable hoy.
1, el, Ο καιρός είναι καλός σήμερα.
...
for as…
philshem
- 17,647
- 7
- 68
- 170
13
votes
5 answers
Airport / airline data from all over the world
Where can I get a database of airports from all over the world, possibly including their runways (available / closed)?
I am also looking for airlines and contact info for managers in decision-making positions at those airlines.
János
- 899
- 8
- 20
13
votes
2 answers
Rocket attacks dataset in Israel and State of Palestine
I'm looking for a dataset listing the rocket attacks in Israel and the State of Palestine with as many of the following fields as possible:
timestamp
GPS
number of casualties
reason for attack (e.g. a pointer to a previous attack)
number of articles…
Franck Dernoncourt
- 7,780
- 9
- 39
- 86
13
votes
4 answers
Releasing old historical/genealogical datasets as open data
I work with a couple of small non-profit genealogical and historical groups, and we are interested in releasing some of the datasets we've compiled over the years as open data. This information is already freely searchable through our online…
Asparagirl
- 486
- 4
- 7
13
votes
1 answer
How to construct a database with the underlying real estate data displayed by Redfin, Zillow, or Trulia?
Regardless of whether the home is for sale, if you type any street address into Zillow, Redfin, or Trulia, they will often tell you the square footage, the last-sold-date, the taxable value, and often some other official information. Here is one…
Anthony Damico
- 1,480
- 10
- 16
13
votes
7 answers
Dataset of domain names
There are many web resources for finding domain names (whois.com), and there are some APIs that use the WHOIS protocol. Some examples are the Unix command-line tool jwhois and the Python library pywhois. These tools return the full WHOIS record, which…
philshem
- 17,647
- 7
- 68
- 170
13
votes
4 answers
A dataset of resumes
This is a question I found on /r/datasets. Does OpenData have any answers to add?
I'm looking for a large collection of resumes, preferably with information on whether each person is employed or not. Does such a dataset exist?
Link to reddit post
philshem
- 17,647
- 7
- 68
- 170
13
votes
2 answers
Clickstream sample dataset
I am looking for a web traffic or clickstream dataset, ideally from an e-commerce website. I would like to do some analysis of purchasing patterns if possible.
For example: visit duration, conversion, shopping cart abandonment, cross-category shopping,…
Hawk
- 131
- 1
- 1
- 3