Most Popular
1500 questions
7
votes
3 answers
Where can I find datasets of mailing list archives of open source software?
I plan to mine the mailing list archives of any open source software to answer interesting research questions.
How can I request the data?
What is the procedure?
Are any small datasets of the mailing list archives available to perform a test…
Hemaa mathavan
- 315
- 1
- 10
7
votes
2 answers
Medical Terminology in Patient Medical Records - Public Data Sets
I am interested in sample data of real patient medical records (anonymized or with demographics removed completely) for the purpose of running them through an NLP system - specifically diagnoses, admissions and progress notes - anything where medical terminology…
DataMania
- 173
- 5
7
votes
1 answer
Where can I find data for Formula 1 races and race cars?
I am looking for a dataset on the outcomes of Formula 1 races, which cars took part in each race, and the specifications of these cars (such as type of tire used; type of engine; width, length and other parameters that describe the shape of the car).
If…
Ragnar
- 235
- 1
- 4
7
votes
5 answers
Which format (CSV, JSON, Atom, RSS?) should events data be published in?
I'm developing recommendations for local councils publishing event listings. Compared to other kinds of data, events data seems very likely to be used by web and mobile apps (as opposed to downloaded for analysis), and is inherently chronologically…
Steve Bennett
- 850
- 5
- 12
7
votes
1 answer
19th Century Patent Data
Is there any 19th-century patent data with geo-referenced locations of the submitter? That is, data that lists, for each city or county, the number of patents filed in a given year or month?
LJB
- 639
- 1
- 4
- 13
7
votes
4 answers
Best practices for huge explorable linked data directories
The main requirements for our open data directory are:
XML-based by default, with the ability to switch to JSON. It is important to have an easy way to make it human-readable just by linking to XSL.
All the data must be reachable by a robot from the single…
Denis Otkidach
- 171
- 3
7
votes
4 answers
Dataset for emotion classification
I'm looking for a dataset for mood or emotion (Happy, Angry, Sad) classification, that is, for classifying the sentiment of a given text. I would like to use a Naive Bayes classifier for this analysis. Not only to train and test the model with the…
SOURAV
- 193
- 2
- 6
7
votes
3 answers
Any APIs available that provide data of Indian vehicles?
I was looking for APIs that provide the latest data on vehicles (2-wheelers/4-wheelers) in India. I found quite a few, but none had data on Indian vehicles. I looked at this question, which is almost the same as mine, but couldn't get any help…
Amogh Natu
- 171
- 1
- 1
- 3
7
votes
2 answers
Batch conversions of lat, lon to US census tract?
I have 700,000 latitude/longitude pairs I need to convert to US Census tracts. Is there a free API that offers this in batches? So far the only option I have found is from the FCC and does not state a rate limit but has the form of a 1-1 call to…
sunny
- 292
- 3
- 5
7
votes
2 answers
Can I get 1000 images from any image search engine for education/research purposes?
I'm researching a machine learning system that learns to recognize items based on image search results from a search engine. After searching around, I found that the Google and Bing Image Search APIs allow only a small number of images and don't allow…
PtLearner
- 73
- 2
7
votes
2 answers
GitHub to share a set of SPARQL queries
I am using GitHub to share a set of SPARQL queries:
http://www.boisvert.me.uk/opendata/sparql_aq+.html?file=specific%20sensor.txt
Currently, this simple setup allows end users to access queries stored in the GitHub repository, but ultimately I want to…
boisvert
- 209
- 1
- 7
7
votes
1 answer
Database of adult sites
I would like to block all adult content in my DNS/VPN service, and I would rather not outsource this. Is there a list of URLs I can use as a blacklist in my routers/servers?
The format doesn't matter, and if it were actively maintained that would be…
CleanTheWeb
- 71
- 1
- 4
7
votes
2 answers
Are there any open datasets with technical specifications for photographic equipment?
I want to set up a free service where photographers buying or searching for discontinued photographic equipment can reference for technical specifications. For instance, most eBay-listings of second-hand equipment are without any tech. specs…
user135
7
votes
2 answers
Creating data from web tables with import.io failed - other tools?
I found this site with solar system moon orbit data: Table of moons in solar system.
I ran that through the http://import.io site and it only came up with the Jupiter data. Is there a more comprehensive tool that will identify multiple tables and…
John Carlson
- 231
- 1
- 3
7
votes
2 answers
Database of smartphone sensor data
I'm working on a machine learning project for classifying activity level (walking, running, sitting, etc.) based on smartphone accelerometer, gyroscope, and GPS data.
Of course I can just collect this data myself, but this is very time-consuming. I'm…
Simon
- 171
- 4