Most Popular

1500 questions
5
votes
2 answers

Looking for modern English texts in public domain / CC-BY-SA

Is there a library of public domain / CC-BY-SA English texts written after 1990? I need it to build a corpus with links to full texts, which isn't possible with copyrighted material. At the same time I would like this corpus to be up-to-date…
Denis Kulagin
  • 291
  • 2
  • 5
5
votes
0 answers

High School Football Stats

I am in an application development course (CSC 400), required for my Computer Science major, and have been asked to create a "web scraper" utility that harvests high school football statistics. I have been searching for APIs but cannot find…
Jim22150
  • 151
  • 2
5
votes
2 answers

Open text document corpus for information retrieval evaluation

INTRODUCTION: document collections (corpora) for evaluation of information retrieval (search engine) systems are pretty often behind a paywall. A notorious example is the TREC conference (http://trec.nist.gov/). Apart from money, they ask for…
5
votes
1 answer

Dataset of large graphs for classification

I want to evaluate a graph kernel designed for large graphs (> 10^6 nodes). Hence, I'm looking for suitable graph data sets, i.e., a set of (huge) graphs and corresponding classes. Any ideas?
Christopher
  • 151
  • 2
5
votes
2 answers

College Scorecard First Year Returning Students Calculation

How is Students Who Return After Their First Year calculated on the College Scorecard website? The closest field I can see in the Data Dictionary is ENRL_ORIG_YR2_RT, but that doesn't seem exactly to be it. I know at least some of the code is…
Evan Donovan
  • 200
  • 6
5
votes
1 answer

Why the 8-digit UnitID

In the raw data released with the Scorecard, some institutions have an 8-digit UnitID (the one tied to IPEDS, not the OPE IDs). Can someone explain why this is? I can see they tend to be branch campuses. So is it a case of the institution has its…
Jon
  • 51
  • 1
5
votes
3 answers

All demonyms in their native language

COUNTRY - ? / ?; US - American / Americans; UK - English / English; DE - Deutscher / Deutsche. Does anyone have an idea where to find this? We've been trying to find any sort of source for this for weeks now.
Tobi
  • 103
  • 9
5
votes
2 answers

Worldwide holidays, and their names in the local dominant language

This might be slightly off-topic here, but I'll give it a try since the topic is probably well known to many here: we're looking for a database or system to get current and upcoming holidays by location, worldwide, in English and in each of the countries…
Tobi
  • 103
  • 9
5
votes
1 answer

How do I download a Socrata graph or map definition I have created?

When I create a custom view, a map, chart, graph or report, and save it, the definition of that view is stored somewhere on Socrata. How is the definition stored? Is it a schema? How can I download the definition? How can I upload a modified version…
rlh100
  • 51
  • 3
5
votes
0 answers

I'm trying to find the amount of federal funding allocated to US cities since 1970: block grants and earmarked funding

I'm looking at changes in federal spending on cities since 1970, in ten-year intervals. Ideally, this data would be in two parts: 1) payments to cities that they could spend as they wish and 2) total federal support for cities including grants for…
Patricia
  • 51
  • 1
5
votes
1 answer

Publicly Available 'English Opinion Lexicons' Txt

I'm an undergraduate and I am working on a sentiment analysis project on email data. My first task is to do opinion mining on the dataset. I train the data with two separate 'English opinion lexicon' data sets (positive and negative, of course), but the…
Miller
  • 281
  • 1
  • 8
5
votes
1 answer

Where can I download US secondary (high school, etc.) educational test score data for school districts, individual schools, etc.?

This has so far been a surprisingly difficult task. It appears some aspect of public policy sees fit to make this data difficult to obtain(?) I'm sure it's out there though. No Child Left Behind data, SAT/ACT score statistics for school districts,…
boulder_ruby
  • 478
  • 3
  • 13
5
votes
1 answer

USA basement map or data available

I am looking for information about the basement for each state, or for the USA as a whole. It can be a map, a dataset, or a website. I am looking for it to get information about a locality in the state of Colorado. My understanding is that each state has…
PROBERT
  • 1,295
  • 8
  • 11
5
votes
3 answers

Automobile data including weight, engine output

Is there an open data set of car metrics? I'm looking to combine these with other data such as accidents, thefts, etc. I imagine that these types of data are readily available within auto insurance companies to calculate rates, and I was hoping an…
Megatron
  • 221
  • 1
  • 3
5
votes
3 answers

Where can I find a publicly available dataset for retail/grocery store companies?

I am looking for a publicly available dataset for retail/grocery store companies which (preferably) includes data about their stores, number of employees and operations. I tried to look around but couldn't find any dataset related to…
user2966197
  • 261
  • 2
  • 3