For my research, I need the Enron Email Dataset. According to the project's official website, the emails are distributed as an archive of separate TXT files, but the archive is poorly organized and needs a lot of preparation before the data can be used.

I looked for a ready-to-use, well-organized copy of the Enron Email Dataset in MySQL, but the only one I found is a MySQL dump with just two columns, message ID and content, and no sender, receiver, or title columns. Is there a complete, well-organized Enron Email Dataset in MySQL format that I can query by sender or receiver, and from which it is easy to retrieve a message title?

Orophile
Mike

2 Answers

After some more searching, I found working links among the many that lead to 404 pages.
The working ones are:
https://s3.amazonaws.com/rjurney_public_web/images/enron.mysql.5.5.20.sql.gz
http://www.ahschulz.de/pub/R/data/enron-mysqldump_v5.sql.gz

Detailed information about these datasets:
http://hortonworks.com/blog/the-data-lifecycle-part-one-avroizing-the-enron-emails/
http://www.ahschulz.de/enron-email-data/

P.S. I don't know why, but the downloads were very slow for me.

Mike

You might be able to create an SQL DB by using the FERC Enron XERA web application: http://fercenron.omega-caci.com/default.html (user guide).

You would use XERA's field-based search, export the results to a delimited file, and then convert that file into an SQL database for additional queries.
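
The conversion step could look roughly like the sketch below. It is only an illustration under several assumptions: the export is a tab-delimited file with a header row, the function name `load_delimited` and the table name `messages` are made up for this example, and SQLite (from Python's standard library) stands in for MySQL so the snippet runs without a server. The actual column names depend on which fields you pick in the export.

```python
import csv
import sqlite3


def load_delimited(csv_path, conn, table="messages", delimiter="\t"):
    """Load a delimited export that has a header row into an SQL table.

    Column names are taken from the header row, so nothing is assumed
    about which fields the export contains.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        header = next(reader)
        # Quote column names so headers containing spaces still work.
        cols = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, cols))
        # Skip malformed rows whose field count differs from the header.
        rows = (row for row in reader if len(row) == len(header))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
        )
    conn.commit()
```

If the export happened to have `sender` and `subject` columns (hypothetical names), you could call `load_delimited("export.tsv", sqlite3.connect("enron.db"))` and then run something like `SELECT subject FROM messages WHERE sender = ?`. For MySQL, the same idea applies with a MySQL client library or with `LOAD DATA INFILE`.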

MrMeritology

You could also look at this: http://orchestrate.io/blog/2014/07/17/using-orchestrates-graph-search-with-the-enron-emails/ – MrMeritology Oct 09 '14 at 20:37