0

In my Rails application, I need to upload .doc/.xls files, parse their structure, and extract information from them. How can I get the data from a *.doc or *.xls file as XML, or in any other format that I can read and parse?

sawa
itdxer

3 Answers

1

You can parse different types of spreadsheets using the Roo gem. It supports:

  • OpenOffice
  • Excel
  • Google spreadsheets
  • Excelx
  • LibreOffice
  • CSV

In my experience, it has some issues parsing .xls files, but it parses .xlsx files well.
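As a rough illustration of the Roo approach, here is a minimal sketch. It assumes the roo gem is in your Gemfile; with Roo 2.x, .xls support additionally needs the roo-xls gem. The helper names `rows_to_hashes` and `parse_spreadsheet` are made up for this example, and it assumes the first row of the sheet holds the column headers.

```ruby
# Hypothetical helper: turn a header row plus data rows into an array of hashes.
def rows_to_hashes(rows)
  header, *data = rows
  data.map { |row| header.zip(row).to_h }
end

# Hypothetical helper: read the first sheet of a spreadsheet file with Roo.
# Assumes the roo gem is installed (plus roo-xls for .xls files on Roo 2.x).
def parse_spreadsheet(path)
  require "roo"

  sheet = Roo::Spreadsheet.open(path).sheet(0)
  return [] if sheet.first_row.nil? # empty sheet

  # Collect every row as a plain array, then key the cells by header name.
  rows = (sheet.first_row..sheet.last_row).map { |i| sheet.row(i) }
  rows_to_hashes(rows)
end
```

In a Rails controller, the path would typically come from the uploaded file, for example `parse_spreadsheet(params[:file].path)`, where `:file` is whatever name your upload field uses.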

As for .doc files, you may try the msworddoc-extractor gem or one of the solutions proposed here.

Update: for working with *.docx files, see docx and docx-html.

trushkevich
0

Have you seen the Nokogiri gem? http://nokogiri.org/

It is very useful for XML parsing.
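If you do end up with the data as XML, a minimal Nokogiri sketch could look like the following. It assumes the nokogiri gem is installed. The helper name `rows_from_xml` and the `<row>` element layout are invented for illustration; your real XML will have its own structure.

```ruby
# Hypothetical helper: return one hash per <row> element in the XML string.
# The <row> element name is an assumption made for this example.
def rows_from_xml(xml)
  require "nokogiri" # assumes the nokogiri gem is installed

  doc = Nokogiri::XML(xml)
  doc.xpath("//row").map do |row|
    # Map each child element's tag name to its text content.
    row.element_children.map { |cell| [cell.name, cell.text.strip] }.to_h
  end
end
```

For example, `rows_from_xml("<sheet><row><name>Ann</name></row></sheet>")` would yield one hash with a `"name"` key.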

grenierm5
0

The spreadsheet gem is nice for Excel and CSV files: https://github.com/zdavatz/spreadsheet

aarti
  • I used it and ran into this problem: http://stackoverflow.com/questions/19915887/ruby-roo-loaderror-cannot-load-such-file-spreadsheet-note – itdxer Nov 11 '13 at 23:41