I've been looking at a few different ePub3 files (programmatically, not through ePub-specific software), and noticed that the navigation document containing the TOC doesn't seem to be consistently named.

Is there a way of finding it, other than through reading each file for the relevant HTML?

Thanks

Edit: Digging around further, perhaps the package.opf is the answer: <item id="nav_1" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"></item>?

1 Answer

I'll post an answer which seems to be working for me.

I found relevant documentation here: http://www.idpf.org/epub/30/spec/epub30-publications.html#sec-package-documents

Using Ruby and Nokogiri, I decompressed the ePub file, read the package document as HTML, then used an XPath statement:

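require "zip"       # rubyzip gem
require "nokogiri"

unzipped_file   = Zip::File.open(epub_file)
# Assumes the package document is named package.opf and sits one directory deep
package_file    = unzipped_file.glob("*/package.opf").first
# Parsing as HTML wraps the document in html/body and sidesteps XML namespaces
package_as_html = Nokogiri::HTML(package_file.get_input_stream.read)
package         = package_as_html.at_xpath("html/body/package")

# href is relative to the directory containing the package document
nav_file_name   = package.
                    at_xpath("manifest/item[@properties='nav']").
                    attribute("href").
                    text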
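
The snippet above relies on the package document being called package.opf and on the properties attribute being exactly "nav". As a rough sketch of a more tolerant lookup: the package document's location is listed in META-INF/container.xml (in the full-path attribute of a rootfile element), and properties is a space-separated list, so "nav" may appear alongside other values. The helper names package_path and nav_path below are made up for illustration, and REXML is used instead of Nokogiri only to keep the example free of gem dependencies. Both functions take the XML as strings, so how the files are read from the archive is left open.

```ruby
require "rexml/document"

# Hypothetical helper: returns the package document path declared in
# META-INF/container.xml (the full-path attribute of the first rootfile).
def package_path(container_xml)
  doc = REXML::Document.new(container_xml)
  rootfile = nil
  # Element#name is the local name, so the default namespace does not matter.
  doc.root.each_recursive { |el| rootfile ||= el if el.name == "rootfile" }
  rootfile && rootfile.attributes["full-path"]
end

# Hypothetical helper: returns the navigation document path relative to the
# archive root, or nil if no manifest item carries the "nav" property.
def nav_path(package_xml, package_path)
  doc = REXML::Document.new(package_xml)
  nav_item = nil
  doc.root.each_recursive do |el|
    next unless el.name == "item"
    # properties is a space-separated list, e.g. "nav scripted".
    tokens = el.attributes["properties"].to_s.split
    nav_item ||= el if tokens.include?("nav")
  end
  return nil unless nav_item

  # href is relative to the directory holding the package document.
  joined = File.join(File.dirname(package_path), nav_item.attributes["href"])
  joined.sub(%r{\A\./}, "")
end
```

With the rubyzip code from the answer, the two strings could come from get_input_stream.read on the META-INF/container.xml entry and then on the entry that package_path returns.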