
This code is very slow:

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(false);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

org.w3c.dom.Document doc = dBuilder.parse(new InputSource(new StringReader(content)));
System.out.println(doc.toString()); // outputs [#document: null]

Not only is it slow, it does not even seem to parse this document:

<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd">
<en-note>
    <div>
        <br />
    </div>
    <div>
        <br />
    </div>
    <en-media hash="" type="application/octet-stream" />
    <div>
        <br />
    </div>
    <div>This is my first Evernote blog with image/photo attached.</div>
    <div>
        <br />
    </div>
    <div>This is another line. </div>
    <div>
        <br />
    </div>
    <div>Some 
        <i>formatting </i>also for 
        <b>some </b>lines. 
    </div>
</en-note>

What is the fastest way to parse this XML into an org.w3c.dom.Document?

quarks
  • What does "very slow" mean? How slow is that? And what does "does not even parse" mean? Did you get an error? If so, show it. – Andreas Jan 21 '20 at 22:46
  • Add `dbFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);` so the parser doesn't go online to look up the DTD, which is the slow part. – Andreas Jan 21 '20 at 22:54
  • *FYI:* Output `[#document: null]` means that the parse worked fine. See: [Getting document as null \[#document: null\] After parsing XML in java using DocumentBuilder](https://stackoverflow.com/a/5698623/5221149) – Andreas Jan 21 '20 at 22:55
  • It works fast now with `dbFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);` – quarks Jan 21 '20 at 23:03
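
Putting the comments together, a minimal sketch of the adjusted parse might look like the following. The class name `EnmlParseSketch` and the helpers `parse` and `serialize` are hypothetical names chosen for illustration. The `load-external-dtd` feature is specific to Xerces-based parsers such as the one bundled with the JDK, so another parser implementation may reject it with a `ParserConfigurationException`. Since `doc.toString()` only prints `[#document: null]`, the sketch serializes the tree with an identity `Transformer` to show what was actually parsed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original question.
final class EnmlParseSketch {

    // Xerces-specific feature suggested in the comments above.
    private static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the XML string without fetching the external DTD.
    static Document parse(String content) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        dbFactory.setNamespaceAware(false);
        // Assumption: the underlying parser recognises this feature.
        dbFactory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        return dBuilder.parse(new InputSource(new StringReader(content)));
    }

    // Turns the DOM back into text, because Document.toString() does not.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String content =
                "<!DOCTYPE en-note SYSTEM \"http://xml.evernote.com/pub/enml2.dtd\">"
                + "<en-note><div>This is another line. </div></en-note>";
        Document doc = parse(content);
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println(serialize(doc));
    }
}
```

One caveat with this approach: with the DTD skipped, any entities that only the DTD declares (for example `&nbsp;`) would no longer resolve, so notes containing them may need a local copy of the DTD supplied through an `EntityResolver` instead.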
