65

How do you quickly locate an element or elements via an XPath string on a given org.w3c.dom.Document? There seems to be no FindElementsByXpath() method. For example:

/html/body/p/div[3]/a

I found that recursively iterating through all the child node levels is quite slow when there are a lot of elements with the same name. Any suggestions?

I cannot use any other parser or library; the solution must work with the W3C DOM document only.

Jim Garrison
KJW
  • https://stackoverflow.com/questions/45495758/detect-hyperlink-hover-in-webview-and-print-the-link – SedJ601 Jan 29 '18 at 20:52

1 Answer

103

Try this:

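import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Obtain the Document somehow; it doesn't matter how
DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document doc = b.parse(new FileInputStream("page.html"));

// Evaluate the XPath against the Document itself
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate("/html/body/p/div[3]/a",
        doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); ++i) {
    // Each match of this expression is an <a> element
    Element e = (Element) nodes.item(i);
}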

With the following page.html file:

<html>
  <head>
  </head>
  <body>
  <p>
    <div></div>
    <div></div>
    <div><a>link</a></div>
  </p>
  </body>
</html>
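
For readers who want to try this without a page.html file on disk, here is a minimal self-contained sketch of the same approach. It only uses the JDK's built-in `javax.xml` and `org.w3c.dom` packages. The class name `XPathLookup` and the helpers `parse` and `findElements` are hypothetical names introduced for illustration, and the markup is assumed to be well-formed XML, as the sample page above is.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XPathLookup {

    // Hypothetical helper: parses well-formed markup held in a string.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    // Hypothetical helper: evaluates an XPath expression against a context
    // node (a Document or any Element) and collects the matching elements.
    static List<Element> findElements(Node context, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(expression, context, XPathConstants.NODESET);
            List<Element> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // A node-set may also contain text or attribute nodes; keep elements only.
                if (nodes.item(i) instanceof Element) {
                    result.add((Element) nodes.item(i));
                }
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Same structure as the sample page, inlined as a string.
        String page = "<html><head/><body><p>"
                + "<div></div><div></div><div><a>link</a></div>"
                + "</p></body></html>";
        Document doc = parse(page);
        for (Element e : findElements(doc, "/html/body/p/div[3]/a")) {
            System.out.println(e.getTagName() + ": " + e.getTextContent());
        }
    }
}
```

Wrapping the checked exceptions is only a convenience for the sketch; real code may prefer to let them propagate.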
hoat4
Tomasz Nurkiewicz
  • In my code example `doc` is of `org.w3c.dom.Document` type. If you already have an instance of `Document`, just use two last lines of my code and that's it! P.S.: Why the downvote? – Tomasz Nurkiewicz Jun 30 '11 at 18:10
  • This returns text. I need a DOM element or elements. – KJW Jun 30 '11 at 18:49
  • See my edit (introduction of `XPathConstants.NODESET` parameter) - now it returns `NodeList`. Also have a look at other constants as well. – Tomasz Nurkiewicz Jun 30 '11 at 19:43
  • Thank you, this is a great answer. – KJW Jul 01 '11 at 03:03
  • @Tomasz Nurkiewicz, can you please look into my implementation? I know I am not the questioner and it's a different question, but I took the reference from your answer, so I hope you can help me: http://stackoverflow.com/questions/26389376/creating-a-xml-tree-in-java-and-convert-it-to-json-object – Sudip7 Oct 16 '14 at 15:46
  • I think you don't need to do `doc.getDocumentElement()`, you should be able to run the xpath on `org.w3c.dom.Document` type directly. – burcakulug Apr 08 '15 at 13:34
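
The comments above mention the other `XPathConstants` values and running the expression directly on the `Document`. A rough sketch of both points follows, using only JDK classes. The class name `XPathReturnTypes` and the helpers `eval` and `sample` are hypothetical names made up for this example, and the inlined markup mirrors the sample page from the answer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class XPathReturnTypes {

    // Hypothetical helper: builds a Document equivalent to the sample page.
    static Document sample() {
        String page = "<html><head/><body><p>"
                + "<div></div><div></div><div><a>link</a></div>"
                + "</p></body></html>";
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(page.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Sample markup failed to parse", e);
        }
    }

    // Hypothetical helper: evaluates an expression with the requested return
    // type; the caller casts the result to match the XPathConstants value used.
    static Object eval(Node context, String expression, QName returnType) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, context, returnType);
        } catch (Exception e) {
            throw new IllegalArgumentException("XPath failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample();

        // NODE: a single node (the first match in document order), or null.
        Node link = (Node) eval(doc, "/html/body/p/div[3]/a", XPathConstants.NODE);
        // STRING: the string value of the first match.
        String text = (String) eval(doc, "/html/body/p/div[3]/a", XPathConstants.STRING);
        // NUMBER: always a Double, here the number of <div> children of <p>.
        Double divCount = (Double) eval(doc, "count(/html/body/p/div)", XPathConstants.NUMBER);
        // BOOLEAN: whether the document contains any <a> element.
        Boolean hasLink = (Boolean) eval(doc, "boolean(//a)", XPathConstants.BOOLEAN);

        // The context can also be an element; a relative path starts from it.
        Node body = (Node) eval(doc, "/html/body", XPathConstants.NODE);
        String relativeText = (String) eval(body, "p/div[3]/a", XPathConstants.STRING);

        System.out.println(link.getNodeName() + " " + text + " " + divCount
                + " " + hasLink + " " + relativeText);
    }
}
```

An absolute path such as `/html/body/...` resolves from the document root whichever node is passed as context, so passing the `Document` or `doc.getDocumentElement()` should give the same result for that expression.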