
I have the following XML:

<doc>
    <divider />
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <divider />
</doc>

I want to select all p nodes after the first divider element, up to the next divider element. I tried the following XPath:

//divider[1]/following-sibling::p[following::divider]

but the problem is that it selects every p element that comes before the last divider element. I'm not sure how to do this in XPath 1.0.

Mirko

4 Answers


Same concept as bytebuster's answer, but a different XPath:

/*/p[count(preceding-sibling::divider)=1]
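
To make the predicate concrete, here is a minimal Python sketch that emulates what `count(preceding-sibling::divider)=1` computes. It walks the children in document order by hand, because the standard library's `xml.etree.ElementTree` only supports a small XPath subset without `count()` or the `preceding-sibling` axis. The helper name `p_after_nth_divider` and the distinct `p1`..`p8` texts are assumptions added for illustration; they are not part of the question.

```python
import xml.etree.ElementTree as ET

# Same shape as the XML in the question; the p texts are made distinct
# (an assumption for illustration) so the selection is visible.
XML = """<doc>
    <divider />
    <p>p1</p>
    <p>p2</p>
    <p>p3</p>
    <p>p4</p>
    <p>p5</p>
    <divider />
    <p>p6</p>
    <p>p7</p>
    <divider />
    <p>p8</p>
    <divider />
</doc>"""


def p_after_nth_divider(root, n):
    """Return the <p> children preceded by exactly n <divider> siblings.

    Hand-written equivalent of /*/p[count(preceding-sibling::divider)=n].
    """
    selected = []
    dividers_seen = 0
    for child in root:  # children are iterated in document order
        if child.tag == "divider":
            dividers_seen += 1
        elif child.tag == "p" and dividers_seen == n:
            selected.append(child)
    return selected


root = ET.fromstring(XML)
print([p.text for p in p_after_nth_divider(root, 1)])
```

Changing the second argument to 2 or 3 selects the later groups in the same way that changing the `=1` in the XPath would.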
Daniel Haley

Here is a general XPath expression:

/*/divider[$k]
    /following-sibling::p
       [count(.|/*/divider[$k+1]/preceding-sibling::p)
       =
        count(/*/divider[$k+1]/preceding-sibling::p)
       ]

If you substitute $k with 1, exactly the wanted p nodes are selected.

If you substitute $k with 2, all p elements between the 2nd and 3rd divider are selected, and so on.

Explanation:

This is a simple application of the Kayessian XPath 1.0 formula for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

selects all the nodes that belong both to the nodesets $ns1 and $ns2.

In this specific case we substitute $ns1 with:

/*/divider[$k]/following-sibling::p

and we substitute $ns2 with:

/*/divider[$k+1]/preceding-sibling::p
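
The intersection test can be sketched in plain Python as well. This is only an illustration of the idea, not an XPath engine: nodes are represented by their child positions, `ns1` and `ns2` mirror the two node-sets named above, and the filter `len({i} | ns2) == len(ns2)` plays the role of `count(.|$ns2) = count($ns2)`. The function name `between_dividers` and the distinct `p1`..`p8` texts are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Same shape as the XML in the question; p texts are made distinct
# (an assumption for illustration).
XML = """<doc>
    <divider />
    <p>p1</p>
    <p>p2</p>
    <p>p3</p>
    <p>p4</p>
    <p>p5</p>
    <divider />
    <p>p6</p>
    <p>p7</p>
    <divider />
    <p>p8</p>
    <divider />
</doc>"""


def between_dividers(root, k):
    """Return the <p> children between the k-th and (k+1)-th <divider>."""
    children = list(root)
    divider_pos = [i for i, c in enumerate(children) if c.tag == "divider"]
    if k < 1 or k + 1 > len(divider_pos):
        # No divider[k] or no divider[k+1]: the XPath selects nothing too.
        return []
    start, end = divider_pos[k - 1], divider_pos[k]

    # $ns1: /*/divider[$k]/following-sibling::p (as child positions)
    ns1 = [i for i in range(start + 1, len(children)) if children[i].tag == "p"]
    # $ns2: /*/divider[$k+1]/preceding-sibling::p (as child positions)
    ns2 = {i for i in range(end) if children[i].tag == "p"}

    # $ns1[count(.|$ns2) = count($ns2)]:
    # adding a node to ns2 leaves its size unchanged only if it is already in ns2.
    return [children[i] for i in ns1 if len({i} | ns2) == len(ns2)]


root = ET.fromstring(XML)
for k in (1, 2, 3):
    print(k, [p.text for p in between_dividers(root, k)])
```

The size comparison is deliberately kept instead of a direct `i in ns2` check, since XPath 1.0 has no membership operator and the counting trick is what stands in for it.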
Dimitre Novatchev

I think there's a much simpler and probably faster solution: you want all preceding siblings of the second divider that have at least one preceding sibling divider:

/doc/divider[2]/preceding-sibling::p[preceding-sibling::divider]

It gets a bit more complex, of course, if you want to find the paras between the second and third dividers: then you want something more like Daniel Haley's solution.
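
As a rough illustration of what that expression selects, the following Python sketch does the same thing by hand with the standard library (whose XPath subset has no `preceding-sibling` axis). The function name `p_before_second_divider` is hypothetical, and the sample adds one extra leading `<p>p0</p>` that is not in the question, purely to show what the `[preceding-sibling::divider]` predicate filters out.

```python
import xml.etree.ElementTree as ET

# Based on the question's XML, with one extra leading <p> (an assumption
# for illustration) placed before the first divider.
XML = """<doc>
    <p>p0</p>
    <divider />
    <p>p1</p>
    <p>p2</p>
    <p>p3</p>
    <p>p4</p>
    <p>p5</p>
    <divider />
    <p>p6</p>
    <p>p7</p>
    <divider />
    <p>p8</p>
    <divider />
</doc>"""


def p_before_second_divider(root):
    """Emulate /doc/divider[2]/preceding-sibling::p[preceding-sibling::divider]."""
    children = list(root)
    divider_pos = [i for i, c in enumerate(children) if c.tag == "divider"]
    if len(divider_pos) < 2:
        return []  # there is no divider[2], so nothing is selected
    first, second = divider_pos[0], divider_pos[1]
    return [
        child
        for i, child in enumerate(children[:second])  # preceding siblings of divider[2]
        if child.tag == "p" and i > first  # at least one divider comes before it
    ]


root = ET.fromstring(XML)
print([p.text for p in p_before_second_divider(root)])
```

Without the `i > first` condition, `p0` would be included as well, which is the case the predicate exists to exclude.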

Michael Kay
  • Allows start and end divider tags to vary so, for instance, one may select items between an `H1` and the first `TABLE`. Simple and flexible. – vhs Apr 24 '20 at 20:56

What about selecting all p elements that have exactly one divider element as a preceding sibling?

//doc/p[preceding-sibling::divider[1] and not (preceding-sibling::divider[2])]
bytebuster