XPath select all elements between two specific elements

Question

I have the following XML:

<doc>
    <divider />
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <divider />
</doc>

I want to select all p nodes after the first divider element, up to the next divider element. I tried the following XPath:

//divider[1]/following-sibling::p[following::divider]

but the problem is that it selects every p element before the last divider element. I'm not sure how to do this in XPath 1.0.

score 36 · Accepted Answer · answered Jun 02 '12 at 04:50

Same concept as bytebuster's answer, but a different XPath:

/*/p[count(preceding-sibling::divider)=1]
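
To see what the predicate computes, here is a minimal Python sketch that emulates it using only the standard library. It is an emulation rather than a real XPath evaluation, because `xml.etree.ElementTree` implements only a subset of XPath, without `count()` or the `preceding-sibling` axis. The numbered paragraph texts and the helper name `paragraphs_after_divider` are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Sample from the question; the paragraph texts are numbered here
# (an assumption) so the selected nodes can be told apart.
XML = """<doc>
    <divider />
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
    <divider />
    <p>6</p>
    <p>7</p>
    <divider />
    <p>8</p>
    <divider />
</doc>"""


def paragraphs_after_divider(root, k):
    """Emulate /*/p[count(preceding-sibling::divider) = k]."""
    selected = []
    dividers_seen = 0  # running value of count(preceding-sibling::divider)
    for child in root:  # children are visited in document order
        if child.tag == "divider":
            dividers_seen += 1
        elif child.tag == "p" and dividers_seen == k:
            selected.append(child)
    return selected


root = ET.fromstring(XML)
print([p.text for p in paragraphs_after_divider(root, 1)])
print([p.text for p in paragraphs_after_divider(root, 2)])
```

Like the XPath itself, this would also pick up paragraphs that follow the last divider when no closing divider comes after them, since the predicate only counts dividers that precede each p.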

answered by Daniel Haley

Great idea! `count` is more idiomatic as the `1` is used only once. – bytebuster Jun 02 '12 at 05:14

Dimitre Novatchev · Answer 2 · 2012-06-02T05:09:42.843

Here is a general XPath expression:

/*/divider[$k]
    /following-sibling::p
       [count(.|/*/divider[$k+1]/preceding-sibling::p)
       =
        count(/*/divider[$k+1]/preceding-sibling::p)
       ]

If you substitute $k with 1, then exactly the wanted p nodes are selected.

If you substitute $k with 2, then all p elements between the 2nd and 3rd divider are selected, and so on.

Explanation:

This is a simple application of the Kayessian XPath 1.0 formula for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

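selects all the nodes that belong to both node-sets $ns1 and $ns2. It works because the union of a node with $ns2 has the same size as $ns2 exactly when that node is already a member of $ns2.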

In this specific case we substitute $ns1 with:

/*/divider[$k]/following-sibling::p

and we substitute $ns2 with:

/*/divider[$k+1]/preceding-sibling::p
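
The substitution can be traced step by step in a small Python sketch. This is a plain-Python emulation under stated assumptions, not an XPath evaluation: Python sets stand in for XPath node-sets, the set operator `|` plays the role of the XPath union, the paragraph texts are numbered for readability, and the function name `between_dividers` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Same shape as the question's document; texts are numbered (an assumption).
XML = (
    "<doc><divider/><p>1</p><p>2</p><p>3</p><p>4</p><p>5</p>"
    "<divider/><p>6</p><p>7</p><divider/><p>8</p><divider/></doc>"
)


def between_dividers(root, k):
    """Emulate $ns1[count(.|$ns2) = count($ns2)] for the k-th divider."""
    children = list(root)
    divider_pos = [i for i, c in enumerate(children) if c.tag == "divider"]
    if k < 1 or k + 1 > len(divider_pos):
        # divider[$k] or divider[$k+1] is missing, so one node-set is empty
        # and the intersection is empty as well.
        return []
    start, end = divider_pos[k - 1], divider_pos[k]

    # $ns1: /*/divider[$k]/following-sibling::p
    ns1 = [c for c in children[start + 1:] if c.tag == "p"]
    # $ns2: /*/divider[$k+1]/preceding-sibling::p
    ns2 = {c for c in children[:end] if c.tag == "p"}

    # Keep a node from $ns1 only if adding it to $ns2 does not grow $ns2.
    return [n for n in ns1 if len({n} | ns2) == len(ns2)]


root = ET.fromstring(XML)
print([p.text for p in between_dividers(root, 1)])
print([p.text for p in between_dividers(root, 2)])
```

Elements are compared by identity here, which matches how XPath node-sets treat two p elements with equal text as distinct nodes.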

score 8 · Answer 3 · edited Oct 05 '16 at 23:36

I think there's a much simpler and probably faster solution: you want all preceding siblings of the second divider that have at least one preceding sibling divider:

/doc/divider[2]/preceding-sibling::p[preceding-sibling::divider]

It gets a bit more complex, of course, if you want to find the paras between the second and third dividers: then you want something more like Daniel Haley's solution.
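
As a rough illustration of the idea, the following Python sketch emulates the expression with the standard library and lets the start and end element names differ. It is an assumption-laden emulation, not an XPath evaluation: the function name `items_between`, its parameters, and the second sample document with `h1` and `table` elements are all hypothetical.

```python
import xml.etree.ElementTree as ET


def items_between(root, start_tag, end_tag, end_position=1, item_tag="p"):
    """Emulate /*/END[end_position]/preceding-sibling::ITEM[preceding-sibling::START]."""
    children = list(root)
    end_hits = [i for i, c in enumerate(children) if c.tag == end_tag]
    if len(end_hits) < end_position:
        return []  # the requested end element does not exist
    end = end_hits[end_position - 1]

    selected = []
    start_seen = False  # truth value of [preceding-sibling::START]
    for child in children[:end]:  # only siblings before the end element
        if child.tag == item_tag and start_seen:
            selected.append(child)
        if child.tag == start_tag:
            start_seen = True
    return selected


# The question's layout, with numbered texts (an assumption).
doc = ET.fromstring(
    "<doc><divider/><p>1</p><p>2</p><p>3</p><p>4</p><p>5</p>"
    "<divider/><p>6</p><p>7</p><divider/><p>8</p><divider/></doc>"
)
print([p.text for p in items_between(doc, "divider", "divider", end_position=2)])

# Hypothetical document with different start and end elements.
page = ET.fromstring(
    "<body><p>a</p><h1>t</h1><p>b</p><p>c</p><table/><p>d</p></body>"
)
print([p.text for p in items_between(page, "h1", "table")])
```

With the same element as start and end, passing a larger `end_position` would return everything after the first divider rather than a single section, which is the complication the answer mentions for later sections.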

edited by Daniel Haley

answered Jun 02 '12 at 17:40 by Michael Kay

Allows start and end divider tags to vary so, for instance, one may select items between an `H1` and the first `TABLE`. Simple and flexible. – vhs Apr 24 '20 at 20:56

score 5 · Answer 4 · answered Jun 02 '12 at 04:21

What about selecting all p elements that have exactly one divider element as a preceding sibling?

//doc/p[preceding-sibling::divider[1] and not (preceding-sibling::divider[2])]

answered by bytebuster
