
I need to use a multiline regex to extract part of a text, but I also have to allow those parts to be nested.

Here's an example of my scenario:

<!--BEGIN DELIMITER(A)-->
    sample text
    <!--BEGIN DELIMITER(B)-->
        another sample text
    <!--END DELIMITER(B)-->
<!--END DELIMITER(A)-->

The expected result is to extract both blocks, including their begin/end lines, because the delimiter names (A and B in this example) are required for other things.

What happens is that I only get one match, which is the outer group.

Here's my regex:

<!--BEGIN DELIMITER\((?<expr>(\w+))\)-->(?<content>([\s\S]*?))<!--END DELIMITER\((\k<expr>.*)\)-->

Here's a regex101.com link for a quick visual reference.

Any idea on how to match both groups?

STT
  • Like [this](https://regex101.com/r/4lep7o/1). Just wrap with a lookahead and a capturing group. – Wiktor Stribiżew May 09 '17 at 15:52
  • That is not regular. Thus you can't use **Regular** Expressions to match this. It's effectively a different ["HTML syntax", as far as parsing goes](http://stackoverflow.com/a/1732454/1045510). – Kroltan May 09 '17 at 15:52
  • You can however build a more sophisticated parser that uses RegEx just for the regular parts (detecting the start and end "tags"), and then build a tree out of it. Then you can select the contents however you want. – Kroltan May 09 '17 at 15:55
  • @Kroltan, you could, I suppose. Remember that the regex engine of .NET supports balancing groups... –  May 09 '17 at 15:55
  • @elgonzo Yes, but it gets surprisingly messy surprisingly fast, and isn't perfect either. When in the future you want to extend the pattern, it's not going to be any fun. – Kroltan May 09 '17 at 15:57
  • This link might help you : http://www.rexegg.com/regex-cookbook.html#captureparen2 – Gawil May 09 '17 at 15:57
  • @Kroltan, about that I fully agree with you :) –  May 09 '17 at 15:59
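
The lookahead idea from the first comment can be sketched roughly as follows. This is a Python translation rather than the .NET pattern from the question, so the syntax (`(?P<name>...)`, `(?P=name)`) and the names `BLOCK`, `SAMPLE`, and `find_blocks` are assumptions made for illustration. Because the whole pattern sits inside a lookahead, each match consumes no characters, and the scan can find a second block that starts inside one already captured.

```python
import re

# The entire block pattern is wrapped in a lookahead plus a capturing group.
# The match itself is empty, so the engine moves on one character at a time
# and can still find blocks nested inside an earlier capture.
BLOCK = re.compile(
    r"(?=(?P<block><!--BEGIN DELIMITER\((?P<expr>\w+)\)-->"
    r"(?P<content>[\s\S]*?)"
    r"<!--END DELIMITER\((?P=expr)\)-->))"
)

SAMPLE = """<!--BEGIN DELIMITER(A)-->
    sample text
    <!--BEGIN DELIMITER(B)-->
        another sample text
    <!--END DELIMITER(B)-->
<!--END DELIMITER(A)-->"""


def find_blocks(text):
    """Return (name, full_block) pairs in order of their BEGIN position."""
    return [(m.group("expr"), m.group("block")) for m in BLOCK.finditer(text)]


blocks = find_blocks(SAMPLE)
for name, block in blocks:
    print(name, len(block))
```

One limitation of this sketch: if a block is nested inside another block with the same name, the lazy quantifier pairs the outer BEGIN with the inner END, so it only behaves well when nested names differ.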

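
The parser approach suggested in the comments (use a regex only to find the begin/end markers, then track nesting separately) might look something like the sketch below. It is written in Python for illustration, not taken from the thread; `TAG`, `SAMPLE`, and `extract_blocks` are hypothetical names, and raising `ValueError` on mismatched markers is an assumed policy.

```python
import re

# The regex only recognises a single marker; nesting is handled by the stack.
TAG = re.compile(r"<!--(?P<kind>BEGIN|END) DELIMITER\((?P<name>\w+)\)-->")

SAMPLE = """<!--BEGIN DELIMITER(A)-->
    sample text
    <!--BEGIN DELIMITER(B)-->
        another sample text
    <!--END DELIMITER(B)-->
<!--END DELIMITER(A)-->"""


def extract_blocks(text):
    """Return (name, block_text) for every closed block, innermost first."""
    stack = []   # (name, start_offset) of blocks that are still open
    blocks = []
    for m in TAG.finditer(text):
        name = m.group("name")
        if m.group("kind") == "BEGIN":
            stack.append((name, m.start()))
            continue
        # An END marker must close the most recently opened block.
        if not stack or stack[-1][0] != name:
            raise ValueError(f"unexpected END DELIMITER({name}) at offset {m.start()}")
        _, start = stack.pop()
        blocks.append((name, text[start:m.end()]))
    if stack:
        raise ValueError(f"unclosed block {stack[-1][0]!r}")
    return blocks


result = extract_blocks(SAMPLE)
for name, block in result:
    print(name, len(block))
```

Unlike a single lazy pattern, this version also copes with a block nested inside another block of the same name, since pairing is decided by the stack rather than by the first matching END marker.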