So far my regex captures everything I need that ends with 'em'. I also need it to capture paragraphs ending in 'ppp'.

My regex:

%<h2>Storyline</h2>(.*)em%s
el_pup_le

1 Answer

I would advise not to parse HTML with regex, but this seems easy enough seeing as you aren't actually parsing it as HTML...

%<h2>Storyline</h2>(.*?)(?:em|ppp)%s
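The pattern above makes two changes to the one in the question: `(.*?)` is lazy, so the capture stops at the first terminator instead of the last, and `(?:em|ppp)` is a non-capturing alternation that accepts either ending. A minimal sketch of how it behaves, written in Python rather than PHP, with `re.DOTALL` standing in for the trailing `s` modifier; the function name and the sample strings are made up for illustration:

```python
import re

# Python translation of %<h2>Storyline</h2>(.*?)(?:em|ppp)%s
# re.DOTALL plays the role of the `s` modifier: `.` also matches newlines.
PATTERN = re.compile(r"<h2>Storyline</h2>(.*?)(?:em|ppp)", re.DOTALL)


def capture_storyline(html):
    """Return the text between the heading and the first 'em' or 'ppp', or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical inputs, not taken from the original page.
sample_em = "<h2>Storyline</h2>\nA hero rises.\nem<p>rest</p>"
sample_ppp = "<h2>Storyline</h2>\nA villain falls.\nppp<p>rest</p>"

print(repr(capture_storyline(sample_em)))   # '\nA hero rises.\n'
print(repr(capture_storyline(sample_ppp)))  # '\nA villain falls.\n'
```

Because the quantifier is lazy, the capture also stops early if the story text itself contains "em" (as in "them"), so the terminator probably needs to be more specific on real pages.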
BoltClock
  • Why shouldn't HTML be parsed with regex? – el_pup_le Jan 09 '11 at 11:44
  • If your HTML consists of a very simple format or a one-liner, then there is nothing wrong with using regex. However, if the structure is unpredictable or large, you're much better off using a parser, like `DOMDocument`, which will handle all the parsing of the markup for you so you can focus on getting the information from the markup. – BoltClock Jan 09 '11 at 11:46
  • @aLk see [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662) – Gordon Jan 09 '11 at 11:53
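
To illustrate the parser approach suggested in the comments, here is a rough sketch that uses Python's standard `html.parser` module as a stand-in for `DOMDocument`. It assumes the storyline is whatever text sits between `<h2>Storyline</h2>` and the next `<h2>`, which may not match the real page; the class name, function name, and sample markup are all hypothetical:

```python
from html.parser import HTMLParser


class StorylineExtractor(HTMLParser):
    """Collect text that follows <h2>Storyline</h2> up to the next <h2>."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False       # currently inside an <h2> element
        self._heading = []        # text pieces of the current heading
        self._capturing = False   # inside the Storyline section
        self.chunks = []          # captured text pieces

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self._heading = []
            self._capturing = False  # a new heading ends the previous section

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False
            # Start capturing only if the heading just closed reads "Storyline".
            self._capturing = "".join(self._heading).strip() == "Storyline"

    def handle_data(self, data):
        if self._in_h2:
            self._heading.append(data)
        elif self._capturing:
            self.chunks.append(data)


def extract_storyline(html):
    """Return the Storyline section's text with whitespace normalised."""
    parser = StorylineExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(c.strip() for c in parser.chunks if c.strip())


# Hypothetical markup, assumed for illustration only.
sample = (
    "<h1>Film</h1><h2>Storyline</h2><p>A hero rises.</p>"
    "<p>Then falls.</p><h2>Cast</h2><p>Nobody</p>"
)
print(extract_storyline(sample))  # A hero rises. Then falls.
```

Unlike the regex, this does not depend on a literal terminator such as 'em' or 'ppp'; the section ends wherever the markup says the next heading begins.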