Extract text from URL?

I'm trying this pattern with preg_match:

/\<a href=([^"]*) .?\>([^\<\/a]*)\<\/a\>+/

It doesn't work on:

<a href="_first.asp?FileName=37479676820111216064143">        
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>

I'm sure there's something wrong with ([^\<\/a]*), but I'm not good at regex and can't even find a good tutorial.

Oliver Charlesworth
Rami Dabain

2 Answers

At the very start, you have href=, then any number of non-quote characters (which is zero in your example, since the next character is a quote), and then a literal space. That space is where your expression fails: the next character in the input is a quote, not a space.

In any case, while this is doable with regexps as long as the structure doesn't change, regexps are not really the right tool for it; a real HTML parser copes with changes in the markup far better.
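For what it's worth, here is a minimal sketch of the parser route using PHP's bundled DOMDocument class. It assumes the dom extension is enabled, and the variable names ($html, $links) are made up for illustration. The markup is the snippet from the question.

```php
<?php
// Sample markup copied from the question.
$html = <<<'HTML'
<a href="_first.asp?FileName=37479676820111216064143">
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>
HTML;

// Collect parser warnings internally instead of printing them;
// real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Map each link's href to its visible text (nested tags are ignored).
$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

print_r($links);
```

Since textContent concatenates the text of all descendants, the nested font tag (or any other wrapper) doesn't matter, which is the main advantage over a pattern tied to one exact layout.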

Amadan
Maybe:

/^<a[^>]+>(?:\s*<[^>]+>)*\s*([^<]+)(?:\s*<\/[^>]+>)*\s*<\/a>$/m

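will work? It skips any tags opened right after the <a ...> tag, captures the first run of text, then skips the closing tags up to </a>. The captured group includes surrounding whitespace, so trim it.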
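A quick way to try the pattern above against the markup from the question is something like the following sketch. The variable names are just for illustration, and it only covers this one layout (one link, text wrapped in nested tags).

```php
<?php
// Sample markup copied from the question.
$html = <<<'HTML'
<a href="_first.asp?FileName=37479676820111216064143">
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>
HTML;

// The pattern from this answer. Single quotes keep the backslashes intact.
$pattern = '/^<a[^>]+>(?:\s*<[^>]+>)*\s*([^<]+)(?:\s*<\/[^>]+>)*\s*<\/a>$/m';

$text = null;
if (preg_match($pattern, $html, $m) === 1) {
    // Group 1 still carries the trailing newline before </font>.
    $text = trim($m[1]);
}

var_dump($text);
```

If the link contains more than one text node, or further markup sits between the text and the closing tags, the pattern will not match and $text stays null.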

fge