I used <a[^>]*?title=\"([^\"]*?)\"[^>]*?> and found all links with a title attribute. How can I find all links that have no title attribute or an empty one? And how can I find all images whose alt attribute is empty or missing?

Jared Farrish
    What happens if I define the title with the use of an SGML parsed entity? Huh? Huh? (HTML is far nastier than it appears to be at first glance, as Fredrik points out. **Use a proper parser.** We mean it.) – Donal Fellows May 27 '11 at 20:39

2 Answers

2

See this classic post on SO: "RegEx match open tags except XHTML self-contained tags".

Fredrik Pihl
2

@Fredrik has you covered pretty well, I think, but here's an alternate general method for this kind of sophisticated find/replace in markup.

Since I'm no regex guru, I like to use jQuery + browser debugger tools + copy/pasting for this kind of thing. I view the page in Firefox (Chrome dev tools work great, too), open up the Firebug console, and do the manipulation with jQuery, something like this:

$('a').each(function () {
  var $link = $(this);
  if (!$link.is('[title]')) {
    // no title attribute at all
  } else if ($link.attr('title') === '') {
    // title attribute is present but empty
  }
  // etc.
});

// repeat pattern for imgs...
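
If all you need is the list of offenders rather than per-element branching, the same checks can probably be collapsed into attribute selectors. The following is only a sketch using the browser's standard querySelectorAll; the function name findUnlabelled is made up for illustration, and it assumes "empty" means exactly title="" or alt="" (whitespace-only values are not caught).

```javascript
// Sketch: collect links and images whose title/alt attribute is missing or empty.
// `findUnlabelled` is a hypothetical helper name, not part of jQuery or the DOM.
// `root` is assumed to be a Document or Element (anything with querySelectorAll).
function findUnlabelled(root) {
  // :not([title]) matches elements that have no title attribute at all;
  // [title=""] matches elements whose title attribute is present but empty.
  const links = Array.from(
    root.querySelectorAll('a:not([title]), a[title=""]')
  );
  // Same idea for images, checking alt instead of title.
  const images = Array.from(
    root.querySelectorAll('img:not([alt]), img[alt=""]')
  );
  return { links, images };
}

// In a browser console (guarded so the snippet also loads outside a browser):
if (typeof document !== 'undefined') {
  const found = findUnlabelled(document);
  console.log(found.links.length + ' links and ' +
              found.images.length + ' images need attention');
}
```

The returned arrays hold the actual elements, so they can be inspected or edited in the console the same way as inside the each() loop above.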

When you're done with your manipulations, copy the relevant section from the debugger (or just grab the whole <body>) and paste it back into your editor.

I find this method much easier to understand than regexes, but that's just because I'm not too bright. HTH.

peteorpeter