
I need a regex that removes a full tag, from its opening tag to its closing tag, including the content in between.
For example, given the string:

var str = "Hello <script> console.log('script tag') </script> World";

I need the output:

"Hello  World" // with full script tag including inner content removed

I specifically want a regex-only solution, so I don't need browser tricks such as appending the string to an element.
Please let me know if this is not possible at all.

I tried this and its variations:

inputString.replace( /<\/?[^>]+(>|$)/g, "" );

But this does not achieve what I want: it removes only the tags themselves and leaves the inner content. What regex groups should I create in the expression?

I do not need to handle attributes like type="text/javascript", as they are already filtered out before I receive the string. No jQuery, please. (I need to store the regex as a property of my filter object.)

Help appreciated.

Jongware
Om Shankar
Required read for everyone who asks about parsing HTML with regular expressions: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Philipp Oct 28 '13 at 10:23

Have bookmarked that answer long back in my Stack folder, don't want to simply increase its view. I don't need a Chuck Norris RegEx for matching tags. My conditions are limited. Given a limited set of conditions, it can be done! – Om Shankar Oct 28 '13 at 10:28

1 Answer


This is a pure regex solution:

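var str = "Hello <script> console.log('script tag') </script> World";
// Capture the tag name with [^>]+ (anything up to the closing >), then lazily
// match up to the closing tag with the same name via the \1 backreference.
var repl = str.replace(/<([^>]+)>.*?<\/\1>/ig, '');
//=> "Hello  World"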

This assumes that there is no < or > between the opening and closing tags.
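Since the question mentions storing the regex as a property of a filter object, here is a minimal sketch of how that might look. It is a variation, not the pattern above: it targets only bare `<script>` tags (attributes are assumed to be filtered out already, as the question states) and uses `[\s\S]*?` instead of `.*?`, because `.` does not match line breaks in JavaScript. The names `filter`, `scriptTag` and `strip` are hypothetical.

```javascript
// Hypothetical filter object; the property and method names are illustrative only.
var filter = {
  // [\s\S]*? lazily matches any character, including line breaks,
  // up to the first closing </script> tag.
  scriptTag: /<script>[\s\S]*?<\/script>/gi,

  // Remove every <script>...</script> block from the input string.
  strip: function (input) {
    return input.replace(this.scriptTag, '');
  }
};

var str = "Hello <script> console.log('script tag') </script> World";
var multiline = "Hello <script>\nif (a < b) { run(); }\n</script> World";

console.log(filter.strip(str));       // "Hello  World"
console.log(filter.strip(multiline)); // "Hello  World"
```

This still relies on the limited conditions described in the question. It is not a general HTML sanitizer, and a script body that itself contains the text `</script>` would end the match early.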

anubhava