
Hi, I have an HTML document like this:

<html>
   <head>
     <title>
          Some title
   </title>
</head>
<body>
    <div id="one">         some sample info </div>
</body>
</html>

How can I use preg_replace with a regex to remove the whitespace in this HTML, except for the whitespace inside the text content and inside the tags themselves, so that I get something like this:

<html><head><title>Some title</title></head><body><div id="one">some sample info</div></body></html>

Can anyone please help me with this? :)

Shades88

1 Answer


You can replace every match of (?<=>)\s+(?=<)|(?<=>)\s+(?!<)|(?<!>)\s+(?=<) with an empty string.

Edit: There's a simpler form: replace (?<=>)\s+|\s+(?=<)

Simply put, this regex matches a run of one or more whitespace characters if it has a > immediately to its left or a < immediately to its right.

It actually has two parts joined by OR (symbol: |), so either one may match:

  1. (?<=>)\s+ - this will match one or more whitespace characters (\s+ in the regex) if they are preceded by a > (in regex: the lookbehind (?<=>)).

  2. \s+(?=<) - this will match one or more whitespace characters if they are followed by a < (in regex: the lookahead (?=<)).
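For reference, here is a minimal sketch of how the simpler pattern could be used with preg_replace. The variable names ($html, $pattern, $minified) are only illustrative, and the input is copied from the question. It assumes the markup has no whitespace-sensitive parts such as <pre> blocks or inline scripts, which this pattern would also squeeze.

```php
<?php
// Sample input taken from the question (variable names are illustrative).
$html = <<<HTML
<html>
   <head>
     <title>
          Some title
   </title>
</head>
<body>
    <div id="one">         some sample info </div>
</body>
</html>
HTML;

// Part 1: whitespace directly after a '>'  -> (?<=>)\s+
// Part 2: whitespace directly before a '<' -> \s+(?=<)
$pattern = '/(?<=>)\s+|\s+(?=<)/';

// Replace every match with an empty string.
$minified = preg_replace($pattern, '', $html);

echo $minified, PHP_EOL;
// prints: <html><head><title>Some title</title></head><body><div id="one">some sample info</div></body></html>
```

Whitespace between words ("Some title") and between a tag name and its attributes (div id="one") is left alone, because neither has a > directly before it or a < directly after it.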

Learn more about regex.

Sufian Latif