1

I just want to remove comments and whitespace from an HTML string before saving it in the DB. I don't want the markup to be repaired or have head tags etc. added.

I've spent hours searching for this but can't find anything. Can someone who has done this tell me what config I need, and which PHP Tidy function will just "minify" an HTML string without trying to turn it into a valid HTML document?

RISC OS

2 Answers

0

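The example below may help. Note that it strips every tag as well as comments, so it returns plain text rather than minified HTML:

<?php
// Convert an HTML string to plain text by removing scripts, styles, comments and tags.
function html2txt($document) {
    // Patterns are applied in order, so blocks with content come before the generic tag pattern.
    $search = array(
        '@<script[^>]*?>.*?</script>@si',  // Strip out JavaScript
        '@<style[^>]*?>.*?</style>@si',    // Strip style tags together with their contents
        '@<![\s\S]*?--[ \t\n\r]*>@',       // Strip multi-line comments including CDATA
        '@<[\/\!]*?[^<>]*?>@si',           // Strip out the remaining HTML tags
    );
    $text = preg_replace($search, '', $document);
    return $text;
}
?>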

You can get more info on http://php.net/manual/en/function.strip-tags.php
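
For comparison, here is a minimal sketch of the built-in strip_tags() function from the page linked above. The sample string is made up for illustration. It shows that strip_tags() removes comments along with tags, and that the optional second argument lets you keep selected tags:

```php
<?php
// Hypothetical sample input, used only for illustration.
$html = '<p>Hello <b>world</b></p><!-- a comment -->';

// Without a second argument, all tags and comments are removed.
echo strip_tags($html), PHP_EOL;         // Hello world

// The second argument lists tags to keep; comments are still removed.
echo strip_tags($html, '<p>'), PHP_EOL;  // <p>Hello world</p>
```

Either way the result loses markup, so this suits text extraction more than minifying.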

Suresh Kamrushi
0

You can try this. The function below removes HTML comments and the whitespace between tags:

      function remove_html_comments_white_spaces($content = '') {
          // Remove comments first so the whitespace they leave behind is collapsed too.
          $content = preg_replace('/<!--.*?-->/s', '', $content);
          // Remove whitespace-only runs between adjacent tags.
          $content = preg_replace('~>\s+<~', '><', $content);

          return $content;
      }

If you also want to remove the tags, you can use PHP's built-in strip_tags() function.
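
As a rough usage sketch of the regex approach, the snippet below runs a small fragment through a comment and whitespace stripper. The function name minify_html_fragment and the sample input are hypothetical, and leaving `<!--[if ...]>` conditional comments in place is an assumption about what you might want, not something the question asked for:

```php
<?php
// Hypothetical helper; a sketch of regex-based minifying, not a full HTML minifier.
function minify_html_fragment(string $html): string
{
    // Remove comments, but leave conditional comments (<!--[if ...]>) alone.
    $html = preg_replace('/<!--(?!\[if\b).*?-->/s', '', $html);
    // Drop whitespace-only runs between adjacent tags.
    $html = preg_replace('/>\s+</', '><', $html);
    // Trim leading and trailing whitespace of the whole fragment.
    return trim($html);
}

$fragment = "<div>\n  <!-- note -->\n  <p>Hello   world</p>\n</div>\n";
echo minify_html_fragment($fragment), PHP_EOL;
// prints: <div><p>Hello   world</p></div>
```

Nothing is repaired or wrapped in html/head/body tags because the string is never parsed as a document. The trade-off is that whitespace inside pre or textarea elements, or between inline elements, is treated like any other, which can change how the page renders.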

Krish R