
I am trying to remove any comments embedded in an HTML file:

$data = file_get_contents($stream);
$data = preg_replace('<!--*-->', '', $data);
echo $data;

I am still ending up with all the comments, e.g. <!-- bla bla bla -->.
What am I doing wrong?

  1. Regular expressions are very difficult to corral into doing what you want here.

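  2. To match arbitrary text in a regex, you need .*, not just *. Read as a regex, <!--*--> looks for <!-, followed by zero or more - characters, followed by -->. In PHP there is a second problem: preg_replace() expects delimiters around the pattern, and the outer < and > are taken as those delimiters, so the pattern that actually runs is !--*--.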

s/<!--[^>]*?-->//g
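For anyone applying that substitution from PHP, here is a minimal sketch of a rough equivalent. The function name strip_simple_comments is hypothetical, and the / delimiters are an assumption. Note that [^>] refuses to cross a > character, so a comment whose text contains > is left in place.

```php
<?php
// Hypothetical helper: a PHP port of the s/<!--[^>]*?-->//g substitution.
function strip_simple_comments(string $html): string
{
    // [^>]*? matches any run of characters except '>' (newlines included),
    // so it handles multi-line comments but not comments containing '>'.
    return preg_replace('/<!--[^>]*?-->/', '', $html);
}

echo strip_simple_comments("<p>a</p><!-- note -->b"), "\n";  // comment removed
echo strip_simple_comments("x<!-- a > b -->y"), "\n";        // comment left in place
```

If comments in your input may contain >, one of the .*? patterns from the other answers is probably a better fit.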

Switch up the regular expression.

The regex below removes HTML comments but keeps conditional comments.

<!--(?!<!)[^\[>].*?-->
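A minimal sketch of that pattern in use from PHP. The function name strip_plain_comments is hypothetical, and the ~ delimiters and the s modifier (so that . also matches newlines) are assumptions added here, not part of the pattern as posted.

```php
<?php
// Hypothetical helper wrapping the pattern above.
function strip_plain_comments(string $html): string
{
    // (?!<!)   skips comments that open with "<!--<!"
    // [^\[>]   skips comments whose first character is "[" (e.g. <!--[if IE]>) or ">"
    return preg_replace('~<!--(?!<!)[^\[>].*?-->~s', '', $html);
}

$html = "<!--[if IE]><p>IE only</p><![endif]--><p>A</p><!-- note -->";
echo strip_plain_comments($html), "\n";
// The conditional block is kept; only "<!-- note -->" is removed.
```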

You should do it this way:

$str = "<html><!-- this is a comment -->OK</html>";
// .*? is lazy, so each match ends at the nearest -->; the s flag lets . span newlines.
$str2 = preg_replace('/<!--.*?-->/s', '', $str);
var_dump($str2);
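One thing to watch with a pattern of this shape is whether the quantifier is greedy. A small sketch, using a made-up input with two comments, of how .* and .*? differ:

```php
<?php
$html = "<p>one</p><!-- a --><p>two</p><!-- b --><p>three</p>";

// Greedy: .* runs to the LAST "-->", so <p>two</p> is swallowed as well.
$greedy = preg_replace('/<!--.*-->/s', '', $html);

// Lazy: .*? stops at the FIRST "-->" after each "<!--".
$lazy = preg_replace('/<!--.*?-->/s', '', $html);

echo $greedy, "\n"; // <p>one</p><p>three</p>
echo $lazy, "\n";   // <p>one</p><p>two</p><p>three</p>
```

With a single comment in the input both forms give the same result, which is why the greedy version can appear to work in a quick test.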

You could do it without using a regular expression:

function strip_comments($html)
{
    // Drop the line break in front of a comment so removing it leaves no blank line.
    $html = str_replace(array("\r\n<!--", "\n<!--"), "<!--", $html);
    while(($pos = strpos($html, "<!--")) !== false)
    {
        if(($_pos = strpos($html, "-->", $pos)) === false)
            $html = substr($html, 0, $pos);
        else
            $html = substr($html, 0, $pos) . substr($html, $_pos+3);
    }
    return $html;
}

I know lots of answers have already been posted. I tried many of them, but this is the regular expression that worked for me for removing multi-line HTML comments (in my case, a comment 40 lines long).

$string = preg_replace("~<!--(.*?)-->~s", "", $string);

Cheers :)