I am trying to remove any comments embedded in the HTML file:
$data = file_get_contents($stream);
$data = preg_replace('<!--*-->', '', $data);
echo $data;
I am still ending up with all the comments, e.g. <!-- bla bla bla -->
What am I doing wrong?
Regular expressions are very difficult to corral into doing what you want here.
To match arbitrary text in a regex, you need .*, not just *. Your expression is looking for <!-, followed by zero or more - characters, followed by -->.
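A minimal sketch of the difference, using a made-up sample string (the variable names are illustrative only). One additional detail: PHP accepts bracket-style delimiters, so the outer < and > of the original pattern are consumed as delimiters and the regex that actually runs is `!--*--`, which does not match an ordinary comment.

```php
<?php
// Hypothetical sample input, for illustration only.
$data = "<p>keep</p><!-- bla bla bla --><p>also keep</p>";

// Original pattern: the outer < and > act as delimiters, so the regex is
// "!--*--", i.e. "!-", zero or more "-", then "--". Nothing here matches.
$unchanged = preg_replace('<!--*-->', '', $data);

// With explicit "/" delimiters and ".*?" the comment body is matched.
// The "s" modifier lets "." span newlines.
$stripped = preg_replace('/<!--.*?-->/s', '', $data);

var_dump($unchanged === $data); // the original pattern changed nothing
echo $stripped, "\n";
```

The second call should leave only the two paragraphs, while the first returns the input untouched.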
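Try a different regular expression (shown here in Perl-style substitution syntax):
s/<!--[^>]*?-->//g
Note that because of [^>], a comment containing a > character will not be matched.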
The below regex will remove HTML comments, but will keep conditional comments.
<!--(?!<!)[^\[>].*?-->
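As a rough sketch, the pattern above could be used from PHP like this. The sample markup and variable names are made up for illustration, and the `s` modifier is an addition (an assumption, not part of the pattern as posted) so that multi-line comments are covered too.

```php
<?php
// Hypothetical sample with one ordinary comment and one conditional comment.
$html = '<!-- plain comment --><p>text</p>'
      . '<!--[if IE]><link href="ie.css"><![endif]-->';

// "~" delimiters avoid escaping "/"; the "s" modifier is added here so that
// "." also matches newlines inside multi-line comments.
$clean = preg_replace('~<!--(?!<!)[^\[>].*?-->~s', '', $html);

echo $clean, "\n";
```

The ordinary comment is removed because its first character is a space, while the conditional comment survives because `[^\[>]` refuses the `[` that follows `<!--`.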
You should do it this way:
$str = "<html><!-- this is a comment -->OK</html>";
// The s modifier lets . match newlines; .*? is lazy, so it stops at the first -->
$str2 = preg_replace('/<!--.*?-->/s', '', $str);
var_dump($str2);
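One caveat when a document contains more than one comment: a greedy `.*` and a lazy `.*?` give different results. A minimal sketch with made-up input (the variable names are illustrative):

```php
<?php
// Hypothetical input with two comments around text that should survive.
$str = "<html><!-- first -->OK<!-- second -->DONE</html>";

// Greedy: ".*" runs to the last "-->", so "OK" is removed too.
$greedy = preg_replace('/<!--.*-->/s', '', $str);

// Lazy: ".*?" stops at the first "-->", so each comment is removed separately.
$lazy = preg_replace('/<!--.*?-->/s', '', $str);

var_dump($greedy);
var_dump($lazy);
```

With the greedy form the text between the two comments is lost, so the lazy quantifier is usually the safer choice for this job.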
You could do it without using a regular expression:
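function strip_comments($html)
{
    // Drop the line break directly before a comment so no blank line is left behind.
    $html = str_replace(array("\r\n<!--", "\n<!--"), "<!--", $html);
    while (($pos = strpos($html, "<!--")) !== false)
    {
        if (($_pos = strpos($html, "-->", $pos)) === false)
            // Unterminated comment: cut everything from its start.
            $html = substr($html, 0, $pos);
        else
            // Splice out the comment, including the 3-character "-->".
            $html = substr($html, 0, $pos) . substr($html, $_pos + 3);
    }
    return $html;
}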
I know lots of answers have already been posted. I tried many of them, but this is the regular expression that worked for me on multi-line HTML comments (in my case, 40 lines of comments).
$string = preg_replace("~<!--(.*?)-->~s", "", $string);
Cheers :)