code:
$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';
$str = preg_replace('/\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $str);
echo $str;
output:
<img src="" />
I need this output:
<img src="http://placehold.it/350x150" />
How can I do it?
Thanks for the help.
I also think that DOMDocument and DOMXPath are the preferable tools for parsing HTML markup.
But for your particular case, here is a solution using a regex negative lookbehind assertion:
$str = 'http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua';
$str = preg_replace('/(?<!src=\")(https|http):\/\/[^\s]+\b/i', '', $str);
print_r(trim($str)); // <img src="http://placehold.it/350x150" /> (trim() drops the spaces left where the URLs were)
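This removes every URL except those immediately preceded by src=" (that is, those inside an img src attribute). Note that the lookbehind only checks for that exact prefix, so a URL in an href attribute or in a single-quoted src would still be removed.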
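For comparison, here is a minimal sketch of the DOMDocument and DOMXPath approach mentioned above. It strips URLs from text nodes only, so attribute values are never touched. The function name stripUrlsOutsideTags, the wrapper div, and the simplified URL pattern are assumptions made for illustration; they do not come from the question.

```php
<?php
// Hypothetical helper: removes http/https URLs from text nodes only.
function stripUrlsOutsideTags(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a div so it has a single root element.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $textNode) {
        // Simplified URL pattern (an assumption, not the pattern from the question).
        $textNode->nodeValue = preg_replace('~\bhttps?://\S+~i', '', $textNode->nodeValue);
    }

    // Serialize the children of the wrapper div, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return trim($out);
}

$out = stripUrlsOutsideTags('http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua');
echo $out;
```

Keep in mind that DOMDocument re-serializes the markup, so the tag may not come back byte-for-byte identical (for example, the self-closing slash may be dropped).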
Your pattern
/\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i
removes every URL in the string that starts with http or https. Applied to your string, it removes both the URL at the beginning of the string and the URL in the src attribute of <img>. To remove only the URL at the very beginning, anchor the pattern with ^ (note that URLs anywhere else in the string are then left untouched):
$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';
$str = preg_replace('/^\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $str);
echo $str;
Or simply get what you need like this:
/(<img.*\/>)/i
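As a rough sketch, the extraction idea above could be applied like this. It uses preg_match_all and swaps the greedy .* for [^>]* so that a match cannot run past the end of one tag when the string contains several; that change is my assumption, not part of the pattern as posted.

```php
<?php
$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';

// Collect every <img ...> tag; [^>]* stops at the first closing >.
preg_match_all('/<img\b[^>]*>/i', $str, $matches);

// $matches[0] holds the full matches; join them in case there are several.
$result = implode(' ', $matches[0]);
echo $result; // <img src="http://placehold.it/350x150" />
```

This keeps only the img tags and discards everything else, so it fits only if no other text in the string needs to survive.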
Try:
<[^>]*(*SKIP)(*FAIL)|\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]
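The <[^>]* part matches everything from a < up to (but not including) the closing >, and (*SKIP)(*FAIL) then discards that match and resumes the search after it, so URLs inside tags are never removed.
The part after the | is your original regex.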
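Here is a short sketch of that pattern plugged into preg_replace. The sample string with a second URL after the tag and the trim() call are additions for illustration.

```php
<?php
$str = 'http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua';

// First alternative: consume the inside of a tag, then skip it without matching.
// Second alternative: the URL pattern from the question.
$pattern = '/<[^>]*(*SKIP)(*FAIL)|\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';

// trim() removes the spaces left behind where the URLs used to be.
$result = trim(preg_replace($pattern, '', $str));
echo $result; // <img src="http://placehold.it/350x150" />
```

Unlike the lookbehind version, this protects URLs in any attribute of any tag, not only those preceded by src=".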