I know how to find an img tag within a string, but I need to exclude any img tag that has a .gif extension in it. How do I use negation in my preg_match? I only need the first image tag that does not contain a .gif extension.
I currently have this:
$text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
$pattern = "/<img[^>]+\>/i";
preg_match($pattern, $text, $matches);
$text = $matches[0];
$text will give me the first tag, e.g. <img src="something.gif" border="0" />
However, I do not want to accept .gif, so if the first tag is a gif, it should be skipped and the search should continue with the other img tags.
Please advise me how to change my code to do that.
Thanks a bunch!
Don't do it that way. Attempting to parse HTML with regex is a task doomed to failure, since a slight increase in the complexity of the HTML or the requirement will make your regex unbelievably complicated.
The best way is to use a tool designed for the task: the DOMDocument class.
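$dom = new DOMDocument;
$dom->loadHTML($text);
$images = $dom->getElementsByTagName('img');
$firstNonGif = null;
foreach ($images as $image) {
    // Compare the last four characters of src, ignoring case.
    // Note: "!substr(...) === '.gif'" would negate first and never be true.
    if (strtolower(substr($image->getAttribute('src'), -4)) !== '.gif') {
        $firstNonGif = $image;
        break;
    }
}
// $firstNonGif is now the first image whose src doesn't end with .gif,
// or null if every image is a gif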
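If it helps, the same DOMDocument idea can be wrapped in a small function. This is only a sketch under a few assumptions: the name firstNonGifSrc is made up for illustration, the input is a non-empty HTML string, and "is a gif" is decided by the extension of the URL path (so a query string such as ?v=2 is ignored). libxml warnings about sloppy markup are collected internally rather than printed.

```php
<?php
// Hypothetical helper (not part of PHP): returns the src of the first <img>
// whose path does not end in .gif, or null if there is no such image.
function firstNonGifSrc(string $html): ?string
{
    $dom = new DOMDocument();

    // Real-world HTML is often invalid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');

        // Look only at the path so "a.gif?v=2" still counts as a gif.
        $path = (string) parse_url($src, PHP_URL_PATH);
        $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

        if ($src !== '' && $extension !== 'gif') {
            return $src;
        }
    }

    return null;
}
```

Calling firstNonGifSrc($text) on the decoded string from the question would then give either the src of the first non-gif image or null, which is easier to check than relying on the loop variable after the loop ends.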
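Try changing your pattern to something like this if you still want to use a regular expression.
<?php
$text = '<img src="something.jpg" ';
// The dot must be escaped (\.) to match a literal "." before the extension.
// This pattern assumes src is the first attribute of the img tag.
$pattern = '/<img\s+src="(([^"]+)(\.)(jpeg|png|jpg))"/';
preg_match_all($pattern, $text, $out);
echo '<pre>';
print_r($out);
echo '</pre>';
?>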
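Try this:
<?php
$text = '<img src="something.jpg" ';
// [^"]* keeps the match inside a single src attribute; a greedy .* could run across several tags
preg_match('/src="(?P<image>[^"]*\.(jpeg|png|jpg))"/', $text, $matches);
echo $matches['image'];
?>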
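Since the question asks specifically about negation, here is a sketch of the same search using a negative lookahead instead of a whitelist of extensions. The function name firstNonGifTag is hypothetical, and the pattern assumes that no attribute value contains a literal ">" and that any ".gif" inside the tag means the tag should be skipped. It has the usual fragility of regex applied to HTML.

```php
<?php
// Hypothetical helper: returns the first <img ...> tag that does not
// contain ".gif" anywhere inside the tag, or null if there is none.
function firstNonGifTag(string $html): ?string
{
    // (?![^>]*\.gif\b) is a negative lookahead: right after "<img", it
    // rejects the tag if ".gif" appears anywhere before the closing ">".
    $pattern = '/<img\b(?![^>]*\.gif\b)[^>]*>/i';

    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[0];
    }

    return null;
}

$text = '<img src="something.gif" border="0" /><img src="photo.jpg" />';
echo firstNonGifTag($text);
```

When the lookahead fails on a gif tag, preg_match simply moves on and tries the next position, so the first tag that passes is the one returned.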