I don't have access to an HTML parser on my server, so I need to do this via RegEx and PHP. I want to match all occurrences of linked images of a certain class within a large content string.
Here's a sample taken out of the larger content string that I want to match:
<a href='url'><img width="150" height="150" src="url" class="attachment-thumbnail" alt="Description" /></a>
This pattern seems to match class="attachment-thumbnail", including when other classes appear alongside it:
(class=("|"([^"]*)\s)attachment-thumbnail("|\s([^"]*)"))
This one seems to match everything from the opening <a> tag to the closing </a>, but it also catches other linked images in the larger content string that don't have class="attachment-thumbnail":
/(<a[^>]*)(href=)([^>]*?)(><img[^>]*></a>)/igm
How can I combine the above two to match only those HREFed images of class="attachment-thumbnail"?
Thanks for your help.
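PHP's DOM extension (DOMDocument) is enabled by default in most builds, so you may have an HTML parser after all. Try something like the following: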
$html = '<a href="http://www.google.com"><img width="150" height="150" src="url" class="attachment-thumbnail" alt="Description" /></a>';
$doc = new DOMDocument();
$doc->loadHTML($html);
foreach ($doc->getElementsByTagName('img') as $item)
{
    // Exact comparison: matches only when this is the image's sole class.
    if ($item->getAttribute('class') == 'attachment-thumbnail')
    {
        echo $item->getAttribute('src');
    }
}
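To get at the linked images the question asks about, rather than just each image's src, one option is an XPath query that selects every <a> wrapping a matching <img>. This is only a sketch: the function name find_linked_thumbnails and the sample $content are made up for illustration, and it assumes the DOM extension is loaded.

```php
<?php
// Hypothetical helper: returns the HTML of every <a href> that directly wraps
// an <img> whose class list contains the class "attachment-thumbnail".
function find_linked_thumbnails(string $content): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world content is rarely perfectly valid HTML.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Padding the class list with spaces matches whole class names only,
    // so "attachment-thumbnail-large" would not be selected.
    $query = '//a[@href][img[contains(concat(" ", normalize-space(@class), " "), " attachment-thumbnail ")]]';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $dom->saveHTML($a); // serialized <a>...</a>
    }
    return $links;
}

// Assumed sample content: two linked images, only the second has the class.
$content = '<p>Intro</p>'
         . '<a href="one.html"><img src="1.png" class="size-full" alt="" /></a>'
         . '<a href="two.html"><img src="2.png" class="alignleft attachment-thumbnail" alt="Description" /></a>';

$links = find_linked_thumbnails($content);
echo implode("\n", $links), "\n";
```

Note that saveHTML() re-serializes the markup, so the returned strings may differ slightly from the original text (quote style, self-closing slashes).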
To remove all elements that match the class 'attachment-thumbnail':
$html = '<a href="http://www.google.com"><img width="150" height="150" src="url" class="attachment-thumbnail" alt="Description" /></a>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every <img> whose class attribute contains the class name.
foreach ($xpath->query('//img[contains(attribute::class,"attachment-thumbnail")]') as $elem)
{
    $elem->parentNode->removeChild($elem);
}
// Serialize the document that was just modified.
echo $dom->saveHTML($dom->documentElement);
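If the DOM extension really is unavailable, the two patterns from the question can be combined by putting the class check in a lookahead inside the <img> part of the link pattern. The following is a rough sketch, not a robust solution: the function name find_linked_thumbnails_regex and the sample input are made up, and it assumes the <img> is the only thing inside the link, that the class attribute is quoted, and that no attribute value contains a ">" character.

```php
<?php
// Hypothetical helper: regex fallback for servers without the DOM extension.
function find_linked_thumbnails_regex(string $content): array
{
    $pattern = '~<a\b[^>]*\shref=[^>]*>\s*'   // opening <a ... href=...>
             . '<img\b'                       // start of the image tag
             // Lookahead: somewhere inside this <img> tag there is a quoted
             // class attribute listing attachment-thumbnail as a whole name.
             . '(?=[^>]*\sclass=(["\'])(?:[^"\']*\s)?attachment-thumbnail(?:\s[^"\']*)?\1)'
             . '[^>]*>\s*</a>~i';             // rest of <img ...>, then </a>

    preg_match_all($pattern, $content, $matches);
    return $matches[0]; // full text of each matching <a>...</a>
}

// Assumed sample content: only the second link has the wanted class.
$sample = '<a href="a.html"><img src="1.png" class="size-full" /></a>' . "\n"
        . '<a href=\'b.html\'><img width="150" src="2.png" class="alignleft attachment-thumbnail" alt="x" /></a>';

$found = find_linked_thumbnails_regex($sample);
echo implode("\n", $found), "\n";
```

PHP's preg functions have no g modifier; preg_match_all() already returns every match, so only the i flag is used here.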