I have this regex in PHP:
$regex = '/<img[^>]*'.'src=[\"|\'](.*)[\"|\']/Ui';
It captures all image tag sources in a string, but I want to capture only JPG files. I've tried to mess around with (.*) but I've only proven that I suck at regex... Right now I'm filtering the resulting array, but that feels too much like a hack when I could do it directly with a proper match.
Try this:
$regex = '/<img ([^>]* )?src=[\"\']([^\"\']*\.jpe?g)[\"\']/Ui';
I also removed the extra | in the character classes. It was not needed: inside a character class, | is a literal pipe, not alternation.
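As a quick check, the pattern above can be run through preg_match_all against a sample string. This is only a sketch: the markup and the variable names $html and $jpgs are made up for illustration. Note that the URLs land in capture group 2, because group 1 is the optional block of attributes before src.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<p><img src="a.jpg"> <img class="x" src=\'b.JPEG\' alt=""> <img src="c.png"></p>';

$regex = '/<img ([^>]* )?src=[\"\']([^\"\']*\.jpe?g)[\"\']/Ui';

// $matches[0] holds the full matches, $matches[2] the captured src values.
preg_match_all($regex, $html, $matches);
$jpgs = $matches[2];

print_r($jpgs);
```

With this sample, the .png image is skipped and the i flag lets the upper-case .JPEG extension through.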
I believe you just need to require .jpg before the closing quote:
$regex = '/<img[^>]*'.'src=[\"|\'](.*\.jpg)[\"|\']/Ui';
You have to be careful to escape ' since you are using it as the PHP string delimiter. Also, matching only file names that end in .jpg or .jpeg would do it:
$regex = '/<img[^>]*src=["\']([^\'"]*)\.(jpg|jpeg)["\'][^>]*>/Ui';
Try:
$regex = '/<img[^>]*'.'src=[\"|\'](.*[.]jpg)[\"|\']/Ui';
You all forgot that tags may have whitespace between < and img, so a correct regexp should start with /<\s*img
First, get all img tags with an HTML parser. Then take those whose src attribute value is matched by the regex \.(jpeg|jpg)$.
For example, using this parser:
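// file_get_html() is provided by the PHP Simple HTML DOM Parser library
$html = file_get_html('http://example.foo.org/bar.html');
foreach ($html->find('img') as $img) {
    // preg_match() needs pattern delimiters; the i flag also accepts .JPG
    if (preg_match('/\.(jpeg|jpg)$/i', $img->src)) {
        // save $img or $img->src or whatever you need
    }
}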
Edit: I shortened the regular expression. You can also use \.jpe?g$.
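If adding a third-party parser is not an option, the same two-step idea can be sketched with PHP's bundled DOMDocument class. This swaps in a different parser from the one used above and assumes the DOM extension is enabled. The sample markup and the variable name $jpgSources are hypothetical.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<div><img src="one.jpg"><img alt="" src="two.jpeg"><img src="three.png"></div>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$jpgSources = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    // Keep only sources whose value ends in .jpg or .jpeg (case-insensitive).
    if (preg_match('/\.jpe?g$/i', $src)) {
        $jpgSources[] = $src;
    }
}

print_r($jpgSources);
```

Because the pattern is anchored with $, a source such as photo.jpg?v=2 would not match. If query strings are expected, one option is to test only the path returned by parse_url($src, PHP_URL_PATH).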