Extracting alt and/or title attributes from images using PHP

I use this to extract the src (the full path) of each image.

preg_match_all('/\< *img[^\>]*src *= *[\"\']{0,1}([^\"\'\ >]*)/',$content,$matches);

It works for me so far: I get an array of all image sources. Now I am trying to be greedy and also capture the alt and title values from the image tags.

I know it is not recommended to use regex to do it, but I really need a quick solution. I do not want it to return an error if alt or title is missing from the image tag.

Any input is appreciated, and apologies. I know it is easier and more appropriate with a parser, but since I could get the src with that preg_match_all call, I thought I could get the alt and title too! :)

Thanks a lot, happy new year :D

Here's a solution using PHP's DOM parser:

$domd = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them for malformed HTML.
libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents("http://stackoverflow.com"));
libxml_use_internal_errors(false);

$items = $domd->getElementsByTagName("img");
$data = array();

foreach($items as $item) {
  // A missing alt or title simply yields an empty string here; no error is raised.
  $data[] = array(
    "src" => $item->getAttribute("src"),
    "alt" => $item->getAttribute("alt"),
    "title" => $item->getAttribute("title"),
  );
}
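
Since the question asks that a missing alt or title not cause an error, here is a minimal sketch of the same DOM approach wrapped in a function that works on an HTML string. The function name `extract_image_attributes` and the choice of `null` for absent attributes are my own assumptions for illustration, not part of the answer above. It uses `hasAttribute()` so that a missing attribute can be told apart from an empty one.

```php
<?php
// Hypothetical helper (the name is illustrative): collect src, alt and title
// from every <img> in an HTML string. Missing attributes become null.
function extract_image_attributes(string $html): array
{
    $doc = new DOMDocument();

    // Suppress warnings about malformed real-world HTML while parsing,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $row = array();
        foreach (array('src', 'alt', 'title') as $name) {
            // Check hasAttribute() first to distinguish "absent" from "empty".
            $row[$name] = $img->hasAttribute($name) ? $img->getAttribute($name) : null;
        }
        $images[] = $row;
    }
    return $images;
}

$html = '<p><img src="a.png" alt="First"><img src="b.png" title="Second"></p>';
print_r(extract_image_attributes($html));
```

If an empty string is good enough for "missing", the plain `getAttribute()` calls from the answer are all that is needed.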

Use phpQuery; it does this easily.

http://code.google.com/p/phpquery/ (the good link)

Try this; it is the best I could come up with in 3 minutes...

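// Initialize first so print_r() below does not hit an undefined variable when nothing matches.
$img_array = array();

if(preg_match_all('@<img(\s?(src|alt|title)="([^"]+)"\s?)?(\s?(src|alt|title)="([^"]+)"\s?)?(\s?(src|alt|title)="([^"]+)"\s?)?\/?>@si',$content,$m)){
    // Only the first matched <img> is read ([0]). The tag may contain nothing but
    // double-quoted src/alt/title; a missing attribute leaves an empty '' => '' entry.
    $img_array = array(
        $m[2][0]=>$m[3][0],
        $m[5][0]=>$m[6][0],
        $m[8][0]=>$m[9][0]
    );
}

print_r($img_array);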
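
If a regex really is required, one way to stay independent of attribute order and to tolerate missing attributes is to match each `<img ...>` tag first and then search inside that tag for each attribute separately. The following is a rough sketch under that assumption. The function name is hypothetical, it only handles quoted values, and it can still be fooled by unusual markup (for example a `>` inside an attribute value), so treat it as a quick fix rather than a replacement for a parser.

```php
<?php
// Hypothetical helper: two-pass regex extraction, not a full HTML parser.
function extract_image_attributes_regex(string $content): array
{
    $images = array();

    // Pass 1: grab every <img ...> tag as a whole.
    preg_match_all('/<img\b[^>]*>/i', $content, $tags);

    foreach ($tags[0] as $tag) {
        // Attributes that are not found keep the value null.
        $row = array('src' => null, 'alt' => null, 'title' => null);
        foreach (array_keys($row) as $name) {
            // Pass 2: look for name="value" or name='value' inside this tag.
            // The leading \s keeps "src" from matching inside e.g. "data-src".
            $pattern = '/\s' . $name . '\s*=\s*(["\'])(.*?)\1/is';
            if (preg_match($pattern, $tag, $m)) {
                $row[$name] = $m[2];
            }
        }
        $images[] = $row;
    }
    return $images;
}

$content = '<img alt="Logo" src="logo.png"> <img src=\'x.gif\' title="X">';
print_r(extract_image_attributes_regex($content));
```

Each image gets the same three keys regardless of which attributes were present, so the calling code does not need to check whether a key exists.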