Get all images from any URL in PHP? [closed]

I have a text input for a URL on my website. When this field is posted, I want to fetch all available images (if any) from that URL, the way http://facebook.com does in the status update textarea. What would the PHP code for this look like?

Thanks.

Facebook relies on the Open Graph protocol. Many sites that you link on Facebook will not render an image, because they have not configured any og tags (such as og:image). Without those tags, a very large amount of code is required to get any significant results out of crawled images.
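For pages that do declare Open Graph tags, reading og:image is a short job. Here is a minimal sketch, assuming the page HTML is already in a string. extract_og_images() is a hypothetical helper name of my own, not part of PHP or of any Facebook API.

```php
<?php
// Hypothetical helper: collect the og:image URLs declared in an HTML string.
function extract_og_images(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return array();
    }

    $document = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = array();
    foreach ($document->getElementsByTagName('meta') as $meta) {
        // Open Graph tags look like <meta property="og:image" content="...">
        if (strtolower($meta->getAttribute('property')) !== 'og:image') {
            continue;
        }

        $content = trim($meta->getAttribute('content'));
        if ($content !== '') {
            // Keyed by URL so duplicates collapse.
            $images[$content] = $content;
        }
    }

    return array_values($images);
}
```

If this returns an empty array, the page has no og:image tags and you are back to crawling the img tags yourself.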

Many images just aren't meant to be used that way, such as spacer images and tracking images. When you pull all image tags from a site, you will get a number of these images, which are mostly just dead space.
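One rough way to weed some of these out is to look at the width and height attributes. The sketch below is only a heuristic under my own assumptions: is_probably_spacer() and the 2 pixel threshold are hypothetical, and images that carry no size attributes cannot be judged this way.

```php
<?php
// Hypothetical heuristic: treat an <img> as a spacer or tracking pixel
// when its width or height attribute is a plain integer below $min_size.
function is_probably_spacer(DOMElement $img, int $min_size = 2): bool
{
    foreach (array('width', 'height') as $attribute) {
        $value = trim($img->getAttribute($attribute));

        // Only plain integers are considered. A missing attribute or a
        // value such as "100%" tells us nothing, so it is ignored.
        if (preg_match('/^\d+$/', $value) === 1 && (int) $value < $min_size) {
            return true;
        }
    }

    return false;
}
```

A 1x1 tracking pixel is caught by this check, while an image without width or height attributes passes through, so it reduces the noise rather than removing it.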

As always, there are multiple ways to approach this problem. They all begin with obtaining the source of the url. cURL is my preferred method to achieve this.
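As a rough sketch of the cURL step, assuming the cURL extension is installed: fetch_html() is a hypothetical helper name, and the timeout, redirect limit and user agent string are arbitrary values I picked for illustration.

```php
<?php
// Hypothetical helper: fetch the HTML source of a URL, or null on failure.
function fetch_html(string $url, int $timeout = 10): ?string
{
    // The URL comes from user input, so only allow http and https.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if ($scheme !== 'http' && $scheme !== 'https') {
        return null;
    }

    $handle = curl_init($url);
    if ($handle === false) {
        return null;
    }

    curl_setopt_array($handle, array(
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_USERAGENT      => 'ImageFetcher/1.0', // arbitrary example value
    ));

    $body   = curl_exec($handle);
    $status = (int) curl_getinfo($handle, CURLINFO_HTTP_CODE);

    // Treat transport errors and HTTP error statuses as failure.
    if ($body === false || $status >= 400) {
        return null;
    }

    return $body;
}
```

You would then call something like $response = fetch_html($_POST['url']); and stop if the result is null.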

From there you need to parse the information in the source to find the source of the images. This can be done with regular expressions (regex) or, my preferred method, with the DOMDocument class in PHP.

A brief example of how to obtain the source url from the image tags using the DOMDocument class is as follows:

// Load your HTML result into $response prior to here.
// Additionally, ensure that you have the root url for the
//     page loaded into $base_url.
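$document = new DOMDocument();

// Real-world HTML is rarely valid. Collect the parser warnings
//     internally instead of printing them.
libxml_use_internal_errors(true);
$document->loadHTML($response);
libxml_clear_errors();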

$images = array();

// For all found img tags
foreach ($document->getElementsByTagName('img') as $img) {
    $src = trim($img->getAttribute('src'));

    // Skip images without src. Do this before resolving the url, so an
    //     empty src is never turned into the base url.
    if ($src === '') {
        continue;
    }

    // Extract what we want
    $image = array(
        // Here we take the src attribute and run it through a function
        //     to ensure that it is not a relative url.
        // The make_absolute() function will not be covered in this snippet.
        'src' => make_absolute($src, $base_url),
    );

    // Add to collection. Use src as key to prevent duplicates.
    $images[$image['src']] = $image;
}
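The snippet above leaves make_absolute() undefined. The following is one possible minimal sketch, not a full implementation: it assumes $base_url is an absolute http or https url, and it does not collapse ./ or ../ segments or handle query-only references, which a complete resolver following RFC 3986 would.

```php
<?php
// Simplified sketch of make_absolute(): resolve $url against $base_url.
function make_absolute(string $url, string $base_url): string
{
    $url = trim($url);
    if ($url === '') {
        return '';
    }

    // Already absolute: starts with a scheme such as http: or data:
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $url)) {
        return $url;
    }

    $base = parse_url($base_url);
    if (empty($base['scheme']) || empty($base['host'])) {
        // Without a usable base there is nothing to resolve against.
        return $url;
    }

    // Protocol-relative url: //cdn.example.com/a.png
    if (substr($url, 0, 2) === '//') {
        return $base['scheme'] . ':' . $url;
    }

    $root = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    // Root-relative url: /img/a.png
    if ($url[0] === '/') {
        return $root . $url;
    }

    // Otherwise resolve relative to the directory of the base path.
    $path = isset($base['path']) ? $base['path'] : '/';
    if ($path === '' || $path[0] !== '/') {
        $path = '/';
    }
    $directory = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $directory . $url;
}
```

For example, with a base of http://example.com/blog/post.html, a src of a.png resolves to http://example.com/blog/a.png and a src of /img/a.png resolves to http://example.com/img/a.png.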