PHP: Get all CSS files of an HTML page

I'm trying to get all the CSS files referenced by an HTML page, given its URL.

I know that getting the HTML source is easy: just use the PHP function file_get_contents.

My question is whether I can easily search the HTML at a given URL and extract the files, or the contents, of all the CSS files it references.

Note: I want to build an engine that collects a lot of CSS files, which is why reading the page source by hand is not enough.

Thanks,

You could try using http://simplehtmldom.sourceforge.net/ for HTML parsing.

require_once 'SimpleHtmlDom/simple_html_dom.php';

// Include the scheme; without it the string is treated as a local file path
$url = 'http://www.website-to-scan.com';
$website = file_get_html($url);

// You might need to tweak the selector based on the website you are scanning
// Example: some websites don't set the rel attribute
// others might use less instead of css
//
// Some other options:
// link[href] - Any link with an href attribute (might get favicons and other resources but should catch all the css files)
// link[href*=".css"] - href contains ".css"; might miss files that lack a .css extension but return valid css (e.g.: .less, .php, etc)
// link[type="text/css"] - Might miss stylesheets without this attribute set
foreach ($website->find('link[rel="stylesheet"]') as $stylesheet)
{
    $stylesheet_url = $stylesheet->href;

    // Do something with the URL
}
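
If adding a third-party library is not an option, roughly the same extraction can be sketched with PHP's bundled DOM extension instead of Simple HTML DOM. The helper name extract_stylesheet_urls and the sample markup are made up for illustration. This is a minimal sketch that assumes the HTML is already in a string (for example from file_get_contents) and that the dom extension is enabled; it does not resolve relative URLs.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the href of every
// <link> element whose rel attribute contains the token "stylesheet".
function extract_stylesheet_urls(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        // rel is a space-separated token list, e.g. "alternate stylesheet"
        $rel  = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $href = trim($link->getAttribute('href'));

        if (in_array('stylesheet', $rel, true) && $href !== '') {
            $urls[] = $href;
        }
    }

    return $urls;
}

// Usage with an inline sample document (made-up markup).
$html = '<html><head>'
    . '<link rel="stylesheet" href="/css/main.css">'
    . '<link rel="icon" href="/favicon.ico">'
    . '<link rel="alternate stylesheet" href="theme.php?skin=dark">'
    . '</head><body></body></html>';

print_r(extract_stylesheet_urls($html));
```

Each returned href could then be resolved against the page URL and passed to file_get_contents to fetch the stylesheet's content.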

You need to parse the HTML and look for the link tags that point to CSS files. One way to do it is with a regular expression, using preg_match_all (preg_match only returns the first match).

A regex that finds such tags might look like this:

<link[^>]+href="[^"]+\.css[^"]*"[^>]*>
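
As a rough illustration of the regex approach, the sketch below applies a variant of such a pattern with preg_match_all. The helper name find_css_links and the sample markup are made up for illustration, and the pattern assumes quoted href values that end in ".css", optionally followed by a query string. It will miss stylesheets served from other extensions and can be fooled by unusual markup, so treat it as a starting point rather than a robust parser.

```php
<?php
// Hypothetical helper: pulls href values ending in ".css" (optionally followed
// by a query string) out of <link> tags using a regular expression.
function find_css_links(string $html): array
{
    // [^>]* keeps the match inside one tag; the capture group holds the href value.
    $pattern = '/<link\b[^>]*\bhref\s*=\s*["\']([^"\']+\.css(?:\?[^"\']*)?)["\'][^>]*>/i';
    preg_match_all($pattern, $html, $matches);

    return $matches[1];
}

// Usage with made-up sample markup.
$html = '<link rel="stylesheet" href="css/site.css?v=3">'
      . "<link rel='stylesheet' href='/print.css'>"
      . '<link rel="icon" href="favicon.ico">';

print_r(find_css_links($html));
```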