Cowardly PHP script quits when it encounters an error

I have a cURL function that spiders all the webpages specified in an array called $to_be_spidered. The function is executed like so:

$to_be_spidered = array('http://google.com', 'http://mysterysite.com', 'http://yahoo.com');

for ($i = 0; $i != count($to_be_spidered); $i++) {

        $ch = curl_init();
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($ch, CURLOPT_URL,$target_url);
        curl_setopt($ch, CURLOPT_FAILONERROR, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 0); // set cURL timeout
        $html= curl_exec($ch);

        // error handling
        if (!$html) {
                echo "<br />cURL error number:" .curl_errno($ch);
                echo "<br />cURL error:" . curl_error($ch);
                exit;
        }

// etc. etc...

    }

Now the problem is, if a webpage returns an error like a 404, the script is killed. For example, if mysterysite.com is not found, the script does not attempt to spider yahoo.com. It gives up on that link and every link after it.

I would like it to stop trying to spider the failing link and move on to the next link in the queue. I tried changing "exit" to "continue" but no luck. It still stops. Am I doing something wrong, or is this specific to using cURL?

You should change exit to continue as indicated.

Are you receiving any errors? Is error reporting enabled? A fatal error will halt execution.

Put this at the top of your script

ini_set('display_errors', 'On');
error_reporting(E_ALL);

Also, where are you using the URL from $to_be_spidered? In the code shown, $target_url is never assigned. On a related note, your loop would look much nicer using foreach:

foreach ($to_be_spidered as $target_url) {
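Putting the two suggestions together (foreach plus continue), the loop might look roughly like the sketch below. The function name fetch_all(), the shape of the returned array, the default user agent string, and the 10-second timeout are all assumptions made for illustration; the cURL options themselves are the ones from the question.

```php
<?php
// Sketch only: fetch_all() and its return format are hypothetical, not from the original post.
function fetch_all(array $urls, string $userAgent = 'ExampleSpider/0.1'): array
{
    $results = [];

    foreach ($urls as $target_url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $target_url);
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($ch, CURLOPT_FAILONERROR, true);    // HTTP codes >= 400 count as failures
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed value; 0 would mean "never time out"
        $html = curl_exec($ch);

        if ($html === false) {
            // Record the failure, then skip to the next URL instead of exiting.
            $results[$target_url] = [
                'ok'    => false,
                'errno' => curl_errno($ch),
                'error' => curl_error($ch),
            ];
            continue;
        }

        $results[$target_url] = ['ok' => true, 'html' => $html];
    }

    return $results;
}
```

Calling fetch_all($to_be_spidered) would then return one entry per URL, so a failure on mysterysite.com no longer prevents yahoo.com from being fetched.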

exit() terminates the current script... so, don't use it if that's not the behavior you're looking for.

if (!$html) {
    echo "<br />cURL error number:" .curl_errno($ch);
    echo "<br />cURL error:" . curl_error($ch);
} else {
    // etc. etc...
}

The two previous suggestions will work. However, I noticed another potential bug in the code.

From http://php.net/manual/en/function.curl-exec.php

"if the CURLOPT_RETURNTRANSFER option is set, it will return the result on success, FALSE on failure."

So if curl_exec returns data that equals the empty string or zero (or anything else determined as FALSE in http://php.net/manual/en/language.types.boolean.php), this script will incorrectly read it as an error.

So you'll need to make sure you check the type. The following should work:

if( $html===FALSE ) {
    // Report error
} else {
    // deal with content
}
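To see why the strict comparison matters, here is a small sketch that runs both checks against a few possible return values. The helper name is_curl_failure() is made up for this example; the sample values simply stand in for what curl_exec might return.

```php
<?php
// Hypothetical helper: true only when curl_exec() actually reported a failure.
function is_curl_failure($result): bool
{
    return $result === false;
}

// Sample values standing in for curl_exec() results.
$samples = ['', '0', '<html></html>', false];

foreach ($samples as $value) {
    printf(
        "%-16s loose (!\$html): %-6s strict (=== false): %s\n",
        var_export($value, true),
        var_export(!$value, true),
        var_export(is_curl_failure($value), true)
    );
}
```

The loose check flags the empty string and the string '0' as errors, while the strict check flags only the real FALSE.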

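I also recommend wrapping each cURL request in a try/catch block. Keep in mind that curl_exec reports a failed transfer by returning FALSE rather than by throwing an exception, so the check above is still needed.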
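One way to make a try/catch useful here is a small wrapper that throws when curl_exec returns FALSE. This is only a sketch: fetch_or_throw() and spider_with_exceptions() are hypothetical names, and the 5-second connect timeout is an assumed value.

```php
<?php
// Hypothetical wrapper: converts a failed cURL transfer into an exception.
function fetch_or_throw(string $url): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value
    $html = curl_exec($ch);

    if ($html === false) {
        throw new RuntimeException(curl_error($ch), curl_errno($ch));
    }

    return $html;
}

// Hypothetical driver: returns the URLs that failed, keyed by URL, with the error message.
function spider_with_exceptions(array $urls): array
{
    $failed = [];

    foreach ($urls as $target_url) {
        try {
            $html = fetch_or_throw($target_url);
            // ... process $html here ...
        } catch (RuntimeException $e) {
            // A failure on one URL is recorded and the loop carries on.
            $failed[$target_url] = $e->getMessage();
        }
    }

    return $failed;
}
```

Because the exception is caught inside the loop body, one bad link does not stop the remaining links from being tried.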