When fetching multiple URL pages in a PHP loop, the script stops partway through. What should I do?

What should I do when a PHP scraper stops midway? And if I want to resume next time from where it stopped, how do I do that?

set_time_limit(0); Adding this lets the loop run to completion.

Save the address of each page you visit to a database or a file. On the next run, use the saved value as the starting point of the loop.
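A minimal sketch of that idea, assuming the checkpoint lives in a plain text file. The names loadCheckpoint, saveCheckpoint, clearCheckpoint, crawl and the $fetchNext callback are hypothetical and not part of the original script. $fetchNext stands in for the download-and-parse step: it receives the current chapter id, returns the next one (or null when there is none), and is assumed to throw an exception on failure.

```php
<?php
// Return the saved chapter id, or $default when no checkpoint exists yet.
function loadCheckpoint($path, $default)
{
    if (!is_file($path)) {
        return $default;
    }
    $saved = trim((string) file_get_contents($path));
    return $saved !== '' ? $saved : $default;
}

// Record the chapter to fetch next. LOCK_EX guards against two runs writing at once.
function saveCheckpoint($path, $next)
{
    return file_put_contents($path, $next, LOCK_EX) !== false;
}

// Remove the checkpoint once the whole crawl has finished.
function clearCheckpoint($path)
{
    if (is_file($path)) {
        unlink($path);
    }
}

// Run the loop, starting from the checkpoint if one exists.
// Returns the list of chapter ids processed during this run.
function crawl($path, $start, $fetchNext)
{
    $current = loadCheckpoint($path, $start);
    $done = array();
    while ($current !== null && $current !== '') {
        // If this throws, the checkpoint still names $current,
        // so the next run retries the same chapter.
        $next = call_user_func($fetchNext, $current);
        $done[] = $current;
        if ($next === null || $next === '') {
            clearCheckpoint($path); // finished: the next run starts from $start again
            break;
        }
        saveCheckpoint($path, $next);
        $current = $next;
    }
    return $done;
}
```

In the script below, the body of the while loop would become the $fetchNext callback, and "GEN.1" would be passed as $start. The checkpoint is written only after a chapter has been fully processed, so an interrupted run repeats at most one chapter.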

#!/usr/bin/php
<?php
// coding: utf8 (kept inside the PHP tag so it is not echoed as output)
set_time_limit(0); 
error_reporting(E_ALL^E_NOTICE);
$nextUrl = "GEN.1";
while(!empty($nextUrl)){
    $userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.2) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.27 Safari/525.13';
    $ch= curl_init();
    curl_setopt($ch, CURLOPT_URL,"https://wdbible.com/api/bible/chapterhtml/cunps/{$nextUrl}");
    curl_setopt($ch, CURLOPT_HEADER,0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_USERAGENT,$userAgent);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
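    $data = curl_exec($ch);
    if(empty($data)){
        // First attempt failed: retry once on the same handle.
        echo "Request for chapter {$nextUrl} failed, retrying...\r\n";
        $data = curl_exec($ch);
    }
    curl_close($ch);
    if(empty($data)){
        // Stop explicitly rather than writing stale or empty content below.
        die("Request for chapter {$nextUrl} failed twice, stopping.\r\n");
    }
    // Decode only after a response has actually been received.
    $data = json_decode($data,true);
    $content = $data['data']['content'];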
    $file1 = "D:/cmd/xml/{$nextUrl}.xml";
    file_put_contents($file1,$content);
    echo "{$nextUrl}.xml written successfully.\r\n";
    $file = "D:/cmd/txt/{$nextUrl}.txt";
    $stack = array();
    $top = -1;
    $xmlParser = xml_parser_create();
    xml_set_element_handler($xmlParser,"Start","Stop");     
    xml_set_character_data_handler($xmlParser,"char");  
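    $fp = fopen($file1,"r");
    while($row = fread($fp,10000)){
        // xml_error_string() takes only the error code; the line number is appended separately.
        xml_parse($xmlParser,$row) or
            die(xml_error_string(xml_get_error_code($xmlParser))
                ." at line ".xml_get_current_line_number($xmlParser));
    }
    fclose($fp);
    xml_parser_free($xmlParser);
    echo "Chapter {$nextUrl} fetched successfully.\r\n";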
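    $nextUrl = isset($data['data']['nextChapterUsfm']) ? $data['data']['nextChapterUsfm'] : null;
    if(!empty($nextUrl)){
        echo "Reading the next chapter...\r\n";
    }else{
        // Re-reading the same array element cannot yield a different value, so the loop ends here.
        echo "No next chapter path in the response, stopping.\r\n";
    }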
}
echo "Crawl finished.\r\n";
function Start($parser, $element_name, $element_attr){
    global $top,$stack;
    if($element_name == "DIV" && count($element_attr) == 1){
        $top++;
        array_push($stack,$element_name);
        $top++;
        array_push($stack,$element_attr);
    }else{
        $top++;
        array_push($stack,$element_name);
    }
}
function Stop($parser, $element_name){  
    global $top,$stack,$file;
    switch($element_name){
        case "H6" :
            file_put_contents($file,"\r\n",FILE_APPEND);
            array_pop($stack);
            $top--;
            array_pop($stack);
            $top--;
            break;
        case "H5" :
            file_put_contents($file,"\r\n",FILE_APPEND);
            array_pop($stack);
            $top--;
            array_pop($stack);
            $top--;
            break;
        case "MARK" :
            array_pop($stack);
            $top--;
            break;
        case "SPAN" :
            array_pop($stack);
            $top--;
            break; 
        case "LI" : // element names arrive upper-cased because case folding is on by default
            file_put_contents($file,"\r\n",FILE_APPEND);
            array_pop($stack);
            $top--;
            break;
        case "DIV" :
            if($stack[$top] == "DIV"){
                array_pop($stack);
                $top--;
            }else{
                file_put_contents($file,"\r\n",FILE_APPEND);
                array_pop($stack);
                $top--;
                array_pop($stack);
                $top--;
            }
            break;
        case "P" :
            array_pop($stack);
            $top--;
    }
}
function char($parser, $data1){
    global $top,$stack,$file;
    if (strlen(trim($data1)) > 0){                  
        file_put_contents($file,$data1,FILE_APPEND);
    }   
}
?>




After running for a while, the next URL always becomes unreachable.
[Image]

A warning like this appears.
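For the symptom described above, where a request starts failing after many iterations, one option is to retry each request a few times with a growing delay and to set explicit cURL timeouts. The following is only a sketch under those assumptions: fetchWithRetry and curlFetch are hypothetical helper names, and the attempt count and timeout values are arbitrary. It does not address the warning itself, which is not reproduced here.

```php
<?php
// Call $fetch($url) up to $maxAttempts times, sleeping longer after each failure.
// $fetch is any callable that returns the response body, or false on failure.
function fetchWithRetry($fetch, $url, $maxAttempts = 3, $baseDelaySeconds = 1)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = call_user_func($fetch, $url);
        if ($body !== false && $body !== '') {
            return $body;
        }
        if ($attempt < $maxAttempts) {
            // Exponential backoff: 1x, 2x, 4x ... the base delay.
            sleep($baseDelaySeconds * (1 << ($attempt - 1)));
        }
    }
    return false;
}

// One possible $fetch built on cURL. The timeouts make a stalled
// connection fail instead of hanging the loop indefinitely.
function curlFetch($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $body = curl_exec($ch);
    if ($body === false) {
        // Report the reason for the failure, which the original loop discards.
        fwrite(STDERR, "cURL error for {$url}: " . curl_error($ch) . "\n");
    }
    curl_close($ch);
    return $body;
}
```

Inside the loop this could be called as fetchWithRetry('curlFetch', $url), stopping the crawl when it returns false so that the failing chapter can be retried on a later run.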