A long-running console command slows down

Initial text

I created a console command in a Laravel project. The command takes HTML data from one table and checks two pattern matches for each record with preg_match. If the match succeeds, it updates the record in another table that has the same attribute as the record from the first table currently in focus in the foreach loop. The number of records is approximately 3,500.

After approximately 150 iterations, the command slows down dramatically, and it takes a whole day for the command to finish.

I have read all the similar issues on this forum, but they didn't help me, not even the answer about forcing garbage collection.

The code is like the following:

$ras = RecordsA::all();
$pattern = '/===this is the pattern===/';
foreach ($ras as $ra) {
    $html = $ra->html;
    $rb = RecordB::where("url", $ra->url)->first();
    $rb->phone = preg_match($pattern, $html, $matches) ? $matches[1] : $rb->phone;
    $rb->save();
}

I searched for a possible preg_match performance issue, but without success.

Has anybody run into such a problem?


Update for MMMTroy

I forgot to say that I also tried custom code similar to yours:

$counter = DB::select("select count(*) as count from records_a")[0]; // DB::select() returns an array, so take the first row
// Pattern for Wiktor Stribiżew :)
// The quantifier belongs inside the group; otherwise $matches[1] holds only the last character
$pattern = '/Telefon:([^<]+)</';
for ($i = 0; $i < $counter->count; $i += 150) {
    $ras = RecordsA::limit(150)->offset($i)->get(); // get() was missing, so no rows were actually fetched
    foreach ($ras as $ra) {
        $html = $ra->html;
        $rb = RecordB::where("url", $ra->url)->first();
        $rb->phone = preg_match($pattern, $html, $matches) ? $matches[1] : $rb->phone;
        $rb->save();
    }
}
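A side note on that pattern (a standalone sketch using made-up sample HTML): when the `+` quantifier sits outside the capturing group, `$matches[1]` holds only the last repeated character, not the whole phone value.

```php
<?php
// Demonstrates why the quantifier should sit inside the capturing group.
// The sample HTML below is invented for illustration only.
$html = '<p>Telefon: 0123 456789</p>';

preg_match('/Telefon:([^<])+</', $html, $m1); // group is repeated; it keeps only its last match
preg_match('/Telefon:([^<]+)</', $html, $m2); // group wraps the whole repetition

echo $m1[1], "\n"; // "9"
echo $m2[1], "\n"; // " 0123 456789"
```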

"Pagination via OFFSET" is O(N*N). You would be better off with O(N), so "remember where you left off".
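"Remember where you left off" is keyset pagination. A minimal, framework-free sketch follows; the page-fetching callback stands in for a query such as `RecordsA::where('id', '>', $lastId)->orderBy('id')->limit($size)->get()`, assuming an auto-incrementing `id` column (an assumption, not stated in the question).

```php
<?php
// Keyset pagination sketch: instead of an ever-growing OFFSET, each page
// starts after the last id seen, so every page is a cheap index range scan.
function processInKeysetPages(callable $fetchPage, callable $process, int $size = 150): void
{
    $lastId = 0;
    do {
        // $fetchPage must return rows with id > $lastId, ordered by id.
        $rows = $fetchPage($lastId, $size);
        foreach ($rows as $row) {
            $process($row);
            $lastId = $row['id']; // remember where we left off
        }
    } while (count($rows) === $size); // a short (or empty) page means we are done
}
```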


There is a good chance you are running out of memory. Laravel has a handy method to "chunk" results, which dramatically reduces memory usage by limiting the number of items you loop over at once. Try something like this:

$pattern = '/===this is the pattern===/';
RecordsA::chunk(100, function ($ras) use ($pattern) {
    foreach ($ras as $ra) {
        $html = $ra->html;
        $rb = RecordB::where("url", $ra->url)->first();
        $rb->phone = preg_match($pattern, $html, $matches) ? $matches[1] : $rb->phone;
        $rb->save();
    }
});

What this does is grab 100 records at a time and loop through them. Once a chunk is done, it advances the offset and grabs the next batch from the database. This prevents the entire result set from being held in memory at once.
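One caveat worth adding (my note, not part of the original answer): chunk() pages via OFFSET internally, so it still carries the O(N*N) cost mentioned above. Laravel also provides chunkById(), which pages by primary key instead, assuming the table has an incrementing `id`:

```php
$pattern = '/===this is the pattern===/';
// chunkById() orders by the primary key and resumes after the last seen id,
// avoiding the growing OFFSET cost of chunk().
RecordsA::chunkById(100, function ($ras) use ($pattern) {
    foreach ($ras as $ra) {
        // ... same loop body as above ...
    }
});
```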

Does your database grow while you loop through it? What happens if RecordB is not found and the query returns null? It feels to me like your RecordB table is growing, causing the lookup query to slow down.
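To make the null case concrete, a hedged sketch of a guard for the loop body (skipping the record and logging the url are illustrative choices, not behavior from the question):

```php
$rb = RecordB::where("url", $ra->url)->first();
if ($rb === null) {
    // Without this guard, accessing $rb->phone on a missing record
    // throws an error and aborts the command.
    Log::warning("No RecordB found for url {$ra->url}");
    continue;
}
```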

I recently had similar problems and hit memory limits. There is one thing that is the number-one cause of slowdowns and memory leaks: the query log, DB::$queryLog (disable it with DB::disableQueryLog();). Every time a query is run, its query string is stored in that variable.

Perhaps one of those things is causing it; otherwise the code looks fine to me.