Faster query than foreach

I have a simple task: update IDs from one table into another. However, my current solution, which runs through both PHP and MySQL, is very slow. The Sector table has about 9,000 records and the sources table has about 17 million. $sector->code is a varchar.

So far I got this:

    foreach (Sector::orderBy('id', 'DESC')->get() as $sector) {
        DB::select(DB::raw("UPDATE `sources` SET `sector_id` = $sector->id WHERE `sector` = '$sector->code'"));
    }

This is taking about a minute for each sector, so running them all would take roughly a week, which is not an option.

Firstly, the tables should have indexes on them. The 17-million-row sources table needs an index on sector. That way, for each sector, only the subset of records matching that sector will be checked, rather than the whole table.
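
For example, a minimal sketch of adding the indexes from Laravel; the sectors table name and the index names are assumptions based on the model in the question:

    use Illuminate\Support\Facades\DB;

    // One-off statements, e.g. from a migration or a tinker session.
    // Table and index names are assumptions, adjust to your schema.
    DB::statement('ALTER TABLE `sources` ADD INDEX `idx_sources_sector` (`sector`)');
    DB::statement('ALTER TABLE `sectors` ADD INDEX `idx_sectors_code` (`code`)');

Building the index over 17 million rows takes a while, but it is a one-time cost.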

Secondly, this should be achievable in a single query.

    UPDATE `sources`, `sector`
       SET `sources`.`sector_id` = `sector`.`id`
     WHERE `sources`.`sector` = `sector`.`code`;

This avoids multiple passes over the tables, and should also speed up the update.
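
If you want to run this from Laravel rather than the MySQL console, a minimal sketch (table names as in the query above):

    use Illuminate\Support\Facades\DB;

    // Runs the multi-table UPDATE once; returns the number of affected rows.
    $affected = DB::update(
        'UPDATE `sources`, `sector`
            SET `sources`.`sector_id` = `sector`.`id`
          WHERE `sources`.`sector` = `sector`.`code`'
    );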

References: MySQL UPDATE syntax - how to do a multi-table query in MySQL.

I think there are two solutions to this problem.

Solution 1

The most practical and best-fit solution for your problem is to create a cron job and run the update periodically (preferably at night). It will solve your headache.

Here is a link on managing cron jobs: https://code.tutsplus.com/tutorials/managing-cron-jobs-with-php--net-19428.
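
Since the project is on Laravel, the scheduler gives the same effect with a single crontab entry. A sketch, assuming a hypothetical artisan command sectors:sync that performs the actual update:

    use Illuminate\Console\Scheduling\Schedule;

    // In app/Console/Kernel.php. `sectors:sync` is a hypothetical
    // artisan command that would contain the UPDATE logic.
    protected function schedule(Schedule $schedule)
    {
        $schedule->command('sectors:sync')->dailyAt('02:00');
    }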

Solution 2

Solution 2 is to optimize your current code. There is a coding-standards rule that says: never write queries inside a loop.

Step 1: Try updating the table with a single query, like this:

    UPDATE sources
       SET sector_id = (CASE id WHEN 1 THEN 'XXX'
                                WHEN 2 THEN 'XXXX'
                                WHEN 3 THEN 'XXXX'
                        END)
     WHERE id IN (1, 2, 3);
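
Hand-writing 9,000 WHEN branches is not realistic, so here is a hedged sketch of generating the same kind of statement from the Sector table, keyed on the code column instead of id (model and column names taken from the question):

    use Illuminate\Support\Facades\DB;

    // Sketch: build one CASE-based UPDATE covering every sector.
    // Codes are quoted through PDO before being placed in raw SQL.
    $pdo   = DB::connection()->getPdo();
    $when  = '';
    $codes = [];
    foreach (Sector::all() as $sector) {
        $quoted  = $pdo->quote($sector->code);
        $when   .= " WHEN {$quoted} THEN {$sector->id}";
        $codes[] = $quoted;
    }
    DB::update(
        "UPDATE `sources`
            SET `sector_id` = (CASE `sector`{$when} END)
          WHERE `sector` IN (" . implode(',', $codes) . ')'
    );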

There are many reasons why UPDATEing 17M rows is slow. One of them is that InnoDB will build the information to ROLLBACK the change if a crash occurs. This is especially costly for such a big table.

I would consider "chunking" the table into 1K or 10K rows at a time, scanning through based on (preferably) the PRIMARY KEY.
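
A minimal sketch of such chunking, assuming sources.id is an auto-increment PRIMARY KEY (table and column names from the question):

    use Illuminate\Support\Facades\DB;

    // Walk the table in PRIMARY KEY ranges so each UPDATE only
    // touches (and locks) a bounded slice of the 17M rows.
    $chunk = 10000;
    $maxId = DB::table('sources')->max('id');
    for ($start = 0; $start < $maxId; $start += $chunk) {
        DB::update(
            'UPDATE `sources`, `sector`
                SET `sources`.`sector_id` = `sector`.`id`
              WHERE `sources`.`sector` = `sector`.`code`
                AND `sources`.`id` > ? AND `sources`.`id` <= ?',
            [$start, $start + $chunk]
        );
    }

This keeps each transaction (and its rollback information) small, and gives other queries a chance to run between chunks.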

Keep in mind that the table will be somewhat blocked from other usage during such a big update. Will you be stopping all other activity during the update (however it is implemented)? If not, have you thought through the ramifications of a partially updated table, and/or the code changes that may be needed to see the sector in a different way (id vs code)?