Improving the performance of parsing a large CSV file in php-cli

Good evening. I have a 400 MB CSV file that I have to load into a MySQL database. The CSV file is "irregular" because its lines come in mixed layouts, such as:

user|email|password

user|password|otherdata

I have written a php-cli script that reads the file line by line and extracts the information I need: the username and the password. The username is always the first field. To identify the password, I check the length of the string (the password hash is always 40 characters).

I started the script five hours ago and it still has not finished loading all the data into the database.

What can I do to improve the performance of this script?

<?php

$fileHandle = fopen("C:/Users/AT/Documents/Backup/forumusers.csv", "r");

$mysqlHandle = mysql_connect("localhost", "root", "");
mysql_select_db("testbackupboard");

while(!feof($fileHandle))
{
    $fileRow = fgets($fileHandle);
    $line2Record = explode("|", $fileRow);
    $datatoMysql = array(); // reset so a line with no 40-char field doesn't reuse the previous hash
    foreach ($line2Record as $rowRecord)
    {
        if (strlen($rowRecord) == 40)
        {
            $datatoMysql[0] = $rowRecord; // password hash (always 40 characters)
        }
    }
    $datatoMysql[1] = $line2Record[0]; // username

    $execQuery = mysql_query("INSERT INTO forumusers (username, hash) VALUES ('".mysql_real_escape_string(utf8_encode($datatoMysql[1]))."', '".mysql_real_escape_string(utf8_encode($datatoMysql[0]))."')");
    if ($execQuery)
    {
        print "Record ".$datatoMysql[1]." ".$datatoMysql[0]." loaded into db\n";
    }
    else
    {
        die(mysql_error());
    }
}
fclose($fileHandle);

?>

How about fgetcsv()?

$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, "|")) !== FALSE) {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        for ($c=0; $c < $num; $c++) {
            echo $data[$c] . "<br />\n";
        }
    }
    fclose($handle);
}
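fgetcsv() makes the parsing cleaner, but it will not rescue a five-hour run on its own: the real bottleneck is almost certainly the one-INSERT-round-trip-per-line pattern. Batching many rows into a single multi-row INSERT usually helps far more. A rough sketch along those lines (the table and column names are taken from the question; the file path, batch size, and escaping helper are placeholders you would adapt to your setup):

```php
// Sketch, not a drop-in: buffer rows and flush them as one multi-row
// INSERT every $batchSize lines instead of one query per line.

function buildBatchInsert(array $rows, callable $escape)
{
    $values = array();
    foreach ($rows as $row) {
        // $row = array(username, hash)
        $values[] = "('" . $escape($row[0]) . "', '" . $escape($row[1]) . "')";
    }
    return "INSERT INTO forumusers (username, hash) VALUES " . implode(", ", $values);
}

$path      = "forumusers.csv"; // adjust to your file
$batchSize = 1000;             // tune: bigger batches = fewer round trips
$escape    = 'addslashes';     // placeholder; use your driver's escaper
                               // (e.g. mysql_real_escape_string) in production

if (file_exists($path)) {
    $handle = fopen($path, "r");
    $batch  = array();
    while (($data = fgetcsv($handle, 1000, "|")) !== FALSE) {
        $hash = "";
        foreach ($data as $field) {
            if (strlen($field) == 40) { // the 40-char field is the password hash
                $hash = $field;
            }
        }
        $batch[] = array($data[0], $hash);
        if (count($batch) >= $batchSize) {
            mysql_query(buildBatchInsert($batch, $escape)) or die(mysql_error());
            $batch = array();
        }
    }
    if ($batch) { // flush the final partial batch
        mysql_query(buildBatchInsert($batch, $escape)) or die(mysql_error());
    }
    fclose($handle);
}
```

Dropping the per-row `print` (or printing only every few thousand rows) also helps; console output for millions of lines is not free.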
  1. CREATE TABLE tmp with 4 columns: id AUTO_INCREMENT, user, col2, col3. This is not the final table.
  2. LOAD DATA INFILE to bulk-load all the rows in one statement (far faster than per-row INSERTs)
  3. CREATE TABLE final (user, email, password, otherdata, PRIMARY KEY(user))
  4. Then do
INSERT INTO final (user, email, password, otherdata)
    SELECT a.user, a.col2 AS email, a.col3 AS password,
           b.col3 AS otherdata
        FROM tmp AS a
        JOIN tmp AS b ON b.id = a.id+1
        WHERE (a.id % 2) = 1;

(Caveat: This will not work if auto_increment_increment is not 1, such as on a Galera-based cluster. Instead, you would need an @variable to do the sequencing.)
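For the @variable approach hinted at in the caveat, one sketch (untested, and assuming the same `tmp` layout as above) is to assign your own gap-free sequence before the self-join:

```sql
-- Assign dense ids ourselves, independent of auto_increment_increment,
-- then run the same pairing join against tmp2 instead of tmp.
SET @seq := 0;
CREATE TABLE tmp2 AS
    SELECT (@seq := @seq + 1) AS id, t.user, t.col2, t.col3
    FROM tmp AS t
    ORDER BY t.id;
```

(Evaluation order of user variables combined with ORDER BY is version-sensitive in MySQL, so verify the generated ids are consecutive before running the join.)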