Fastest way to process a CSV: bash vs PHP vs C/C++ processing speed [closed]

I have a CSV with 5M rows. One option is to import them into a MySQL database and then loop over the table with PHP:

$db_class = new MysqlDb;
$db_class->ConnectDB();
$query="SELECT * FROM mails WHERE .....";
$result=mysqli_query(MysqlDb::$db, $query);
while($arr=mysqli_fetch_array($result))
{
    //db row here 
}
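
Before dropping MySQL entirely, one thing worth noting: instead of looping over rows in PHP, a single set-based DELETE can do the whole job in one statement. A rough sketch, assuming the column is called email and that the bad-string rules can be expressed as LIKE patterns (both are placeholders, not my real schema or rules):

// one statement instead of a row-by-row PHP loop; 'email' and the
// LIKE patterns are placeholders for the real column and rules
$query = "DELETE FROM mails WHERE email LIKE '%@baddomain.com' OR email NOT LIKE '%@%.%'";
mysqli_query(MysqlDb::$db, $query);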

So I loop over all the mails from the table and process them; if a row contains some bad string, I delete it, and so on.

This works, but importing 5M rows is very slow, and looping over them one by one to edit the rows (deleting those that contain a bad string) is also very slow.

I am thinking of a better solution that skips PHP/MySQL entirely: process the .csv file line by line and check whether the current row contains a specific bad string. I can do that in pure PHP, like:

$file = fopen('file.csv', 'r');
while (($data = fgetcsv($file)) !== FALSE) {
    // process line; $data is an array of the row's fields
    $data[0]; // e.g. the first column
}
fclose($file);

This is the bash script I use to loop over all the lines of a file:

while IFS= read -r line; do
    # keep only the lines that do not contain the bad string
    case "$line" in *badstring*) ;; *) printf '%s\n' "$line" ;; esac
done < bac.csv > clean.csv

While in Python I do:

with open("file.csv", "r") as ins:
    array = []
    for line in ins:
        # process line here
        array.append(line)

A bad line would look like:

name@baddomain.com
name@domain (without extension)

etc. I have a few criteria for what a bad line is, which is why I didn't post the full list here.
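
To give an idea, a check covering just those two examples might look like this in PHP (illustrative only: is_bad_line is a made-up helper and the real rule set has more criteria):

// illustrative helper covering only the two examples above:
// a blacklisted domain, and a domain without an extension
function is_bad_line($line) {
    if (strpos($line, '@baddomain.com') !== false) {
        return true;
    }
    // no dot after the @ means the domain has no extension
    if (preg_match('/@[^.\s]+$/', trim($line))) {
        return true;
    }
    return false;
}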

However, for very big files I need to find a better solution. What do you recommend? Should I learn how to do this in C/C++, or do it in bash? I already know a little bash, so I could get something working faster there. Is C/C++ much faster than bash for this situation, or should I stick with bash?

Thank you

As for the PHP solution, you are looking for fgetcsv. The manual includes an example of iterating over a CSV file.

Or, if you want to be fancy, you can go with the league/csv library.
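
A minimal sketch of the fgetcsv approach, streaming the file and writing only the rows that pass your checks to a new file (is_bad_line stands in for whatever your criteria are):

$in  = fopen('file.csv', 'r');
$out = fopen('clean.csv', 'w');
while (($row = fgetcsv($in)) !== false) {
    // keep the row only if it passes the bad-string checks
    if (!is_bad_line($row[0])) {
        fputcsv($out, $row);
    }
}
fclose($in);
fclose($out);

Since fgetcsv reads one row at a time, memory use stays flat no matter how big the file is.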