So basically I have a bunch of 1 Gig data files (compressed) with just text files containing JSON data with timestamps and other stuff.
I will be using PHP code to insert this data into MYSQL database.
I will not be able to store these text files in memory! Therefor I have to process each data-file line by line. To do this I am using stream_get_line().
Question Would it be faster to use Insert / Select / Update statements, or create a CSV file and import it that way?
Create a file thats a bulk operation and then execute it from sql?
I need to basically insert data with a primary key that doesnt exist, and update fields on data if the primary key does exist. But I will be doing this in LARGE Quantities.
Performance is always and issue.
Update The table has 22,000 Columns, and only say 10-20 of them do not contain 0.
I would load all of the data to a temporary table and let mysql do the heavy lifting.
create the temporary table by doing create table temp_table as select * from live_table where 1=0;
Read the file and create a data product that is compatible for loading with load data infile
.
Load the data into the temporary table and add an index for your primary key
Next Isolate you updates by doing a inner query between the live table and the temporary table. walk through and do your updates.
remove all of your updates from the temporary (again using an inner join between it and the live table).
process all of the inserts with a simple insert into live_table as select * from temp_table.
drop the temporary table, go home and have a frosty beverage.
This may be over simplified for your use case but with a little tweaking it should work a treat.