I'm working with the Royal Mail PAF database in CSV format (approx. 29 million lines), and need to split the data into files of approx. 1000 lines each.
I found this solution for writing the files, but I don't know how to (a) open the file, and (b) tell the script to delete the lines from the original file after copying them.
Can anyone advise?
I don't know the Royal Mail PAF database, but you open files with fopen(), read a line with fgets() and delete files with unlink().
The solution you found shows the idea of splitting every 1000 lines, but in your case there is no need to call any CSV function at all. It's just a simple "copy every 1000 lines into a new file".
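$bigFile = fopen("paf.csv", "r");
$j = 0;
// Read one line ahead: feof() only becomes true after a read has hit the end,
// so testing feof() alone can leave an empty last file.
$line = fgets($bigFile);
while ($line !== false) {
    $smallFile = fopen("small$j.csv", "w");
    $j++;
    // Copy up to 1000 lines into the current small file.
    for ($i = 0; $i < 1000 && $line !== false; $i++) {
        fwrite($smallFile, $line);
        $line = fgets($bigFile);
    }
    fclose($smallFile);
}
fclose($bigFile);
// Delete the original only after every line has been copied.
unlink("paf.csv");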
Does it need to be in PHP? If you're on a Unix/Linux system, you can use the split command.
split --lines=1000 mybigfile.csv
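With roughly 29,000 output files, it may help to give split numbered suffixes of a fixed width and to keep the .csv extension. The following is only a sketch, assuming GNU coreutils split (for --numeric-suffixes and --additional-suffix); the file name paf.csv, the small_ prefix and the two function names are placeholders, not anything from the question. The second function addresses the "delete the original" part by removing the input only if the chunks hold the same number of lines.

```shell
# Sketch only. Assumes GNU coreutils split; split_csv and remove_if_complete
# are hypothetical helper names, and the prefix is a plain name without a directory.

split_csv() {
    input=$1
    prefix=$2
    # --lines=1000        : 1000 lines per output file
    # --numeric-suffixes  : small_00000.csv, small_00001.csv, ... instead of aa, ab, ...
    # --suffix-length=5   : room for 100000 files
    # --additional-suffix : keep the .csv extension on every chunk
    split --lines=1000 --numeric-suffixes --suffix-length=5 \
          --additional-suffix=.csv "$input" "$prefix"
}

# Delete the original only if the chunks in the current directory
# contain exactly as many lines as it does.
remove_if_complete() {
    input=$1
    prefix=$2
    original=$(wc -l < "$input")
    copied=$(find . -maxdepth 1 -name "${prefix}*.csv" -exec cat {} + | wc -l)
    [ "$original" -eq "$copied" ] && rm -- "$input"
}
```

Usage would look like `split_csv paf.csv small_` followed by `remove_if_complete paf.csv small_`, run from the directory that should receive the chunks.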