I have multiple txt files within a directory. The text files all contain the same header. I am reading in all the txt files and writing their contents to one file.
Since each individual file contains the same header, every copy of it ends up in the new merged file. How can I remove all the headers in the merged file and leave just one at the top?
I have been looking at the sort command in Unix:
sort filename | uniq
This command works, but it also removes all other duplicated data. Is there any way to remove just the specific string "This is a header" but leave one at the top?
Current Code
$header = array( "XX-XXXXXXXXX-XXXXXXX-X XXXXXXXXXXXX" );
$files = glob( "/path/to/folder/*.txt" );
$output_file = "newfile_".date( "YmdHis" ).".txt";
$out = fopen( $output_file, "w" );
foreach( $header as $inputHeader ) {
    fwrite( $out, $inputHeader );
}
foreach( $files as $file ) {
    $in = fopen( $file, "r" );
    while ( $line = fgets( $in ) ) {
        if( $header !== $line ) {
            fwrite( $out, $line );
        }
    }
    fclose( $in );
}
fclose( $out );
Try writing the header at the start of the output file, then check for it as you read each line and skip any line that matches:
//cache our header line
$header = "Header line";
$files = glob( "/path/to/files*.txt" );
//print_r($files);
$output_file = "newfile".date( "YmdHis" ).".txt";
$out = fopen( $output_file, "w" );
//write the header line (plus a line break) at the top of our new file
fwrite( $out, $header.PHP_EOL );
foreach( $files as $file ) {
    $in = fopen( $file, "r" );
    while ( ( $line = fgets( $in ) ) !== false ) {
        //header check: compare without the trailing line break,
        //and don't output header lines to the new file
        if( $header !== rtrim( $line ) ) {
            fwrite( $out, $line );
        }
    }
    fclose( $in );
}
fclose( $out );
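The same idea can be wrapped in a function so it is easier to reuse and test. This is only a sketch: `mergeWithSingleHeader` is a made-up name, and it assumes the header has no trailing whitespace and that normalising line endings to `PHP_EOL` is acceptable.

```php
<?php
// Hypothetical helper (not from the original post): merges text files,
// writing $header once and dropping every later line that matches it.
function mergeWithSingleHeader(array $files, string $header, string $outputFile): void
{
    $out = fopen($outputFile, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $outputFile for writing");
    }

    // Write the header exactly once, followed by a line break.
    fwrite($out, $header . PHP_EOL);

    foreach ($files as $file) {
        $in = fopen($file, 'r');
        if ($in === false) {
            continue; // skip files that cannot be opened
        }
        while (($line = fgets($in)) !== false) {
            // Ignore trailing whitespace and line breaks when comparing.
            if (rtrim($line) === $header) {
                continue;
            }
            // Re-add the line break so the last line of one file
            // does not run into the first line of the next.
            fwrite($out, rtrim($line, "\r\n") . PHP_EOL);
        }
        fclose($in);
    }

    fclose($out);
}
```

It could then be called with something like `mergeWithSingleHeader(glob("/path/to/files*.txt"), "Header line", "newfile".date("YmdHis").".txt");`.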
After you create your new file, add this line; it will remove the duplicated lines:
$lines = array_unique(file("your_file.txt"));
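One thing to be aware of: `array_unique()` drops every repeated line, not only the header, so it behaves much like `sort | uniq` except that the original order is kept. A small illustration with made-up sample lines:

```php
<?php
// Made-up sample of what file() might return for a merged file.
$lines = [
    "This is a header\n",
    "row A\n",
    "row B\n",
    "This is a header\n", // repeated header from the second file
    "row B\n",            // legitimate duplicate data row
    "row C\n",
];

// array_unique() keeps the first occurrence of each value;
// array_values() re-indexes the result from 0.
$unique = array_values(array_unique($lines));

// The second "row B" is gone along with the second header.
echo implode('', $unique);
```

So this only fits when no data row is ever expected to repeat.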
If the files contain only one header:
$header_exist = false;
foreach($files as $file) {
    $in = fopen($file, "r");
    while($line = fgets($in)) {
        if(strpos($line, "This is a header") === false) {
            fwrite($out, $line);
        }
        else {
            if($header_exist === false) {
                $header_exist = true;
                fwrite($out, $line);
            }
        }
    }
    fclose($in);
}
So I managed to fix the issue with help from @WillParky93. I had 4 different headers in the files, with duplicates of all of them, and got it working after playing around with the logical operators.
Final Code
//the headers that were in the file with duplicates
$header1 = "DD-LLDRHD045-UHSTAYL-MR LOCKFMDLA111";
$header2 = "DD-LLDRHD045-UHSTAYL-MR LOCKFMDLA222";
$header3 = "DD-LLDRHD045-UHSTAYL-MR LOCKFMDLA333";
$header4 = "DD-LLDRHD045-UHSTAYL-MR LOCKFMDLA444";
//get all the files to be merged
$files = glob( "/PATH/TO/FILES/*.txt" );
//set the output filename
$output_file = "NewFile".date( "YmdHis" ).".txt";
//open the output file
$out = fopen( $output_file, "w" );
//loop through the files to be merged
foreach( $files as $file ) {
    //open each file
    $in = fopen( $file, "r" );
    //while each line in each file
    while ( $line = fgets( $in ) ) {
        //if the current line is not equal to header1, header2, header3 or header4
        //(all whitespace is stripped from both sides before comparing)
        if( preg_replace('/\s+/', '', $line ) !=
            preg_replace('/\s+/', '', $header1 ) &&
            preg_replace('/\s+/', '', $line ) !=
            preg_replace('/\s+/', '', $header2 ) &&
            preg_replace('/\s+/', '', $line ) !=
            preg_replace('/\s+/', '', $header3 ) &&
            preg_replace('/\s+/', '', $line ) !=
            preg_replace('/\s+/', '', $header4 ) ) {
            //write that line to the output file
            fwrite( $out, $line );
            //echo $line."\n";
        }else{
            //write blank line to the file
            fwrite( $out, "\n" );
        }
    }
    //close the file
    fclose( $in );
}
//close the output file
fclose( $out );
//get the contents of the output file
$header1 .= file_get_contents( $output_file );
//add the header to the top of the output file
file_put_contents( $output_file, $header1 );
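For anyone with a similar multi-header case, here is a possible variation on the code above, not a drop-in replacement. It holds the headers in an array, skips header lines entirely instead of writing blank lines, and writes the first header up front rather than rewriting the file afterwards. The names `normalizeLine` and `mergeDroppingHeaders` are invented for this sketch, and it assumes only the first header should appear at the top.

```php
<?php
// Hypothetical helper: strips all whitespace, mirroring the
// preg_replace('/\s+/', '', ...) comparison used above.
function normalizeLine(string $text): string
{
    return (string) preg_replace('/\s+/', '', $text);
}

// Hypothetical helper: merges $files into $outputFile, dropping any line
// that matches one of $headers and writing $headers[0] once at the top.
function mergeDroppingHeaders(array $files, array $headers, string $outputFile): void
{
    if ($headers === []) {
        throw new InvalidArgumentException('At least one header is required');
    }
    $normalizedHeaders = array_map('normalizeLine', array_values($headers));

    $out = fopen($outputFile, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $outputFile for writing");
    }

    // Assumption: only the first header is kept at the top of the file.
    fwrite($out, array_values($headers)[0] . PHP_EOL);

    foreach ($files as $file) {
        $in = fopen($file, 'r');
        if ($in === false) {
            continue; // skip files that cannot be opened
        }
        while (($line = fgets($in)) !== false) {
            // Strict in_array(): both sides are already normalized strings.
            if (in_array(normalizeLine($line), $normalizedHeaders, true)) {
                continue; // header line: skip it rather than write a blank line
            }
            fwrite($out, rtrim($line, "\r\n") . PHP_EOL);
        }
        fclose($in);
    }

    fclose($out);
}
```

Adding a fifth header then only means adding one more array element instead of another pair of `preg_replace` calls.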