I'm trying to read a list of items from a CSV file, compare it with the items in my database, and generate a new CSV with the ones that are not in my database. Out of a thousand rows in the CSV, only 26 were reported as not in the db. However, the first item in my new CSV is present in my database, meaning it's a false positive. Only the first item is wrong; all the others are fine (I've queried them all).
Here is my code:
<?php
function generate_diff_csv() {
    $conn = new mysqli("localhost","rcpp","*********", "items");
    $key_ref = fopen("INV14.csv", "r");
    $not_in = fopen("not_in.csv","w");
    [...]
    fclose($key_ref);
    $keys = array();
    foreach ($custom1 as $custom) {
        $trimmed_custom = trim($custom);
        $result = $conn->query("SELECT custom1 FROM products WHERE custom1 = '{$trimmed_custom}'");
        if ($result->num_rows == 0) {
            $keys[] = array("key" => $trimmed_custom);
            echo "added to csv...\n";
        }
    }
    foreach ($keys as $key) {
        fputcsv($not_in, $key);
    }
    fclose($not_in);
    $conn->close();
}
generate_diff_csv();
To be sure I had everything right, I created a temporary table with the data I needed to compare. When I query it with SQL, I get 25 results. Putting the two result sets (PHP vs. SQL) side by side in a file, only the first PHP entry has no match, meaning it really is the only wrong result.
SELECT ref FROM refs WHERE ref NOT IN (SELECT custom1 FROM products);
Why is that? Why does PHP return the first key for my query?
The PHP script is being executed from the command line, PHP 5.4.12 (Windows). I haven't tested on the Linux production environment, but I don't believe this is a platform-specific issue.
Thank you in advance.
I've solved the problem. It was a platform problem after all, but not one related to PHP (I think).
I was running the script through cmd, not PowerShell, and the first item had a UTF-8 BOM (byte order mark) in front of its first character. My mistake was not paying attention to the first line of output, thinking it was just cmd printing some rubbish characters. But when I used var_dump($custom1), I could see that those characters were actually inside the variable and that trim wasn't removing them.
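To illustrate why trim did not help, here is a minimal sketch. The value in $first is a made-up sample, not a row from my file; it only assumes the file was saved as UTF-8 with a BOM, so the first field starts with the bytes EF BB BF.

```php
<?php
// Hypothetical sample: first CSV field as read from a file saved with a UTF-8 BOM.
$first = "\xEF\xBB\xBF" . "ABC123 ";

$trimmed = trim($first);

// By default trim() only strips whitespace and NUL bytes,
// so the trailing space goes away but the three BOM bytes stay.
echo bin2hex($trimmed), "\n"; // efbbbf414243313233
echo strlen($trimmed), "\n";  // 9, not 6
```

With those extra bytes in front, the value no longer equals the one stored in the database, so the query returns no rows for it.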
The solution was to remove the BOM from the CSV input file.
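If re-saving the file is not an option, the BOM could probably also be stripped in the script itself. This is only a sketch: strip_utf8_bom is a hypothetical helper name, not part of my script or of PHP, and it assumes the file is UTF-8.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 BOM (bytes EF BB BF), if present.
function strip_utf8_bom($text) {
    $bom = "\xEF\xBB\xBF";
    if (strncmp($text, $bom, 3) === 0) {
        // Cast because substr() returns false on older PHP versions
        // when nothing is left after the BOM.
        return (string) substr($text, 3);
    }
    return $text;
}

// Made-up sample value standing in for the first field of the CSV.
$clean = trim(strip_utf8_bom("\xEF\xBB\xBF" . "ABC123 "));
echo $clean, "\n"; // ABC123
```

It only needs to be applied to the first value read from the file, before trim, since a BOM appears once at the very start.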
Link reference (for the BOM character issue): https://superuser.com/questions/601282/is-not-recognized-as-an-internal-or-external-command --- first answer explains why 'cmd' outputs those characters.
Well, first of all, depending on the collation of the database, table, and column, SQL might be ignoring letter case, so for SQL, "something" is identical to "SoMEthIng". Also, in my experience with CHAR() columns, trailing spaces are ignored in comparisons as well, at least when it comes to CHAR().