I'm reporting on appointment activity and have included a function to export the raw data behind the KPIs. This raw data is stored as a CSV and I need to check for potentially duplicate consultations that have been entered.
Each row of data is assigned a unique visit ID based on the patients ID and the appointment ID. The raw data contains 30 columns of data, the duplicate check only needs to be performed on 7 of these. I have imported the CSV and created an array as below for first record and then append rest on.
$mds = array(
$unique_visit_id => array(
$appt_date,
$dob,
$site,
$CCG,
$GP,
$appt_type,
$treatment_scheme
)
);
What I need is to scan the $mds
array and return an array containing just the $unique_visit_id
for any duplicate arrays.
e.g. keys 1111, 2222 and 5555 all references arrays that contain the same value for all seven values, then I would need 2222 and 5555 returned.
I've tried search but not coming up with anything that is working.
Thanks
This is what I've gone with, still validating (data set is very big) but seems to be functioning as expected so far
$handle = fopen("../reports/mds_full_export.csv", "r");
$visits = array();
while($data = fgetcsv($handle,0,',','"') !== FALSE){
$key = $data['unique_visit_id'];
$value = $data['$appt_date'].$data['$dob'].$data['$site'].$data['$CCG'].$data['$GP'].$data['$appt_type'].$data['$treatment_scheme'];
$visits[$key] = $value;
}
$visits = asort($visits);
$previous = "";
$dupes = array();
foreach($visits as $id => $visit){
if(strcmp($previous, $visit) == 0){
$dupes[] = $id;
}
$previous = $visit;
}
return $dupes;