I need to take two arrays and come up with a percentage of similarity, e.g.:
array( 0=>'1' , 1=>'2' , 2=>'6' , 3=>array(0=>1))
versus
array( 0=>'1' , 1=>'45' , 2=>'6' , 3=>array(0=>1))
where I would expect the similarity to be 75%,
or
array( 0=>'1' , 1=>'2' , 2=>'6' , 3=>array(0=>'1'))
versus
array( 0=>'1' , 1=>'2' , 2=>'6' , 3=>array(0=>'55'))
I'm not sure how to approach this; I just need to end up with a workable float percentage. Thank you.
Assuming both arrays are the same length, you can iterate through and see which values are the same for the keys, for example:
<?php
$a = array(1,2,3,4);
$b = array(1,2,4,4);
$c = 0;
foreach ($a as $k=>$v) {
if ($v == $b[$k]) $c++;
}
echo ($c/count($a))*100;
// outputs 75
?>
Or just check whether they contain the same items, regardless of position, using in_array (which compares loosely unless its third argument is true).
<?php
$a = array(1,2,3);
$b = array(1,2,4);
$c = 0;
foreach ($a as $i) {
if (in_array($i,$b)) $c++;
}
echo ($c/count($a))*100;
// outputs 66.66...
?>
Set a count to zero.
Iterate through the arrays, checking whether each pair of elements is equal. If it is, increment the count.
At the end, the similarity is the count divided by the total number of elements in the arrays.
This assumes the arrays are the same length and have the same keys - defining "similarity" is difficult otherwise.
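The steps above could be sketched roughly like this. The function name percent_same() is made up for illustration, and treating an empty array as 0% and a missing key as "no match" are assumptions, not something the question specifies.

```php
<?php
// Hypothetical helper; the name percent_same() is an assumption for illustration.
// Returns the percentage of elements in $a whose key exists in $b with an equal value.
function percent_same(array $a, array $b): float
{
    $total = count($a);
    if ($total === 0) {
        return 0.0; // assumption: empty input is 0% rather than a division by zero
    }

    $count = 0; // set a count to zero
    foreach ($a as $key => $value) {
        // A key missing from $b is treated as a mismatch (assumption).
        if (array_key_exists($key, $b) && $value == $b[$key]) {
            $count++;
        }
    }

    // Similarity is the count divided by the total number of elements.
    return $count / $total * 100;
}

echo percent_same(array(1, 2, 3, 4), array(1, 2, 4, 4)); // 75
```

Note that == is a loose comparison, so '1' and 1 count as equal; switch to === if that is not what you want.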
You could first count the total number of items. Then you need a function that tells you whether two sub-items are the same or not (bool).
Then you go through both arrays at once and count the matches. To get the percentage, divide the number of matches by the total count from earlier and multiply the result by 100.
You need to decide how you would like to deal with elements that exist in only one of the two arrays. Also, if you want to descend into elements that are themselves arrays, you could make the is_same($a, $b) function recursive, have it return a float value (0-1, not 0-100), and add that fraction to the count instead of 0 for FALSE or 1 for TRUE.
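A minimal sketch of that recursive idea, assuming a key that exists in only one array scores 0 and two empty arrays count as identical. The name similarity() is hypothetical (it plays the role of a float-returning is_same), and the comparison is loose on purpose so that '1' and 1 match, as in the question's first example.

```php
<?php
// Hypothetical helper; similarity() is a name chosen for this sketch.
// Returns a float from 0 to 1 describing how alike two values are.
function similarity($a, $b): float
{
    if (is_array($a) && is_array($b)) {
        // Union of the keys of both arrays.
        $keys = array_keys($a + $b);
        if (count($keys) === 0) {
            return 1.0; // assumption: two empty arrays are identical
        }
        $sum = 0.0;
        foreach ($keys as $k) {
            // A key present on only one side contributes 0 (assumption).
            if (array_key_exists($k, $a) && array_key_exists($k, $b)) {
                $sum += similarity($a[$k], $b[$k]); // fraction, not just 0 or 1
            }
        }
        return $sum / count($keys);
    }

    // An array compared with a non-array never matches.
    if (is_array($a) || is_array($b)) {
        return 0.0;
    }

    // Scalars: loose comparison, so '1' == 1 counts as equal.
    return $a == $b ? 1.0 : 0.0;
}

$x = array(0 => '1', 1 => '2',  2 => '6', 3 => array(0 => 1));
$y = array(0 => '1', 1 => '45', 2 => '6', 3 => array(0 => 1));
echo similarity($x, $y) * 100; // 75
```

With this weighting a nested array counts as one element of its parent, so a partly matching sub-array contributes a fraction between 0 and 1 rather than all or nothing.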
count($array) would give you the total number of elements in the array. Then you can compare the numbers in the arrays, keep a counter for all the ones that are the same, and compute (number of matching elements / count($array)) * 100. That should give the percentage.
Here's an algorithm for it.
int count = 0;
for (int i = 0; i < arraySize; i++)
{
    if (array1[i] == array2[i])
    {
        count++;
    }
}
// Cast before dividing: integer division would truncate the ratio to 0.
float percent = ((float) count / arraySize) * 100;
Here's how I tackled this problem recently:
$array1 = array('item1','item2','item3','item4','item5');
$array2 = array('item1','item4','item6','item7','item8','item9','item10');
// returns array containing only items that appear in both arrays
$matches = array_intersect($array1,$array2);
// calculate 'similarity' of array 2 to array 1
// if you want to calculate the inverse, the 'similarity' of array 1
// to array 2, replace $array1 with $array2 below
$a = count($matches);
$b = count($array1);
$similarity = $a/$b*100;
echo 'SIMILARITY: ' . $similarity . '%';
// i.e., SIMILARITY: 40%
// (2 of 5 items in array1 have matches in array2 = 40%)
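If you need both directions mentioned in the comments above, one way is to wrap the calculation in a small function. This is only a sketch: directional_similarity() is a made-up name, and returning 0 for an empty array is an assumption.

```php
<?php
// Hypothetical helper; directional_similarity() is a name chosen for this sketch.
// Percentage of items in $from that also appear somewhere in $to.
function directional_similarity(array $from, array $to): float
{
    if (count($from) === 0) {
        return 0.0; // assumption: an empty array is treated as 0% similar
    }
    // array_intersect keeps the values of $from that are present in $to.
    return count(array_intersect($from, $to)) / count($from) * 100;
}

$array1 = array('item1', 'item2', 'item3', 'item4', 'item5');
$array2 = array('item1', 'item4', 'item6', 'item7', 'item8', 'item9', 'item10');

echo directional_similarity($array1, $array2), "\n"; // 40 (2 of 5)
echo directional_similarity($array2, $array1), "\n"; // roughly 28.57 (2 of 7)
```

The two results differ because each one is divided by the size of a different array, so pick the direction that matches what "similar" means for your data.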