Intro:
I need a function that can take an array and return a hash of it.
This should be similar to spl_object_hash(), except that it returns a hash for a given array.
So far I've tried:
function array_hash(array $array) {
    return spl_object_hash((object) $array);
}
The problems
1) This approach isn't effective by itself. For example, what if I pass something like this:
$array = array(
    'foo' => 'bar',
    'bool' => false,
    'junk' => array(
        'junk1' => array('foo' => array('__test__'))
    )
);
It won't cast nested arrays to objects.
2) Another major problem is that spl_object_hash() returns a different hash for the same object on each new HTTP request.
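Both problems come from the same root: spl_object_hash() identifies an object instance, not its contents. A minimal sketch to illustrate this (the variable names are only for illustration): two objects built from the same array compare as equal but get different hashes while both are alive. The PHP manual also notes that a hash may be reused once its object is destroyed, so a temporary object created inside array_hash() gives no guarantee either way.

```php
<?php
$array = ['foo' => 'bar'];

// Two objects with identical contents, both still alive.
$first = (object) $array;
$second = (object) $array;

// Loose comparison checks properties and values.
var_dump($first == $second); // bool(true)

// spl_object_hash() identifies the instance, so the hashes differ.
var_dump(spl_object_hash($first) === spl_object_hash($second)); // bool(false)
```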
The question
Again: I need a persistent hash for an array. Unlike the one from spl_object_hash(), it must stay the same on each HTTP request. How can I do this correctly?
How about serializing the array first?
md5(serialize($array));
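A short sketch of how this behaves, using a sample array similar to the one in the question (the variable names are only for illustration). The hash depends only on the array's contents, so it is the same on every request, and nested arrays are included. Note that key order matters: the same pairs in a different order serialize differently.

```php
<?php
$a = ['foo' => 'bar', 'bool' => false, 'junk' => ['junk1' => ['foo' => ['__test__']]]];
$b = ['foo' => 'bar', 'bool' => false, 'junk' => ['junk1' => ['foo' => ['__test__']]]];

// Same pairs as $a, but with the top-level keys in a different order.
$c = ['junk' => ['junk1' => ['foo' => ['__test__']]], 'bool' => false, 'foo' => 'bar'];

$hashA = md5(serialize($a));
$hashB = md5(serialize($b));
$hashC = md5(serialize($c));

var_dump($hashA === $hashB); // bool(true): identical contents, identical hash
var_dump($hashA === $hashC); // bool(false): key order changes the serialized string
```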
The above answer by Martin works fine, but here is what I do:
function array_signature($arr, $sort=true)
{
    // Sorting helps generate the same fingerprint for arrays that differ only in element order
    if ($sort) {
        array_multisort($arr);
    }
    // MD5 seems to be the fastest hashing function -- we don't care about collisions for this
    // JSON seems faster than serialize()
    return md5(json_encode($arr));
}
Calling array_multisort() first ensures that, for associative arrays, the same signature is returned for:
['a'=>1, 'b'=>2]
['b'=>2, 'a'=>1]
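One caveat: array_multisort() sorts by value, only at the top level, and re-indexes numeric keys, so nested arrays with reordered keys still produce different signatures, and plain lists in a different order produce the same one. If that matters, a possible variation is to sort keys recursively instead. This is a minimal sketch, not the approach above: ksort_recursive() and array_signature_recursive() are hypothetical names, and it assumes each level uses either all string keys or all integer keys.

```php
<?php
// Hypothetical helper: sort keys at every nesting level, in place.
function ksort_recursive(array &$arr): void
{
    ksort($arr);
    foreach ($arr as &$value) {
        if (is_array($value)) {
            ksort_recursive($value);
        }
    }
    unset($value); // break the reference left over from foreach
}

// Hypothetical variant of array_signature() that ignores key order at any depth.
function array_signature_recursive(array $arr): string
{
    ksort_recursive($arr);
    return md5(json_encode($arr));
}

$x = ['b' => 2, 'a' => ['d' => 4, 'c' => 3]];
$y = ['a' => ['c' => 3, 'd' => 4], 'b' => 2];

var_dump(array_signature_recursive($x) === array_signature_recursive($y)); // bool(true)
```

Because ksort() leaves list elements at their integer positions, [1, 2] and [2, 1] still get different signatures with this variant.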
Benchmarks seem to agree: md5() is one of the fastest hashing functions in PHP. It has known security weaknesses, so it should not be used for hashing passwords, but for an array signature this should not be an issue.
Similarly, json_encode() seems to be the fastest encoding function in PHP (faster than serialize()).
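Results like these depend on the PHP version and on the shape of the data, so it is worth measuring on your own setup. A rough micro-benchmark sketch follows; the sample data, the iteration count, and the time_it() helper are arbitrary assumptions, and the timings will vary from machine to machine.

```php
<?php
// Arbitrary sample data: 1000 small nested arrays.
$data = [];
for ($i = 0; $i < 1000; $i++) {
    $data["key$i"] = ['id' => $i, 'name' => "item $i", 'tags' => ['a', 'b', 'c']];
}

// Hypothetical helper: run $fn repeatedly and return the elapsed seconds.
function time_it(callable $fn, int $iterations = 200): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$jsonTime = time_it(function () use ($data) {
    return md5(json_encode($data));
});
$serializeTime = time_it(function () use ($data) {
    return md5(serialize($data));
});

printf("md5(json_encode()): %.4f s\n", $jsonTime);
printf("md5(serialize()):   %.4f s\n", $serializeTime);
```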