I'm dealing with some poorly formatted PHP code (random indentation, inconsistent line breaks, spacing, etc.). I sometimes take the time to clean it up, sometimes out of necessity for readability.
I'm only changing whitespace (indentation, line breaks, spacing), not fixing or optimizing logic and not changing syntax. But I sometimes worry that I've changed more than I intended, and often there isn't a good way to test.
Is there some way I can test that the new, reformatted file is functionally the same as the old one? I can't use hashes, because they would change on account of my whitespace changes.
I was thinking that, since PHP is compiled at some point, I could compare the two compiled versions. As I understand it, the compiler ignores whitespace.
I wrote a script to compile with PECL's bcompiler:
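<?php
// Usage: php <this script> <source.php> <output file>
// Compiles the source file to bcompiler bytecode and writes it to the output file.
$fh = fopen($argv[2], "w");
bcompiler_write_header($fh);
bcompiler_write_file($fh, $argv[1]);
bcompiler_write_footer($fh);
fclose($fh);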
and ran it on these two test files:
$> cat testa.php
<?php
function test($a) {
echo "Hello!";
}
test();
$> cat testb.php
<?php
function
test ( $a )
{
echo
"Hello!";
}
test();
but examining the diff (after passing through xxd) gave differences.
$> diff testa.php.binhex testb.php.binhex
19c19
< 0000120: 0000 0000 0000 0000 0300 0000 0000 0000 ................
---
> 0000120: 0000 0000 0000 0000 0400 0000 0000 0000 ................
25c25
< 0000180: 0004 0000 0000 0000 0028 0800 0000 0000 .........(......
---
> 0000180: 0008 0000 0000 0000 0028 0800 0000 0000 .........(......
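One alternative that might avoid the compiled-output problem is to compare the two files' token streams instead, using PHP's built-in tokenizer (`token_get_all`) and dropping the whitespace tokens along with the line numbers. This is only a minimal sketch: `significant_tokens()` is a hypothetical helper name, and trimming the open tag is my own assumption about what should count as a formatting-only difference.

```php
<?php
// Sketch only: compare two PHP sources by token stream, ignoring whitespace.
// significant_tokens() is a hypothetical helper name, not a built-in.
function significant_tokens($source)
{
    $result = [];
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as "(" or ";" are plain strings.
            $result[] = $token;
            continue;
        }
        list($id, $text) = $token; // the third element (line number) is ignored
        if ($id === T_WHITESPACE) {
            continue; // pure formatting, skip it
        }
        if ($id === T_OPEN_TAG) {
            // Assumption: whitespace right after the open tag is formatting only.
            $text = rtrim($text);
        }
        $result[] = [$id, $text];
    }
    return $result;
}

// Command-line use: php <this script> old.php new.php
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    $old = file_get_contents($argv[1]);
    $new = file_get_contents($argv[2]);
    if ($old === false || $new === false) {
        fwrite(STDERR, "Could not read one of the files\n");
        exit(2);
    }
    $same = significant_tokens($old) === significant_tokens($new);
    echo $same ? "Token streams match\n" : "Token streams differ\n";
    exit($same ? 0 : 1);
}
```

Whitespace inside string literals, heredocs, comments, and inline HTML stays part of the token text, so changes there would still be reported. A match is not a full guarantee of identical behavior: code that uses `__LINE__`, for example, can behave differently once lines move.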