I'm trying to parse a string containing the dump of a PHP file, to match all the occurrences of the PHP native function file(). I'm using preg_match_all with a REGEX that is not fully working as expected.
Because I only want to match calls to the file() function, I do not want to match results like $file(), $file or is_file().
This is the PHP code on which I'm trying to match all the occurrences of file():
<?php
$x = file('one.php');
file('two.php');
//
function foo($path)
{
return file($path);
}
function inFile()
{
return "should not be matched";
}file('next_to_brackets.php');
foo('three.php');
file('four.php'); // comment
$file = 'should not be matched';
$_file = 'inFile';
$_file();
file('five.php');
The REGEX I'm using is the following:
/[^A-Za-z0-9\$_]file\s*\(.*?(\n|$)/i
[^A-Za-z0-9\$_] Starts with any character except letters, numbers, underscores and dollar signs.
file Continues with the word "file".
\s* Matches any whitespace after the word "file".
\( After the whitespace there must be an opening parenthesis.
.*? Matches any sequence of characters (the arguments), lazily.
(\n|$) Stops matching once a newline or the end of the haystack is found.
/i Makes the match case-insensitive.
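To make the breakdown above easier to check, here is the same pattern sketched with the x (extended) modifier, which lets whitespace and # comments sit inside the regex. The variable name $pattern and the short test strings are only illustrative; the pattern itself is meant to be equivalent to the one above, not an improved version.

```php
<?php
// Same pattern as above, written in extended mode (x) so each part can be commented.
$pattern = '/
    [^A-Za-z0-9\$_]  # one character that is not a letter, digit, dollar sign or underscore
    file             # the literal word
    \s*              # optional whitespace
    \(               # an opening parenthesis
    .*?              # the arguments, matched lazily
    (\n|$)           # a newline or the end of the haystack
/ix';

// Illustrative checks on short, made-up inputs.
$plainCall  = preg_match($pattern, " file('a.php');\n");
$variable   = preg_match($pattern, "\$file('a.php');");
$otherFunc  = preg_match($pattern, "is_file('a.php');");
```

With these inputs, only the plain call should match; the $file and is_file strings are rejected because the character right before "file" is in the excluded class.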
With this PHP code for testing the result:
preg_match_all('/[^A-Za-z0-9\$_]file\s*\(.*?(\n|$)/i', $string, $matches);
print_r($matches[0]);
/*
//Prints:
Array
(
[0] => file('one.php');
[1] => file($path);
[2] => }file('next_to_brackets.php');
[3] =>
file('four.php'); // comment
[4] =>
file('five.php');
)
*/
For some reason, my REGEX is not returning the second occurrence, file('two.php');, even though it is a valid function call, not a variable. It is definitely caused by the fact that it's right below another occurrence of the match ($x = file('one.php');).
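A reduced test case seems to confirm this. The sketch below assumes the cause is that the first match consumes the trailing newline, so the next file( has no preceding character left for the leading character class to match. The two-line $string and the variable names are made up for illustration. The second pattern is one possible variant using a negative lookbehind and a lookahead, which check the neighbouring characters without consuming them; it is not meant as a complete solution, since it would still match things like method calls or text inside strings and comments.

```php
<?php
// Hypothetical two-line haystack reduced from the dump above.
$string = "\$x = file('one.php');\nfile('two.php');\n";

// Original pattern: the leading class consumes one character, and the
// trailing (\n|$) consumes the newline the next match would need.
$consuming = '/[^A-Za-z0-9\$_]file\s*\(.*?(\n|$)/i';

// Variant: the lookbehind and lookahead only inspect the surrounding
// characters, so adjacent lines no longer compete for the same newline.
$lookbehind = '/(?<![A-Za-z0-9\$_])file\s*\(.*?(?=\n|$)/i';

preg_match_all($consuming, $string, $a);
preg_match_all($lookbehind, $string, $b);

$consumingCount    = count($a[0]); // only the first call is found
$lookbehindMatches = $b[0];        // both calls are found
```

On this input the original pattern finds one match and the lookbehind variant finds both file('one.php'); and file('two.php');, without the extra leading character or trailing newline in the results.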
Any suggestions on how to match exact PHP functions in a string containing PHP code?
Thank you!