Regex to capture function parameters in PHP

I am trying to extract all function parameters with a regex, for templating. The parameter syntax is meant to be largely PHP-compatible.

Here is the sample text that needs to be parsed:

"test", 'test2', $test3, ?"%A %d %B %Y", "foo,bar,foobar"

It should be parsed into:

[
    '"test"',
    '\'test2\'',
    '$test3',
    '?"%A %d %B %Y"',
    '"foo,bar,foobar"'
]

I found the pattern below, but when there are commas (,) inside a double-quoted string, it splits on those too.

'~([^,]+\(.+?\))|([^,]+)~x'

The result of this pattern is:

[
    '"test"',
    ' \'test2\'',
    ' $test3',
    ' ?"%A %d %B %Y"',
    ' "foo,',
    'bar,',
    'foobar"'
]

I am not very good with regex patterns. I can do basic things with them, but I couldn't find a way to achieve this.

Your regex leaves several cases unhandled, and one of them is a double-quoted value that follows a space. Because the pattern does not allow for optional leading whitespace, the last alternative ([^,]+) is the one that matches, and it stops at the first comma it meets, including a comma inside quotes.
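To see the problem in isolation, here is a small sketch that runs the pattern from the question against the sample input and looks at the full matches. The variable names are only illustrative.

```php
<?php
// Pattern from the question: neither alternative knows about quoted strings.
$pattern = '~([^,]+\(.+?\))|([^,]+)~x';
$input   = '"test", \'test2\', $test3, ?"%A %d %B %Y", "foo,bar,foobar"';

preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full matches, one per piece.
$pieces = $matches[0];

// Seven pieces come back instead of five, because the last argument
// is cut at each comma inside its quotes.
echo count($pieces), "\n";
print_r(array_slice($pieces, -3));
```

The first alternative needs a literal parenthesis, which the sample never contains, so every piece comes from [^,]+.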

You can use the following regex:

\s*("[^"\\]*(?:\\.[^\\"]*)*"|'[^'\\]*(?:\\.[^\\']*)*'|[^,]+),?

See the regex demo, and grab Capture Group 1 values.

Here is an IDEONE demo:

$re = '~\s*("[^"\\\\]*(?:\\\\.[^\\\\"]*)*"|\'[^\'\\\\]*(?:\\\\.[^\\\\\']*)*\'|[^,]+),?~'; 
$str = "\"test\", 'test2', \$test3, ?\"%A %d %B %Y\", \"foo,bar,foobar\""; 
preg_match_all($re, $str, $tokens);
print_r($tokens[1]);

Pattern explanation:

  • \s* - zero or more whitespace characters
  • ("[^"\\]*(?:\\.[^\\"]*)*"|'[^'\\]*(?:\\.[^\\']*)*'|[^,]+) - Capture Group 1, matching one of 3 alternatives:
    • "[^"\\]*(?:\\.[^\\"]*)*" - double-quoted strings (supporting escape sequences)
    • | - or
    • '[^'\\]*(?:\\.[^\\']*)*' - single-quoted strings (supporting escape sequences)
    • | - or
    • [^,]+ - 1+ characters other than a comma
  • ,? - an optional comma (? - one or zero occurrences)
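The pieces above can be wrapped into a small helper. This is only a sketch: split_arguments and ARG_PATTERN are hypothetical names, and the trim step is an addition of mine, since the [^,]+ alternative can pick up trailing spaces. The pattern is stored in a nowdoc so it reads exactly as written above, with no PHP-level escaping.

```php
<?php
// The regex from above, unchanged. A nowdoc does no escape processing,
// so the backslashes reach PCRE exactly as typed.
const ARG_PATTERN = <<<'RE'
~\s*("[^"\\]*(?:\\.[^\\"]*)*"|'[^'\\]*(?:\\.[^\\']*)*'|[^,]+),?~
RE;

// Hypothetical helper: returns the Capture Group 1 values as a flat list.
function split_arguments(string $input): array
{
    preg_match_all(ARG_PATTERN, $input, $matches);

    // Assumption: surrounding whitespace is never significant for a token.
    return array_map('trim', $matches[1]);
}

// The sample from the question yields five tokens.
print_r(split_arguments('"test", \'test2\', $test3, ?"%A %d %B %Y", "foo,bar,foobar"'));

// Escaped quotes inside a quoted argument stay within that argument.
print_r(split_arguments('"say \"hi\", ok", \'it\\\'s\''));
```

One limitation to keep in mind: an argument such as ?"a,b" starts with ?, so it falls through to [^,]+ and would still be cut at the comma inside the quotes.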