I am trying to extract all function parameters with a regex, for templating. The parameter syntax is meant to be largely PHP-compatible.
Here is a sample of the text that needs to be parsed:
"test", 'test2', $test3, ?"%A %d %B %Y", "foo,bar,foobar"
It needs to be parsed into:
[
'"test"',
'\'test2\'',
'$test3',
'?"%A %d %B %Y"',
'"foo,bar,foobar"'
]
I found this pattern, but when there are commas (,) between double quotes it splits on those too.
'~([^,]+\(.+?\))|([^,]+)~x'
The result of this pattern is:
[
'"test"',
' \'test2\'',
' $test3',
' ?"%A %d %B %Y"',
' "foo,',
'bar,',
'foobar"'
]
I am not very good with regex patterns. I can achieve basic things with it but I couldn't find a way to achieve this.
Your regex misses several cases, and one of them is a double-quoted value that follows a space. The pattern has no optional leading whitespace and no alternative for quoted strings, so the last alternative, [^,]+, is the one that matches, and it stops at the first comma.
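To see which alternative is actually doing the matching, a small check like the following can help. It is only a sketch: the shortened input string is my own, and the pattern is the one from the question, unchanged.

```php
<?php
// Pattern from the question, unchanged.
$re  = '~([^,]+\(.+?\))|([^,]+)~x';
// Shortened sample input (an assumption, chosen for illustration).
$str = '"test", "foo,bar"';

preg_match_all($re, $str, $m);

// $m[1] holds what the first alternative captured,
// $m[2] what the fallback [^,]+ captured.
print_r($m[1]);
print_r($m[2]);
```

Group 1 stays empty for every match, so each token comes from [^,]+, which knows nothing about quotes and keeps the leading space.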
You can use the following regex:
\s*("[^"\\]*(?:\\.[^\\"]*)*"|'[^'\\]*(?:\\.[^\\']*)*'|[^,]+),?
See the regex demo, and grab Capture Group 1 values.
Here is an IDEONE demo:
$re = '~\s*("[^"\\\\]*(?:\\\\.[^\\\\"]*)*"|\'[^\'\\\\]*(?:\\\\.[^\\\\\']*)*\'|[^,]+),?~';
$str = "\"test\", 'test2', \$test3, ?\"%A %d %B %Y\", \"foo,bar,foobar\"";
preg_match_all($re, $str, $tokens);
print_r($tokens[1]);
Pattern explanation:
\s* - zero or more whitespaces
("[^"\\]*(?:\\.[^\\"]*)*"|'[^'\\]*(?:\\.[^\\']*)*'|[^,]+) - any of the 3 alternatives:
    "[^"\\]*(?:\\.[^\\"]*)*" - double quoted strings (supporting escaped sequences)
    | - or
    '[^'\\]*(?:\\.[^\\']*)*' - single quoted strings (supporting escaped sequences)
    | - or
    [^,]+ - 1+ characters other than a comma
,? - an optional comma (? - one or zero occurrences)
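The breakdown above can also be kept next to the pattern itself by writing it in extended mode (the x flag), where whitespace and # comments inside the pattern are ignored. This is a minimal sketch, not part of the original demo: the function name parse_params is hypothetical, the nowdoc is used only to avoid doubling the backslashes, and the trim() step is my own addition to drop trailing spaces from unquoted tokens.

```php
<?php
// Hypothetical helper: same pattern as above, in extended mode (x flag).
function parse_params(string $input): array
{
    // Nowdoc: backslashes are taken literally, so the regex reads as-is.
    $re = <<<'RE'
~
\s*                           # leading whitespace, not captured
(                             # group 1: one parameter
    "[^"\\]*(?:\\.[^\\"]*)*"  #   double-quoted string, escapes allowed
  | '[^'\\]*(?:\\.[^\\']*)*'  #   single-quoted string, escapes allowed
  | [^,]+                     #   anything else, up to the next comma
)
,?                            # optional separator
~x
RE;
    preg_match_all($re, $input, $m);
    // Assumption: surrounding spaces on unquoted tokens are not wanted.
    return array_map('trim', $m[1]);
}

print_r(parse_params('"test", \'test2\', $test3, ?"%A %d %B %Y", "foo,bar,foobar"'));
```

Note that a token such as ?"a,b" does not start with a quote, so it falls through to [^,]+ and would still be split at the comma. The sample input works because its ?"..." value contains no comma.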