Regex: splitting function call arguments

I've run into this problem and have been searching for days. I'm using PHP to parse formulas for a platform.

A formula could be something like:

object.Field

ADD(object.NumberOfTHings, object.NumberOfThings)

object.DoSomething(ADD(object.NumberOfTHings, object.NumberOfThings), 'words!')

The idea is that calls can nest many levels deep. Users can include quoted strings (double or single) as well.

I'm working on a function that returns each top-level parameter. So

object.DoSomething(ADD(object.NumberOfTHings, object.NumberOfThings), 'words!')

will need to return the following array:

  • ADD(object.NumberOfTHings, object.NumberOfThings)
  • 'words!'

We then go back and parse each parameter appropriately (some are object calls, function calls, etc.). I'm open to parsing it all at once, but figured that would just be more complicated.

My current regex is as follows:

/(?'pullsinglequotes'\'.+?\')|(?'pulldoublequotes'\".+?\")|(?'pullfunctions'[^,]\(([^()]|(?R))*\))/

It MOSTLY works, but has two issues:

  1. It doesn't return objects yet (e.g. if I reference object.Field as a parameter).
  2. It only captures the last letter of a function name.

Here's a REGEXR with the issue: https://regexr.com/41e20

I've tried many different variations of REGEX and each has its downsides.

My question is: Does anyone have enough regex knowledge to solve those two issues? If so, any help would be greatly appreciated.

Update: If anyone is interested, the following was my final regex.

/(?'pullsinglequotes'\'.+?\')|(?'pulldoublequotes'\".+?\")|(?'pullfunctions'\b[\w.]+\s*\(([^()]|(?R))*\))|(?'pullvars'\w+(?:\.\w+)?)/
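For anyone who wants to try it, here is a minimal sketch of running that final pattern through preg_match_all. The helper name topLevelArguments and the sample input are hypothetical, not from the original code, and the sketch assumes the outer call has already been stripped so that only the argument list is passed in (fed the whole formula, the pattern matches the outer call as a single function).

```php
<?php
// Hypothetical helper (not from the post): returns the top-level items
// found in an argument list, using the final pattern verbatim.
function topLevelArguments(string $argumentList): array
{
    // A nowdoc keeps every backslash exactly as written in the pattern.
    $pattern = <<<'REGEX'
/(?'pullsinglequotes'\'.+?\')|(?'pulldoublequotes'\".+?\")|(?'pullfunctions'\b[\w.]+\s*\(([^()]|(?R))*\))|(?'pullvars'\w+(?:\.\w+)?)/
REGEX;

    preg_match_all($pattern, $argumentList, $matches);

    // $matches[0] holds each full match in order of appearance.
    return $matches[0];
}

// Assumed input: the text between the outer parentheses of a formula.
$args = topLevelArguments("ADD(object.NumberOfTHings, object.NumberOfThings), 'words!', object.Field");
print_r($args);
```

This should yield three items: the nested ADD(...) call, the quoted string, and object.Field. The named groups in $matches (pullfunctions, pullvars, and so on) indicate which branch produced each item, which may help when deciding how to parse each parameter afterwards.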

Your pullfunctions group only matches a single non-comma character followed by an opening parenthesis, which is why you only get the last letter of the function name. Allow the name to repeat and precede it with a word boundary.
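As a rough illustration, the function branch can be tested on its own, before and after the change. The input string is made up for the example, and both patterns are reduced to just the function branch, so (?R) recurses into that branch only.

```php
<?php
$input = 'ADD(a, b)'; // made-up sample input

// Original function branch: exactly one non-comma character before "(".
preg_match('/[^,]\(([^()]|(?R))*\)/', $input, $before);

// Revised branch: word boundary, then one or more word characters before "(".
preg_match('/\b\w+\s*\(([^()]|(?R))*\)/', $input, $after);

echo $before[0], "\n"; // D(a, b)
echo $after[0], "\n";  // ADD(a, b)
```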

For the vars and objects, just use a repeating word character with an optional dot-separated part. You can switch this to a character class to allow other characters such as - (note that \w already matches _).
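A small sketch of that branch in isolation, with made-up inputs. The second pattern is only an assumed variation showing how a character class could admit "-"; adjust it to whatever your field names actually allow.

```php
<?php
// Branch as described: word characters with one optional dot-separated part.
preg_match_all('/\w+(?:\.\w+)?/', 'object.Field, total', $plain);

// Assumed variation: a character class that also accepts "-".
preg_match_all('/[\w-]+(?:\.[\w-]+)?/', 'object.my-field', $dashed);

print_r($plain[0]);  // object.Field and total
print_r($dashed[0]); // object.my-field
```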

Full regex:

(?'pullsinglequotes'\'.+?\')|(?'pulldoublequotes'\".+?\")|(?'pullfunctions'\b[\w]+\s*\(([^()]|(?R))*\))|(?'pullvars'\w+(?:\.\w+)?)