Writing Scala-like parser combinators in PHP

I had been coding in Scala for several months before I had to do some work in PHP again. I realized that it would be handy to have parser combinators available in this language for my project.

I found the Loco implementation, but I was greatly disappointed by it (especially because it is insanely verbose compared to Scala).

I started to implement parser combinators in PHP myself using higher-order functions. An example of a regular expression parser follows:

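// Properties are declared explicitly; creating them dynamically is deprecated as of PHP 8.2.
interface Result {}
class Success implements Result { public $payload; public $next; function __construct($payload, $next) { $this->payload = $payload; $this->next = $next; } }
class Failure implements Result { public $payload; public $next; function __construct($payload, $next) { $this->payload = $payload; $this->next = $next; } }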

function r($regex) {
  return function($input) use ($regex) {
    if(preg_match($regex, $input, $matches)) {
      return new Success($matches[0], substr($input, strlen($matches[0])));
    } else {
      return new Failure('Did not match', $input);
    }
  };
}

And here is cons (named consF below) as an example of a combinator:

function consF($fn) {
  $args = array_slice(func_get_args(), 1);
  return function($input) use ($fn, $args) {
    $matches = array();
    foreach($args as $p) {
      $r = $p(ltrim($input));
      if($r instanceof Failure) return $r;

      $input = $r->next;
      $matches[] = $r->payload;
    }

    return new Success($fn($matches), $input);
  };
}

This allows me to write a parser quite compactly - like this:

$name = r('/^[A-Z][a-z]*/');
$full_name = consF(function($a) { return $a; }, $name, $name);

The problem arises when the grammar needs to be recursive: in that case I cannot order the variables so that every variable is already defined when I use it. E.g., to write a grammar that parses bracket input like (()()), I would need something like this:

$brackets = alt('()', cons('(', $brackets, ')'));

where the alt combinator succeeds if one of the alternatives succeeds. Passing the variable by reference should solve it; however, newer versions of PHP require that passing by reference be indicated in the function declaration, which is not possible for a function with a variable number of arguments.
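For reference, alt could be sketched roughly as follows. This is only a minimal sketch under some assumptions: it takes parser functions only (not bare strings like '()' as in the line above), it tries the alternatives in order and returns the first success, and if all of them fail it returns the last failure. The $bool parser and the sample inputs are made up for illustration.

```php
<?php
// Result types and the regex parser r() repeated from above so the sketch runs on its own.
interface Result {}
class Success implements Result {
  public $payload; public $next;
  function __construct($payload, $next) { $this->payload = $payload; $this->next = $next; }
}
class Failure implements Result {
  public $payload; public $next;
  function __construct($payload, $next) { $this->payload = $payload; $this->next = $next; }
}

function r($regex) {
  return function($input) use ($regex) {
    if (preg_match($regex, $input, $matches)) {
      return new Success($matches[0], substr($input, strlen($matches[0])));
    }
    return new Failure('Did not match', $input);
  };
}

// Assumed semantics: ordered choice. Every alternative is tried on the same
// (left-trimmed) input; the first Success wins, otherwise the last Failure is returned.
function alt() {
  $parsers = func_get_args();
  return function($input) use ($parsers) {
    $result = new Failure('No alternatives given', $input);
    foreach ($parsers as $p) {
      $result = $p(ltrim($input));
      if ($result instanceof Success) return $result;
    }
    return $result;
  };
}

// Hypothetical usage: a parser for the literals "true" or "false".
$bool = alt(r('/^true/'), r('/^false/'));
$ok   = $bool('false!');  // second alternative matches
$bad  = $bool('maybe');   // no alternative matches
```

Note that this sketch does nothing about the recursion problem: the parsers passed to alt still have to exist at the moment alt is called.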

I have worked around this issue by passing a function as the argument, like this:

function($input) {
  $fn = $GLOBALS['brackets'];
  return $fn($input);
}

However, this is really nasty, and it requires the parsers to be defined in the global scope (which is also not a good idea).

Could you please suggest a trick that would help me overcome this issue without needing too much additional code when defining the grammar?

Thanks