PHP regex: zero or more whitespace not working

I'm trying to apply a regex constraint to a Symfony form input. The requirement for the input is that the start of the string and all commas must be followed by zero or more whitespace, then a # or @ symbol, except when it's the empty string.

As far as I can tell, there is no way to tell the constraint to use preg_match_all instead of just preg_match, but it does have the ability to negate the match. So, I need a regular expression that preg_match will NOT MATCH for the given scenario: any string containing the start of the string or a comma, followed by zero or more whitespace, followed by any character that is not a # or @ and is not the end of the string, but will match for everything else. Here are a few examples:

preg_match(..., '');              // No match
preg_match(..., '#yolo');         // No match
preg_match(..., '#yolo,  #swag'); // No match
preg_match(..., '#yolo,@swag');   // No match
preg_match(..., '#yolo, @swag,'); // No match

preg_match(..., 'yolo');        // Match
preg_match(..., 'swag,#yolo');  // Match
preg_match(..., '@swag, yolo'); // Match

I would've thought for sure that /(^|,)\s*[^@#]/ would work, but it fails in every case with one or more spaces, and the asterisk appears to be the cause. If I get rid of the asterisk, preg_match('/(^|,)\s[^@#]/', '#yolo, @swag') does not match (as desired) when there's exactly one space, but as soon as I reintroduce the asterisk it breaks for any number of spaces greater than zero.

My theory is that the regex engine is interpreting the second space as a character that is not in the character set [@#], but that's just a theory and I don't know what to do about it. I know that I could create a custom constraint to use preg_match_all instead to get around this, but I'd like to avoid that if possible.

You may use

'~(?:^|,)\s*+[^#@]~'

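Here, the + after the * forms the possessive quantifier *+. It matches zero or more whitespace chars and prevents the regex engine from backtracking into the \s* pattern when [^#@] cannot match the subsequent char. With a plain greedy \s*, the engine gives one whitespace char back and [^#@] then matches that whitespace char, which is why your original pattern matched.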
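To see the backtracking described above, here is a minimal sketch comparing the greedy and possessive versions on one of the question's strings. The extra capture groups in the greedy pattern and the variable names are mine, added only to show which part of the pattern consumed which character.

```php
<?php
$subject = '#yolo,  #swag';

// Greedy \s*: capture groups are added only for inspection.
$greedyHit = preg_match('/(^|,)(\s*)([^@#])/', $subject, $m);

// $m[2] is what \s* kept after backtracking,
// $m[3] is the char that [^@#] ended up consuming.
var_dump($greedyHit); // int(1)
var_dump($m[2]);      // string(1) " "
var_dump($m[3]);      // string(1) " "

// Possessive \s*+: the whitespace cannot be given back,
// so [^@#] only ever sees the '#' and the attempt fails.
$possessiveHit = preg_match('/(^|,)\s*+[^@#]/', $subject);
var_dump($possessiveHit); // int(0)
```

In the greedy case the second space is matched by the negated character class, not by \s*.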

See the regex demo.

Details

  • (?:^|,) - either start of string or ,
  • \s*+ - zero or more whitespace chars, possessively matched (i.e. if the char right after the whitespace is not matched with the [^#@] pattern, the match attempt at that position fails instead of retrying with fewer whitespace chars)
  • [^#@] - a negated character class matching any char but # and @.
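As a quick check, the pattern can be run against the examples from the question. This is only a sketch: the helper name violatesTagFormat is hypothetical and not part of Symfony or PHP.

```php
<?php
// Hypothetical helper: true when the input breaks the "# or @ after
// the start or a comma" rule, i.e. when the pattern matches.
function violatesTagFormat(string $input): bool
{
    return preg_match('~(?:^|,)\s*+[^#@]~', $input) === 1;
}

// Inputs taken from the question, paired with the expected result.
$cases = [
    ['', false],
    ['#yolo', false],
    ['#yolo,  #swag', false],
    ['#yolo,@swag', false],
    ['#yolo, @swag,', false],
    ['yolo', true],
    ['swag,#yolo', true],
    ['@swag, yolo', true],
];

foreach ($cases as [$input, $expected]) {
    $actual = violatesTagFormat($input);
    printf(
        "%-18s %-8s %s\n",
        var_export($input, true),
        $actual ? 'Match' : 'No match',
        $actual === $expected ? 'ok' : 'UNEXPECTED'
    );
}
```

Since a match means the input is invalid, the constraint would need to be used in its negated form, as mentioned in the question.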