Allow whitespace, Unicode letters, digits, underscore, dash and comma?

I'm pretty new to PHP's preg functions, and I'm using this preg_match condition to check whether the user has entered anything other than spaces, Unicode letters, digits, underscores or dashes:

if(preg_match("/[^\040\pL\pN_-]/u", $term)) {

Now I want to allow a comma as well, so I tried this:

if(preg_match("/[^\040\pL\pN,_-]/u", $term)) {

It actually works, and I'd like to understand why. Why does it have to be ,_- and not -_, for example, to allow the comma?

I would really appreciate if someone could explain this to me step by step.

This is because - is used for ranges inside square brackets (i.e. character classes). As the manual puts it, it "indicates character range", for example 0-9 or a-z.

So as long as you put it at the end, there is nothing after it to form a range with, and you don't have to escape it. Anywhere in the middle of the class you have to escape it with a backslash, e.g. \-.

That means:

,_-  //At the end
_,-  //At the end
\-,_ //Escape it
\-_, //Escape it
,\-_ //Escape it
_\-, //Escape it
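
To try these placements out, here is a minimal sketch. The helper isAllowedTerm() and the sample strings are hypothetical, not from the question; it simply appends the extra characters to the question's class and reports whether the term passes. The pattern is single-quoted and uses \x20 for the space, which is an assumption made to keep PHP's own string escaping out of the picture.

```php
<?php
// Hypothetical helper (not from the question): returns true when $term
// contains only spaces, Unicode letters, digits and the characters in $extra.
function isAllowedTerm(string $term, string $extra): bool
{
    // Single quotes keep the backslashes intact for PCRE; \x20 is a space.
    $pattern = '/[^\x20\pL\pN' . $extra . ']/u';

    // preg_match() returns 0 when no disallowed character was found.
    return preg_match($pattern, $term) === 0;
}

$term = 'foo-bar, baz_1';

$dashAtEnd   = isAllowedTerm($term, ',_-');      // dash last: literal
$dashEscaped = isAllowedTerm($term, '\-,_');     // dash escaped: literal
$rejected    = isAllowedTerm('foo@bar', ',_-');  // "@" is not in the class

var_dump($dashAtEnd, $dashEscaped, $rejected);   // true, true, false
```

Both placements accept the comma and the dash, because in each case the dash cannot be read as the middle of a range.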

Inside brackets, i.e. in a character class, - indicates a character range, as in [a-z]. So if you place an unescaped dash between two characters rather than at the very start or end of the class, it is not interpreted as a literal dash but as a range operator.
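
The range behaviour can be made visible with a small sketch. The patterns below are illustrative only and use a bare three-character class instead of the full one from the question: ,-_ is read as the range from "," (U+002C) to "_" (U+005F), which silently includes characters such as "@", while the opposite order _-, is a range that runs backwards and does not compile.

```php
<?php
// Between two characters, "-" builds a range: [,-_] covers U+002C to U+005F.
$atMatches   = preg_match('/^[,-_]$/', '@'); // "@" (U+0040) is inside the range
$bangMatches = preg_match('/^[,-_]$/', '!'); // "!" (U+0021) is outside it

// With the dash last, the class holds exactly three literal characters.
$atLiteral = preg_match('/^[,_-]$/', '@');

// A range whose end precedes its start does not compile; preg_match()
// then returns false (the @ operator only hides the warning).
$reversed = @preg_match('/^[_-,]$/', ',');

var_dump($atMatches, $bangMatches, $atLiteral, $reversed); // 1, 0, 0, false
```

So a misplaced dash either fails outright or, worse, quietly allows more characters than intended.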