I'm pretty new to preg (regular expressions in PHP), and I'm using this preg_match condition to check whether the user has entered anything other than spaces, Unicode letters, digits, underscores or dashes:
if(preg_match("/[^\040\pL\pN_-]/u", $term)) {
But now I wanted to allow a comma. So I tried this:
if(preg_match("/[^\040\pL\pN,_-]/u", $term)) {
It actually works, and I'd like to understand why. Why does it have to be ,_- and not -_, for example, to allow the comma?
I would really appreciate it if someone could explain this to me step by step.
This is because - is used for ranges inside square brackets ([] -> character classes). As the manual puts it, it "indicates character range", for example 0-9 or a-z.
So as long as you put it at the end of the class (or at the very start), you're fine and don't have to escape it. Anywhere else, it is safest to escape it with a backslash, e.g. \-.
For the characters at the end of your class (after \pN), that means:
,_-   // at the end
_,-   // at the end
\-,_  // escape it
\-_,  // escape it
,\-_  // escape it
_\--  // escape it
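To see the rule in action, here is a minimal sketch built on the check from the question. The function names hasDisallowedChars and hasDisallowedCharsEscaped are hypothetical helpers introduced only for illustration. The first keeps the dash at the end of the class, and the second moves it in front of the underscore and comma and escapes it. Both should accept and reject the same input.

```php
<?php
// Hypothetical helper: true if $term contains a character outside the allowed set.
// The dash is the last character in the class, so it is a literal dash.
function hasDisallowedChars(string $term): bool
{
    return preg_match('/[^\040\pL\pN,_-]/u', $term) === 1;
}

// Hypothetical variant: the dash is no longer last, so it is escaped with a backslash.
function hasDisallowedCharsEscaped(string $term): bool
{
    return preg_match('/[^\040\pL\pN\-_,]/u', $term) === 1;
}

var_dump(hasDisallowedChars('foo, bar_baz-1'));        // bool(false): every character is allowed
var_dump(hasDisallowedChars('foo; bar'));              // bool(true): the semicolon is not allowed
var_dump(hasDisallowedCharsEscaped('foo, bar_baz-1')); // bool(false)
var_dump(hasDisallowedCharsEscaped('foo; bar'));       // bool(true)
```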
When we're working inside brackets, i.e. with a character class, - indicates a character range, like [a-z]. Therefore, if you place an unescaped dash between two other characters in the class, rather than at the very start or the very end, it will not be interpreted as a literal dash, but as a character range indicator.
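A small sketch of what that range interpretation does in practice, using plain ASCII classes rather than the exact class from the question (the variable names are only illustrative). In [,-_] the dash sits between the comma and the underscore, so the class becomes the range from ',' (0x2C) to '_' (0x5F), which happens to include digits and uppercase letters. In [_-,] the range runs backwards, so the pattern does not compile at all.

```php
<?php
// Dash between two characters: a range from ',' (0x2C) to '_' (0x5F).
$asRange = preg_match('/^[,-_]+$/', 'ABC');     // 1: 'A', 'B', 'C' fall inside the range

// Dash at the end: a literal dash, so only ',', '_' and '-' are in the class.
$atEnd = preg_match('/^[,_-]+$/', 'ABC');       // 0: no letters in the class

// Dash escaped in the middle: also a literal dash.
$escaped = preg_match('/^[,\-_]+$/', 'ABC');    // 0

// The literal class still matches the three characters themselves.
$literals = preg_match('/^[,_-]+$/', ',_-');    // 1

// Reversed range ('_' comes after ','): compilation fails and preg_match returns false.
// The @ suppresses the warning PHP would otherwise emit.
$outOfOrder = @preg_match('/[_-,]/', 'x');      // false

var_dump($asRange, $atEnd, $escaped, $literals, $outOfOrder);
```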