I cannot validate text in several languages with this regular expression:
/^[\p{L}\p{Nd}-_.]{1,20}$/u
Languages that do not work:
Bengali, Gujarati, Hindi, Marathi, Thai, Tamil, Telugu, Vietnamese
when used with PHP's preg_match.
What am I missing?
You're using the dash incorrectly. If you want it to match a literal dash character, you need to either escape it (\-) or put it at the end of the character class.
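To see why the position of the dash matters, here is a minimal sketch using a deliberately simple class that is not taken from the question. It relies on the fact that '+', ',', '-' and '.' are consecutive in ASCII, so a dash placed between '+' and '.' is read as a range that also covers the comma. The variable names are made up for illustration.

```php
<?php
// Illustrative only: '+', ',', '-', '.' are consecutive in ASCII
// (0x2B..0x2E), so "+-." inside a class means "from '+' to '.'".

// Dash in the middle: read as a range, so ',' is matched by accident.
$asRange = preg_match('/^[+-.]$/', ',');

// Dash at the end: a literal dash, so ',' is rejected.
$atEnd = preg_match('/^[+.-]$/', ',');

// Escaped dash: also literal, wherever it sits in the class.
$escaped = preg_match('/^[+\-.]$/', ',');

var_dump($asRange, $atEnd, $escaped); // int(1), int(0), int(0)
```

Either of the last two forms keeps the dash literal; which one to use is a matter of taste.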
Also, I'm not familiar with those languages, but I guess you might need to account for combining marks (\p{M}) as well:
/^[\p{L}\p{Nd}\p{M}_.-]{1,20}$/u
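A quick way to check the guess about marks is to test one word with and without \p{M} in the class. The sketch below uses the Devanagari spelling of "Hindi" as sample input (my choice, not from the question) and assumes PHP 7.0 or later for the \u{...} escapes, plus a PCRE build with Unicode property support. The variable names are hypothetical.

```php
<?php
// "Hindi" in Devanagari, spelled out by code point so the example does
// not depend on the encoding of this file (PHP 7.0+ escape syntax).
// U+093F, U+094D and U+0940 are combining marks (category M), not letters.
$word = "\u{0939}\u{093F}\u{0928}\u{094D}\u{0926}\u{0940}";

// Same class as in the answer, once without and once with \p{M}.
$lettersOnly = '/^[\p{L}\p{Nd}_.-]{1,20}$/u';
$withMarks   = '/^[\p{L}\p{Nd}\p{M}_.-]{1,20}$/u';

var_dump(preg_match($lettersOnly, $word)); // int(0): the marks are rejected
var_dump(preg_match($withMarks, $word));   // int(1)
```

Note that with the u modifier the {1,20} quantifier counts code points, so each combining mark uses up one of the 20 slots. The sample word is six code points long even though it displays as fewer visible characters.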
The problem doesn't come from your regex (except that a literal - character should always be placed at the beginning or at the end of a character class, or be escaped). Note that your pattern can be shortened to:
/^[\w.-]{1,20}$/u
or
/^[\p{Xan}.-]{1,20}$/u
if you want to exclude the underscore.
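The practical difference between the two shortened patterns can be checked with a couple of ASCII inputs. This is only a sketch with made-up sample strings and variable names; it deliberately says nothing about how either class treats combining marks.

```php
<?php
// The two shortened patterns from the answer above.
$withWord = '/^[\w.-]{1,20}$/u';       // \w includes the underscore
$withXan  = '/^[\p{Xan}.-]{1,20}$/u';  // \p{Xan}: letters and numbers only

var_dump(preg_match($withWord, 'user_name'));  // int(1)
var_dump(preg_match($withXan, 'user_name'));   // int(0): '_' is not allowed
var_dump(preg_match($withXan, 'user.name-1')); // int(1)
```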