I am using a regular expression to validate an email address. Here is the regular expression I am using:
preg_match("/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/", $email)
Most of the above is self-explanatory, for example:
a) ^ represents NOT.
b) The string should start with _, a-z, 0-9 or -.
c) Then match the next part, which starts with a dot.
d) What does *@ mean here? Couldn't it be just @, meaning the next character should be @?
e) Next it looks for a dot again; the first dot is optional and the second is compulsory.
f) Finally, what does $ mean?
Your assumption a) is not true: ^ is the start of the string in this case. Only at the beginning of a character class, as in [^a-z], is it a NOT.
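The two meanings of ^ can be checked side by side. This is only a minimal sketch; the patterns and sample strings are made up for illustration and are not part of the question's expression:

```php
<?php
// Outside a character class, ^ anchors the match to the start of the string.
$anchored   = preg_match('/^abc/', 'abcdef');   // "abc" is at the start
$notAtStart = preg_match('/^abc/', 'xabcdef');  // "abc" is there, but not at the start

// As the first character inside [...], ^ negates the class.
$hasNonDigit = preg_match('/[^0-9]/', '12a45'); // "a" is something other than a digit
$onlyDigits  = preg_match('/[^0-9]/', '12345'); // nothing other than digits to find

var_dump($anchored, $notAtStart, $hasNonDigit, $onlyDigits);
// int(1) int(0) int(1) int(0)
```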
[_a-z0-9-]+
will match any of the characters in [] one or more times (because of the +).
(\.[_a-z0-9-]+)*
Then there is a dot followed by the same pattern as before, and the * means this complete part can be repeated 0 or more times.
Then there has to be the character @.
Then the part from before the @ repeats, this time without the underscore in the character classes.
(\.[a-z]{2,3})$
The string has to end (defined by the $) with a . and 2 or 3 lowercase letters.
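To see the walkthrough above in action, the pattern from the question can be run against a few addresses. This is a rough sketch: the helper name matchesEmailPattern and the sample addresses are hypothetical, chosen only to exercise each part of the pattern.

```php
<?php
// Hypothetical helper wrapping the pattern from the question.
function matchesEmailPattern(string $email): bool
{
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';
    return preg_match($pattern, $email) === 1;
}

// Illustrative inputs only; each comment names the part of the pattern involved.
$samples = [
    'john@example.com',          // both optional groups repeated zero times
    'john.doe@mail.example.org', // one optional dotted group on each side of the @
    'john@example',              // missing the final dot plus 2 or 3 letters
    'john@example.info',         // ending has 4 letters, {2,3} allows at most 3
    'John@example.com',          // uppercase letter, the pattern has no "i" modifier
    '.john@example.com',         // starts with a dot, which the first class excludes
];

foreach ($samples as $email) {
    echo $email, ' => ', matchesEmailPattern($email) ? 'match' : 'no match', PHP_EOL;
}
```

Only the first two samples match. The last four show that the pattern also rejects some addresses that are valid in practice, such as those with longer endings or uppercase letters.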
* means the preceding element is matched zero or more times,
while $ in this case means the end of the string.
(\.[_a-z0-9-]+)* // this group (a dot plus one or more of these characters) can appear zero or more times
(\.[a-z]{2,3})$ // the string ends with a dot followed by 2 or 3 lowercase letters
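Both pieces can be tried in isolation. The sketch below uses simplified patterns and made-up inputs (assumptions for illustration, not the full expression from the question) to show the "zero or more" group and the end-of-string anchor:

```php
<?php
// The starred group: zero, one, or several ".something" parts are all accepted.
$group = '/^[a-z]+(\.[a-z]+)*$/';
$repeats = [
    preg_match($group, 'john'),       // zero repetitions of the group
    preg_match($group, 'john.doe'),   // one repetition
    preg_match($group, 'john.q.doe'), // two repetitions
];

// The ending: a dot, then 2 or 3 lowercase letters, then the end of the string.
$ending = '/\.[a-z]{2,3}$/';
$endings = [
    preg_match($ending, 'example.de'),   // 2 letters
    preg_match($ending, 'example.com'),  // 3 letters
    preg_match($ending, 'example.info'), // 4 letters
    preg_match($ending, 'example.com!'), // extra character after the letters
];

// In PHP, $ also matches just before a single trailing newline,
// unless the D modifier is added to the pattern.
$newlineDefault = preg_match('/\.[a-z]{2,3}$/', "example.com\n");
$newlineStrict  = preg_match('/\.[a-z]{2,3}$/D', "example.com\n");

var_dump($repeats, $endings, $newlineDefault, $newlineStrict);
```

If a trailing newline must be rejected when validating input, adding the D modifier to the question's pattern is one option.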
Lots of information about regular expressions can be found at http://www.regular-expressions.info
For example, for f) see § Anchors at http://www.regular-expressions.info/quickstart.html
Since the existing answers didn't cover this...
Yes, * means "zero or more times", but it is also "greedy" by default: it will match as many times as possible, even when stopping earlier would already have let the next part of the pattern match, and it gives characters back (backtracks) only when the rest of the pattern would otherwise fail. * can be made "lazy" by appending a ?, giving *?. A lazy quantifier matches as few times as possible and expands only as far as needed for the rest of the pattern to match.
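The difference is easiest to see on a string where the next part of the pattern can match in more than one place. This is a small sketch with a made-up subject string and patterns, unrelated to the email expression:

```php
<?php
$subject = '<b>one</b> and <b>two</b>';

// Greedy: .* grabs as much as it can, then gives characters back only
// until the closing </b> can match, so the match runs to the last </b>.
preg_match('~<b>.*</b>~', $subject, $greedy);

// Lazy: .*? takes as little as possible and grows only as needed,
// so the match stops at the first </b>.
preg_match('~<b>.*?</b>~', $subject, $lazy);

echo $greedy[0], PHP_EOL; // <b>one</b> and <b>two</b>
echo $lazy[0], PHP_EOL;   // <b>one</b>
```

In the email pattern this distinction does not change the result, because the anchors ^ and $ force the whole string to be consumed either way.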