Regular expression for password field validation [duplicate]

Sorry if this code is poor; I am just following a book and modifying its code for my school project, as I started PHP less than a month ago.

I am trying to understand what this validation means, but I can't fully comprehend it as I am new to PHP.

Code:

if (preg_match('/^(\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*){6,20}$/', $_POST['pass'])) {
    //$p = mysqli_real_escape_string($dbc, $_POST['pass']);
    $p = $_POST['pass'];
    $sticky_password = $p;
} else {
    $error['pass'] = 'Please enter a valid password!';
}

Any help would be really appreciated! Thanks!

Thank you very much.. :)

We have the following regex: /^(\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*){6,20}$/

  1. The first and last / are the delimiters.

  2. ^ -> The start, $ -> The end

    Which means that if the input is abc and your regex is /^bc$/, it won't match, since bc is not at the beginning.

  3. Now we have (\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*){6,20}

    The {6,20} is the quantifier part, which means the preceding group is repeated 6 up to 20 times.

  4. Let's break the regex further: \w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*

    Let's provide some equivalents:

    1. \w => [a-zA-Z0-9_]
    2. \d => [0-9]
    3. * => zero or more times
    4. (?=...) is a lookahead assertion. For example, /a(?=b)/ matches any "a" that is followed by "b" (the "b" itself is not part of the match)

    5. The purpose of those lookaheads:

      • (?=\w*\d) => check if there is a digit
      • (?=\w*[a-z]) => check if there is a lowercase letter
      • (?=\w*[A-Z]) => check if there is an uppercase letter
      • Let's take (?=\w*\d): The \w* is just there as a "workaround" so the lookahead can skip over any [a-zA-Z0-9_]* that comes before the digit
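
To see the lookahead behaviour described above in isolation, here is a minimal sketch. The test strings 'cab' and 'abc1' are made up for illustration and are not from the question:

```php
<?php
// /a(?=b)/ matches an "a" only when a "b" follows; the "b" is not consumed.
preg_match('/a(?=b)/', 'cab', $m, PREG_OFFSET_CAPTURE);
// $m[0] is ['a', 1]: only the "a" is matched, at offset 1.

// A lookahead placed right after ^ tests the string from its start.
// The \w* inside lets it skip word characters before the one it looks for.
$hasDigit = preg_match('/^(?=\w*\d)/', 'abc1') === 1;    // true: "1" follows "abc"
$hasUpper = preg_match('/^(?=\w*[A-Z])/', 'abc1') === 1; // false: no uppercase letter
```

Neither lookahead moves the match position, which is why several of them can be chained to test independent conditions at the same place.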

In the end, this regex just makes sure that the input:

  1. is 6 to 20 characters long
  2. contains at least 1 lowercase letter, 1 uppercase letter and 1 digit
  3. consists only of the allowed characters: letters (a-z, A-Z), digits and underscore.

Three interesting sites: www.regexper.com, www.regular-expressions.info and www.regex101.com.

Note: Don't restrict passwords; you have to hash them anyway. Take a look here or check the other questions on SO.
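
As a rough illustration of the hashing mentioned in the note, PHP's built-in password_hash() and password_verify() can be used roughly like this. This is only a sketch: the $password value is a made-up stand-in for $_POST['pass'], and how the hash gets stored is left out:

```php
<?php
// Hypothetical input; in the question's code this would come from $_POST['pass'].
$password = 'Example-Passw0rd';

// Hash the password before storing it; PASSWORD_DEFAULT lets PHP pick the algorithm.
$hash = password_hash($password, PASSWORD_DEFAULT);

// Later, at login time, compare the submitted password against the stored hash.
$ok  = password_verify($password, $hash); // true
$bad = password_verify('wrong', $hash);   // false
```

Because only the hash is stored, the characters allowed in the password do not matter to the database.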

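Actually, this pattern matches any password of at least 3 characters (a digit, an uppercase letter and a lowercase letter, in any order) with no upper length limit (other than the limits of the regex engine). The reason is that the {6,20} quantifier applies to the whole group, and the group can match an empty string, so repeating it does not count characters. All characters must be word characters (i.e. from the class \w).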
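
This can be checked by running the pattern from the question against a few inputs. A minimal sketch, with test strings made up for illustration:

```php
<?php
// The pattern exactly as it appears in the question.
$original = '/^(\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*){6,20}$/';

// Only 3 characters, yet it matches: the group can match an empty string,
// so five of the six required repetitions consume nothing.
$short = preg_match($original, 'aB1');                      // 1

// 25 characters, still a match: nothing limits the total length.
$long = preg_match($original, 'Aa1' . str_repeat('x', 22)); // 1

// The lookaheads do work: no uppercase letter means no match.
$noUpper = preg_match($original, 'abcdef1');                // 0

// Only word characters are allowed, so "-" is rejected.
$nonWord = preg_match($original, 'aB1-aB1');                // 0
```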

The author's intention was probably to match all passwords between 6 and 20 word characters long with at least one digit, one uppercase letter and one lowercase letter, as the quantifier and the lookaheads suggest.

In short, this pattern is wrong, but it was probably meant to do what this classic password validation pattern does:

^(?=\w*[A-Z])(?=\w*[a-z])(?=\w*\d)\w{6,20}$

or this one, without the redundant \w in the lookaheads:

^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)\w{6,20}$

Explanation: lookaheads are zero-width assertions. They don't consume characters; they are only tests. As a consequence, each of them is performed from the start of the string, since they all directly follow the ^ anchor. If every lookahead succeeds, \w{6,20} is then tested, again from the start of the string, to check that the string contains only word characters and is between 6 and 20 characters long. $ means the end of the string.
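
A small sketch of how the corrected pattern behaves. The function name isValidPassword and the test strings are hypothetical and only serve as an illustration:

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function isValidPassword(string $password): bool
{
    // Three lookaheads test from the start of the string,
    // then \w{6,20} enforces both the length and the allowed characters.
    $pattern = '/^(?=\w*[A-Z])(?=\w*[a-z])(?=\w*\d)\w{6,20}$/';
    return preg_match($pattern, $password) === 1;
}

$tooShort = isValidPassword('aB1');                      // false: only 3 characters
$valid    = isValidPassword('Abcde1');                   // true
$noUpper  = isValidPassword('abcdef1');                  // false: no uppercase letter
$tooLong  = isValidPassword('Aa1' . str_repeat('x', 22)); // false: 25 characters
$nonWord  = isValidPassword('Abcde1!');                  // false: "!" is not in \w
```

Unlike the pattern from the question, the quantifier here applies to \w itself, so it really counts characters.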

As mentioned in the comments, this pattern comes from the book "Effortless E-commerce", first edition, by Larry Ullman.

Even though the author writes:

"And, admittedly, even I often have to look up the proper syntax for patterns, but this one requires a high level of regular expression expertise."

and even though I disagree with the second part of that sentence, I think the error in this pattern is a simple typo. However, I don't know whether it has been corrected in the second edition.