Regex basics - [^.] [duplicate]

I am trying to understand the following regular expression, which gets the domain name out of a URL:

$host = "www.php.net";

// get last two segments of host name
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
echo "domain name is: {$matches[0]}\n";

How does $matches[0] come out as php.net?

I am stuck on the pattern [^.]. ^ means complement and . means any character, so what does [^.] mean? The complement of any character? Please help.

It can be tricky if you're new to it.

. normally means any single character except a newline. In a character class ([]), however, it reverts to its literal meaning, i.e. a full stop (period, if you're American).

^ normally means "anchor to the start of the string." In a character class, however, when it's the first character in that class, it flips the logic, so rather than the class representing the allowed characters, it represents the DISallowed characters.
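To see the difference between the two contexts, here is a minimal sketch. The test strings and variable names are made up for illustration; only preg_match itself comes from the question.

```php
<?php
// Outside a character class, "." matches any single character except a newline.
$dotMatchesLetter  = preg_match('/^a.c$/', 'abc');   // 1
$dotMatchesSpace   = preg_match('/^a.c$/', 'a c');   // 1, a space is a character too
$dotMatchesNewline = preg_match('/^a.c$/', "a\nc");  // 0

// Inside a character class, "." is just a literal dot.
$classMatchesDot    = preg_match('/^a[.]c$/', 'a.c'); // 1
$classMatchesLetter = preg_match('/^a[.]c$/', 'abc'); // 0

// A leading "^" negates the class: [^.] is any single character except a dot.
$negatedMatchesLetter = preg_match('/^a[^.]c$/', 'abc'); // 1
$negatedMatchesSpace  = preg_match('/^a[^.]c$/', 'a c'); // 1
$negatedMatchesDot    = preg_match('/^a[^.]c$/', 'a.c'); // 0
```

Note that the ^ and $ outside the brackets in these patterns are still ordinary anchors; only the ^ right after the opening bracket negates.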

So the pattern '/[^.]+\.[^.]+$/' says:

  1. Match a sequence of one or more characters that are not periods (the domain label, e.g. "php")
  2. Match a period thereafter
  3. Match another sequence of one or more characters that are not periods (the suffix, e.g. "net")
  4. The $ anchors this to the end of the string, so steps 1-3 must be in sequence right up to the last character of the string.
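The four steps can be written straight into the pattern with the x flag, which lets you add whitespace and comments. This is only a sketch of the same regex from the question; the variable names $pattern, $anchored and $unanchored are mine. The second call drops the $ to show why the anchor matters.

```php
<?php
$host = 'www.php.net';

// The pattern from the question, written with the x flag so that
// each step can carry its own comment.
$pattern = '/
    [^.]+   # step 1: one or more characters that are not a dot
    \.      # step 2: a literal dot
    [^.]+   # step 3: one or more characters that are not a dot
    $       # step 4: end of the string
/x';

preg_match($pattern, $host, $anchored);
echo $anchored[0], "\n";   // php.net

// Without the $ anchor, the leftmost possible match wins instead.
preg_match('/[^.]+\.[^.]+/', $host, $unanchored);
echo $unanchored[0], "\n"; // www.php
```

With the anchor, an attempt starting at "www" cannot succeed, because after "www.php" the next character is a dot rather than the end of the string. The engine moves on until it reaches "php.net", which is the first starting point where all four steps fit.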

Incidentally, it's not exactly a watertight pattern for hostname matching. It doesn't take into account sub-domains or country-specific domains with more than one period (e.g. ".co.uk"), to name two points.
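A quick way to see the ".co.uk" problem is to run the same pattern over two hosts. "www.example.co.uk" is a made-up host used only for illustration, and $results is my own variable name.

```php
<?php
$pattern = '/[^.]+\.[^.]+$/';
$results = [];

foreach (['www.php.net', 'www.example.co.uk'] as $host) {
    // The pattern always keeps exactly the last two dot-separated segments.
    preg_match($pattern, $host, $matches);
    $results[$host] = $matches[0];
    echo $host, ' => ', $matches[0], "\n";
}
// www.php.net => php.net
// www.example.co.uk => co.uk
```

For the second host the pattern returns only the suffix "co.uk", not "example.co.uk", because it has no idea that the suffix itself contains a dot.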

In a character class ([]) a dot means just that: a literal dot.

So [^.] means any character except a dot/period.

You understood it wrong. When dot (.) is INSIDE a bracket it means a DOT, not any char. So, [^.] means every character that is not a dot.