PHP URL validation false positives

For some odd reason, my if statement that checks URLs using FILTER_VALIDATE_URL is returning unexpected results.

Simple stuff like https://www.google.nl/ is being blocked, but www.google.nl/ isn't? It's not like it blocks every single URL with http or https in front of it either. Some are allowed and others are not. I know there are a bunch of topics on this, but most of them use regex to filter URLs. Is this better than using FILTER_VALIDATE_URL? Or am I doing something wrong?

The code I use to check the URLs is this:

if (!filter_var($linkinput, FILTER_VALIDATE_URL) === FALSE) {
    //error code
}

You should sanitize it like this first (just for good measure).

$url = filter_var($url, FILTER_SANITIZE_URL);
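As a minimal sketch of sanitizing and then validating: note that in the condition from the question, `!` binds more tightly than `===`, so `!filter_var(...) === FALSE` is true exactly when the URL is valid, which would explain why https://www.google.nl/ lands in the error branch while www.google.nl/ does not. The helper name `isValidUrl` is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper; the name is an assumption, not from the original post.
function isValidUrl(string $input): bool
{
    // Remove characters that are not allowed in a URL.
    $url = filter_var($input, FILTER_SANITIZE_URL);

    // filter_var() returns the URL string on success and false on failure,
    // so compare the result itself against false (no leading "!").
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(isValidUrl('https://www.google.nl/')); // valid: has a scheme and a host
var_dump(isValidUrl('www.google.nl/'));         // invalid: no scheme
```

With this shape, the error branch would be `if (!isValidUrl($linkinput)) { /* error code */ }`.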

FILTER_VALIDATE_URL only accepts ASCII URLs (i.e., non-ASCII characters need to be encoded first). If the above function does not work, see PHP's urlencode() to encode the URL.

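If THAT doesn't work, then you can manually strip the http:// prefix from the beginning like this. Note that the stripped string no longer has a scheme, so it will not pass FILTER_VALIDATE_URL on its own.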

$url = strpos($url, 'http://') === 0 ? substr($url, 7) : $url;

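Here are some flags that might help. If all of your URLs will have http://, you can use FILTER_FLAG_SCHEME_REQUIRED. (That flag and FILTER_FLAG_HOST_REQUIRED are deprecated as of PHP 7.3 and removed in PHP 8.0, because the filter always requires a scheme and host.)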

The FILTER_VALIDATE_URL filter validates a URL.

Possible flags:

  • FILTER_FLAG_SCHEME_REQUIRED - URL must be RFC compliant (like http://example)
  • FILTER_FLAG_HOST_REQUIRED - URL must include host name (like http://www.example.com)
  • FILTER_FLAG_PATH_REQUIRED - URL must have a path after the domain name (like http://www.example.com/example1/)
  • FILTER_FLAG_QUERY_REQUIRED - URL must have a query string (like http://www.example.com/example.php?name=Peter&age=37)
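A short sketch of how these flags can be passed as the third argument to filter_var(). The helper name `passesWithFlag` and the example URLs are assumptions made for illustration.

```php
<?php
// Hypothetical helper: validates $url with one extra FILTER_FLAG_* constant.
function passesWithFlag(string $url, int $flag): bool
{
    // The third argument of filter_var() accepts a bitmask of flags.
    return filter_var($url, FILTER_VALIDATE_URL, $flag) !== false;
}

// Has a path, so the path flag is satisfied.
var_dump(passesWithFlag('http://www.example.com/example1/', FILTER_FLAG_PATH_REQUIRED));
// No path after the host, so the path flag rejects it.
var_dump(passesWithFlag('http://www.example.com', FILTER_FLAG_PATH_REQUIRED));
// No query string, so the query flag rejects it.
var_dump(passesWithFlag('http://www.example.com/example1/', FILTER_FLAG_QUERY_REQUIRED));
```

Several flags can be combined with `|`, for example `FILTER_FLAG_PATH_REQUIRED | FILTER_FLAG_QUERY_REQUIRED`.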

The default behavior of FILTER_VALIDATE_URL

  • Validates value as URL (according to http://www.faqs.org/rfcs/rfc2396), optionally with required components.

  • Beware a valid URL may not specify the HTTP protocol http:// so further validation may be required to determine the URL uses an expected protocol, e.g. ssh:// or mailto:.

  • Note that the function will only find ASCII URLs to be valid; internationalized domain names (containing non-ASCII characters) will fail.
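Because the filter accepts any well-formed scheme, one possible follow-up check is to restrict the scheme after validation. This is only a sketch: the helper name `isHttpUrl` and the allowed list of http and https are assumptions, not something the original post specifies.

```php
<?php
// Hypothetical helper: accepts only syntactically valid http(s) URLs.
function isHttpUrl(string $url): bool
{
    // First make sure the string is a valid URL at all.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Then check the scheme explicitly; parse_url() returns it as a string.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return in_array(strtolower((string) $scheme), ['http', 'https'], true);
}

var_dump(isHttpUrl('https://www.google.nl/'));     // http(s) scheme and valid
var_dump(isHttpUrl('ssh://example.com'));          // rejected: scheme not in the list
var_dump(isHttpUrl('mailto:someone@example.com')); // rejected: scheme not in the list
```

Internationalized domain names would still need to be converted to their ASCII form before being passed to a check like this.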