Error when filtering a URL

QUESTION EDITED COMPLETELY

Hello,

I'm using this code to validate a URL:

$url = preg_replace("/[^A-Za-z0-9-\/\.\:]/", "", trim($url)); // clean invalid chars and space
$url = preg_replace('%^(?!https?://).*%', 'http://$0', $url); // add HTTP:// , if there isn't
if (FALSE === strpos($url, '://www.')) // if there isn't WWW
{
    $url = str_replace('://', '://www.', $url); // add WWW
}

But there is a problem. If $url has a subdomain (like http://blog.example.com), this code still adds www (http://www.blog.example.com).

How can I fix it? If there is a subdomain, it should not add www.

I think substr is actually supposed to be strpos?

I doubt this code ever worked. Since you're not checking for identity (===), the condition is always true, so www. is always prepended. This should work, however:

if (FALSE === strpos($url, '://www.'))
   $url = str_replace('://', '://www.', $url);

There's no need to replace using expensive regular expressions in this case, so you should use str_replace.
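
To illustrate why the identity check matters, here is a small sketch (the sample strings are made up for the demonstration). strpos returns the offset of the match, which can be 0, or false when nothing is found, and a loose comparison cannot tell the two apart:

```php
<?php
// strpos() returns the match offset (possibly 0) or false when there is no match.
$atStart = strpos('www.example.com', 'www.');   // int 0: found at the very start
$missing = strpos('blog.example.com', 'www.');  // false: not found

var_dump($atStart == false);   // loose comparison: 0 is treated like "not found"
var_dump($atStart === false);  // strict comparison: 0 is a real match
var_dump($missing === false);  // strict comparison: genuinely not found
```

With ===, only a real "not found" result triggers the replacement.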


UPDATE: The question has been edited. I suggest the following:

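// Strip "invalid" characters (keep ':' and '/' so the scheme and path survive)
$url = preg_replace('/[^a-z0-9.\-:\/]/i', '', trim($url));

// parse_url() treats a scheme-less "example.com" as a path, so add a scheme first
if (!preg_match('%^https?://%i', $url))
   $url = 'http://' . $url;

// Split URL by scheme, host, path (and possibly more)
$parts = parse_url($url);

// Add www only to the bare domain, so subdomains are left alone
if (!strcmp('example.com', $parts['host']))
   $parts['host'] = 'www.example.com';

// Reconstruct URL (the path is optional, e.g. for "http://example.com")
$path = isset($parts['path']) ? $parts['path'] : '';
$url = sprintf('%s://%s%s', $parts['scheme'], $parts['host'], $path);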

Be aware that parse_url may return a lot more (port, user, pass, query, fragment). You'll need to reconstruct accordingly.
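
If the domain isn't known in advance, the same idea can be generalized. The following is only a sketch: normalize_url is a hypothetical helper name, it assumes PHP 7.1+, and it assumes that a host with exactly one dot (like example.com) is a bare domain, which is wrong for suffixes such as example.co.uk. It also rebuilds the optional parts that parse_url may return:

```php
<?php
// Hypothetical helper (not from the original post): adds a default scheme and
// prepends "www." only when the host has no subdomain.
function normalize_url(string $url): ?string
{
    $url = trim($url);

    // parse_url() treats a scheme-less "example.com" as a path, so add a scheme first.
    if (!preg_match('%^https?://%i', $url)) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null; // no usable host found
    }

    // Assumption: exactly one dot means a bare domain ("example.com").
    // This heuristic fails for multi-part suffixes such as "example.co.uk".
    $host = $parts['host'];
    if (substr_count($host, '.') === 1) {
        $host = 'www.' . $host;
    }

    // Rebuild the URL, including optional components (user/pass are dropped here).
    $result = $parts['scheme'] . '://' . $host;
    if (isset($parts['port']))     { $result .= ':' . $parts['port']; }
    if (isset($parts['path']))     { $result .= $parts['path']; }
    if (isset($parts['query']))    { $result .= '?' . $parts['query']; }
    if (isset($parts['fragment'])) { $result .= '#' . $parts['fragment']; }

    return $result;
}

echo normalize_url('example.com'), "\n";
echo normalize_url('http://blog.example.com/post?id=1'), "\n";
```

Here example.com gets both a scheme and www, while blog.example.com is left untouched because it already has a subdomain.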