I am building an online IM service and have been working on the message-posting algorithm for some time now. The next task in this project is to let users post links to other sites simply by typing in the URL.
I have tried two separate methods:
function autolink($string)
{
    $content_array = explode(" ", $string);
    $output = '';
    foreach ($content_array as $content)
    {
        // starts with http://
        if (substr($content, 0, 7) == "http://")
            $content = '<a href="' . $content . '">' . substr($content, 7) . '</a>';
        // starts with www. (show the whole address; there is no scheme to cut off here)
        if (substr($content, 0, 4) == "www.")
            $content = '<a href="http://' . $content . '">' . $content . '</a>';
        $output .= " " . $content;
    }
    $output = trim($output);
    return $output;
}
and
$message = preg_replace('/https?:\/\/[^\s<]+/i', '<a href="\0">\0</a>', $message);
Both of these replace URLs with links; however, neither goes as far as I need. For example, if a URL, say https://stackoverflow.com/, is immediately followed by a comma, the comma is taken into the URL and the link becomes unusable. The same goes for brackets, full stops and other punctuation.
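A minimal reproduction of the comma problem, using the second method on a made-up message (the sample text is only illustrative):

```php
<?php
// Illustrative input: a URL immediately followed by a comma.
$message = 'See https://stackoverflow.com/, then reply.';

// Same pattern as above: [^\s<]+ stops only at whitespace or "<",
// so the trailing comma is swallowed into the match.
$linked = preg_replace('/https?:\/\/[^\s<]+/i', '<a href="\0">\0</a>', $message);

echo $linked, PHP_EOL;
// The generated href ends in "/," instead of "/".
```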
All HTML tags are stripped from the message at the beginning of the algorithm so manually typing in link tags is not an option.
So, in effect, what I need is a simple-to-use function that identifies URLs beginning with either http:// or www. and leaves any trailing punctuation out of the link.
From: http://blogs.lse.ac.uk/clt/2008/04/23/a-regular-expression-to-match-any-url/
$pattern = '|([A-Za-z]{3,9})://([-;:&=+$,\w]+@{1})?([-A-Za-z0-9.]+)+:?(\d+)?((/[-+~%/.\w]+)?\??([-+=&;%@.\w]+)?#?([\w]+)?)?|';
$html = preg_replace($pattern, '<a href="$0">$0</a>', $text);
seems to be what you are looking for.
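If trailing punctuation or bare www. addresses still cause trouble, a callback can post-process each match. This is only a rough sketch, not a complete URL parser: the function name `autolink_trimmed` and the list of characters treated as trailing punctuation are my own assumptions, and it assumes the message has had its tags stripped but has not been HTML-escaped yet.

```php
<?php
// Hypothetical helper (not from the linked article): links anything that
// starts with http://, https:// or www. and keeps trailing punctuation
// outside the generated <a> element.
function autolink_trimmed(string $text): string
{
    return preg_replace_callback(
        // Match greedily up to whitespace, angle brackets or quotes.
        '~\b(?:https?://|www\.)[^\s<>"\']+~i',
        function (array $m): string {
            $url = $m[0];
            // Assumed set of characters that belong to the sentence, not the URL.
            $trimmed = rtrim($url, '.,;:!?)]}');
            // Whatever was trimmed is re-appended after the link.
            $tail = substr($url, strlen($trimmed));
            // www. addresses need a scheme to work as an href.
            $href = stripos($trimmed, 'www.') === 0 ? 'http://' . $trimmed : $trimmed;

            return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
                . htmlspecialchars($trimmed, ENT_QUOTES, 'UTF-8') . '</a>' . $tail;
        },
        $text
    );
}

echo autolink_trimmed('Visit https://stackoverflow.com/, or www.example.com.'), PHP_EOL;
```

One known limitation of this approach: a URL that legitimately ends in a closing bracket (some wiki page names do) loses that bracket, so the trim list may need adjusting for your messages.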