Remove trailing spaces from hyperlinks

I want to prevent stray spaces in hyperlinks on a UGC site. I have written a regular expression that works well, except that it does not remove trailing spaces from the link URL or the anchor text.

Here is my code:

$text = '< a href =   "   http://www.examplesite.com/       "> Example site   </a>';

$text = preg_replace('#(<(\s+)*a(\s+)*href(\s+)*=(\s+)*("|\')(\s+)*([^"]+)("|\')>(\s+)*([^<]+)(\s+)*</a>)#','<a href="$8">$11</a> ',$text);

Output

<a href="http://www.examplesite.com/      ">Example site  </a> 

URLs may also contain spaces themselves, e.g. http://www.examplesite.com/blog/a page with space.html

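Try this. Each lazy (.*?) group is followed by \s*, which absorbs the trailing whitespace so it stays out of the capture, and the URL group still allows spaces inside the URL:

$text = preg_replace('{<\s*a\s+href\s*=\s*(["\'])\s*(.*?)\s*\1\s*>\s*(.*?)\s*</a>}s', '<a href="$2">$3</a>', $text);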
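If getting the pattern itself to drop the trailing whitespace proves fiddly, another option is to capture the URL and anchor text loosely and trim them in a callback. The following is only a sketch: the helper name cleanLinks is made up for illustration, and it assumes href is the only attribute, that the anchor text contains no nested tags, and that the URL contains no double quotes.

```php
<?php
// Hypothetical helper (not from the original post): normalizes simple
// <a href="...">...</a> tags. Assumes href is the only attribute and the
// anchor text contains no nested tags.
function cleanLinks(string $html): string
{
    // Group 1: opening quote, group 2: raw URL, group 3: raw anchor text.
    $pattern = '#<\s*a\s+href\s*=\s*(["\'])(.*?)\1\s*>(.*?)<\s*/\s*a\s*>#is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            $url  = trim($m[2]); // strip leading/trailing whitespace from the URL
            $text = trim($m[3]); // strip leading/trailing whitespace from the anchor text
            return '<a href="' . $url . '">' . $text . '</a>';
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$text = '< a href =   "   http://www.examplesite.com/       "> Example site   </a>';
echo cleanLinks($text), "\n";
```

Because trim() only touches the ends of each capture, spaces inside a URL such as http://www.examplesite.com/blog/a page with space.html are left alone.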

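Try this to collapse runs of spaces into a single space:

function RemoveExtraSpaces($str)
{
    // strpos() returns 0 for a match at the start of the string, so compare against false explicitly.
    while (strpos($str, "  ") !== false)
    {
        // Replace each double space with a single space so adjacent words are not joined.
        $str = str_replace("  ", " ", $str);
    }
    return $str;
}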
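The same idea can probably be done in one pass with a regular expression instead of a loop. This is a minimal sketch under a few assumptions: the name collapseSpaces is hypothetical, it treats tabs and newlines as spaces too, and it is meant to be applied to an extracted URL or anchor text rather than to the whole tag.

```php
<?php
// Hypothetical helper (not from the original post): collapses every run of
// whitespace into one space, then trims both ends of the string.
function collapseSpaces(string $str): string
{
    // preg_replace() returns null on a PCRE error; fall back to the input.
    $collapsed = preg_replace('/\s+/', ' ', $str) ?? $str;
    return trim($collapsed);
}

echo collapseSpaces("  Example   site  "), "\n";
```

Single spaces inside the string are kept, so a URL with spaces in its path keeps its words separated.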

I am no expert in regex, but it seems you need a way to backtrack: you read all the way up to the closing ", but then you have to go back to the last non-space character. I don't know how to do that in a single pattern, so once you have your semi-cleaned string I would either

a) use str_replace, or b) write a second regex.

$str = str_replace(" '>", "'>", $str);
$str = str_replace(" \">", "\">", $str);
$str = str_replace(" </a>", "</a>", $str);

Repeat these replacements until no more can be made. It is primitive, I know, but it should do the job.
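The "repeat until nothing changes" step could be sketched as a loop driven by the optional fourth argument of str_replace(), which receives the number of replacements made. The helper name stripSpacesBeforeClosers and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): repeats the three
// replacements until a full pass makes no change.
function stripSpacesBeforeClosers(string $str): string
{
    $search  = [" '>", " \">", " </a>"];
    $replace = ["'>",  "\">",  "</a>"];

    do {
        // $count is set by str_replace() to the number of replacements in this pass.
        $str = str_replace($search, $replace, $str, $count);
    } while ($count > 0);

    return $str;
}

$semiCleaned = '<a href="http://www.examplesite.com/      ">Example site  </a>';
echo stripSpacesBeforeClosers($semiCleaned), "\n";
```

Each pass removes only one space before each closer, which is why the loop is needed. For option b), a single call along the lines of preg_replace('/\s+(["\']>|<\/a>)/', '$1', $str) should achieve the same result without looping.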