Using preg_replace to add a space after an anchor tag

I want to put a space after an anchor tag so that the next word is separated from it. The problem is that some anchor tags are followed by an &nbsp; entity or by the opening of another HTML tag. In those cases we do not want to add a space, as it would break our records.

I only want to add a space after the anchor if there is no space already and the anchor is directly followed by a word.

Right now I have come up with a regex, but I am not sure it does exactly what I want:

 preg_replace("/\<\/a\>([^\s<&nbsp;])/", '</a> $1', $text, -1, $count);
 print "Number of occurrences in type $type = $count\n";
 $this->count += $count;

I tried counting the occurrences before actually saving the replaced string, but the count is far higher than I would expect, so I doubt it is correct.

Please help me fix this regex.

Scenarios:

<a href="blah.com">Hello</a>World // Here we need to put space between Hello and World

<a href="blah.com">Hello</a>&nbsp;World // Do not touch this

<a href="blah.com">Hello</a><b>World</b> // do not touch this

There could be many other cases that have to be ignored, but specifically we need the first scenario to be handled.

As @trincot pointed out, [^\s<&nbsp;] does not mean "not a space and not a non-breaking space". It is a character class, and whatever is between the brackets stands for a single character only. So it means "a character that is not whitespace, <, &, n, b, s, p or ;".
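To make the character-class point concrete, here is a small sketch that runs the question's pattern against a few made-up inputs. The function name and the sample strings are only illustrations, not part of the original code:

```php
<?php
// Hypothetical helper: applies the pattern from the question unchanged
// (written with single quotes and without the redundant escapes).
function applyOriginalPattern(string $text): string
{
    return preg_replace('/<\/a>([^\s<&nbsp;])/', '</a> $1', $text);
}

// Made-up inputs chosen to show which single characters the class rejects.
$samples = [
    '<a href="blah.com">Hello</a>World',       // 'W' is allowed: a space is added
    '<a href="blah.com">Hello</a>news',        // 'n' is excluded: nothing happens
    '<a href="blah.com">Hello</a>&nbsp;World', // '&' is excluded: nothing happens
];

foreach ($samples as $sample) {
    echo applyOriginalPattern($sample), "\n";
}
```

Note that the second sample is left untouched only because the word happens to start with one of the letters n, b, s or p, which is not what the pattern was meant to express.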

You need to check whether the very next character is a word character, \w, which denotes [a-zA-Z0-9_], and then insert a space at the zero-width position found by a positive lookahead:

 preg_replace("~</a>\K(?=\w)~", ' ', $text, -1, $count);
 echo "Number of occurrences in type $type is $count\n";

What does this RegEx mean?

</a>    # Match closing anchor tag
\K      # Reset match
(?=\w)  # Look if next character is a word character
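
As a quick check, the pattern can be run against the three scenarios from the question. This is only a sketch; the function name is a hypothetical wrapper, not something from the original code:

```php
<?php
// Hypothetical wrapper around the \K + lookahead pattern shown above.
// Returns the new text and the number of replacements made.
function addSpaceAfterAnchor(string $text): array
{
    $result = preg_replace('~</a>\K(?=\w)~', ' ', $text, -1, $count);
    return [$result, $count];
}

// The three scenarios listed in the question.
$scenarios = [
    '<a href="blah.com">Hello</a>World',
    '<a href="blah.com">Hello</a>&nbsp;World',
    '<a href="blah.com">Hello</a><b>World</b>',
];

foreach ($scenarios as $text) {
    [$result, $count] = addSpaceAfterAnchor($text);
    echo "$count replacement(s): $result\n";
}
```

Only the first scenario should report a replacement, since in the other two the character after </a> is & or <, neither of which is a word character.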

Update: Another solution to cover all HTML-problematic cases:

preg_replace("~</a>\K(?!&nbsp;)~", '&nbsp;', $text, -1, $count);

This adds a non-breaking space whenever the closing anchor tag is not already followed by one.
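
For comparison, here is a sketch of how this variant behaves on the question's three scenarios. The helper name is hypothetical. Be aware that, unlike the lookahead on \w, it also inserts an entity in front of a following tag, which the question listed as a case to leave alone:

```php
<?php
// Hypothetical helper name; the pattern is the one from the update above.
function ensureNbspAfterAnchor(string $text): string
{
    return preg_replace('~</a>\K(?!&nbsp;)~', '&nbsp;', $text);
}

// Scenario 1: an entity is inserted between the link and the word.
echo ensureNbspAfterAnchor('<a href="blah.com">Hello</a>World'), "\n";

// Scenario 2: already followed by &nbsp;, so nothing changes.
echo ensureNbspAfterAnchor('<a href="blah.com">Hello</a>&nbsp;World'), "\n";

// Scenario 3: an entity is inserted before <b> as well.
echo ensureNbspAfterAnchor('<a href="blah.com">Hello</a><b>World</b>'), "\n";
```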

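You can use the regex /(?<=<\/a>)(\w+)/. (Drop the g flag in PHP: preg_replace already replaces every match, and g is not a valid modifier there.)

Meaning: find a word preceded by a closing anchor tag and replace it with a space followed by the first capture group ($1).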
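
A minimal PHP sketch of that replacement, assuming a hypothetical helper name and using ' $1' as the replacement string:

```php
<?php
// Hypothetical helper: the lookbehind asserts that </a> comes immediately
// before the word, so only the word itself is matched and captured.
function spaceViaLookbehind(string $text): string
{
    return preg_replace('/(?<=<\/a>)(\w+)/', ' $1', $text);
}

// A space is added in front of the word that follows the link.
echo spaceViaLookbehind('<a href="blah.com">Hello</a>World'), "\n";

// No word directly follows </a> here, so the text is left unchanged.
echo spaceViaLookbehind('<a href="blah.com">Hello</a>&nbsp;World'), "\n";
```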

Demo and Meaning of each construct used

As you will probably find out, the regex solution will sooner or later prove insufficient. For example, it will not detect that in this HTML snippet the two words are displayed without white space between them:

<a>test</a><span>hello</span>

There are numerous other cases where a regex solution would have a hard time detecting adjacent words like that, as the rendering of HTML is not as straightforward as it may seem.

Although you have already accepted a solution, here is one that uses PHP's DOMDocument interface to detect where a link's text would stick to the text that follows it, even when the two are far apart in the DOM node hierarchy:

function separateAnchors($html) {
    // Define a character sequence that 
    // will certainly not occur in your document,
    // and is interpreted as literal in regular expressions:
    $magicChar = "²³²"; 
    $doc = new DOMDocument();
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($doc);
    $anchors = $xpath->query("//a");
    foreach (array_reverse(iterator_to_array($anchors)) as $anchor) {
        $parent = $anchor->parentNode;
        $origAnchor = $anchor->cloneNode(true);
        // temporarily put the special text in the anchor
        $anchor->textContent = $magicChar;
        // and then take the document's text content
        $txt = $doc->textContent;
        // If that contains the special text with a non-space following it:
        if (preg_match("/{$magicChar}\S/u", $txt)) {
            // ... then add a single space node after it, after
            // any closing parent nodes
            $elem = $anchor;
            while (!$elem->nextSibling) $elem = $elem->parentNode;
            $elem->parentNode->insertBefore($doc->createTextNode(" "), 
                                            $elem->nextSibling);
        }
        // Put original anchor back in place
        $parent->replaceChild($origAnchor, $anchor);
    }
    return $doc->saveHTML();
}

// sample data
$html = "<p><a>first link</a>&nbsp;<a>second link</a>this word is too close</p>

         <table><tr><td><a>table cell</a></td></tr></table><span>end</span>

         <span><a>link</a></span><span><a>too close</a></span>";

// inject spaces
$html = separateAnchors($html);

// Show result
echo $html;

See it run on ideone.com