Regex closing tag replacement problem

We have a regex that wraps keywords in a <strong> tag unless they already sit inside one of a fixed set of tags. This has always worked nicely...

foreach ($keywords as $keyword) {
    $str = preg_replace("/(?!(?:[^<]+>|[^>]+(<\/strong>|<\/a>|<\/b>|<\/i>|<\/u>|<\/em>)))\b(" . preg_quote($keyword, "/") . ")\b/is", "<strong>\\2</strong>", $str, 1);
}

So if the keyword was test this would change:

A test line

to:

A <strong>test</strong> line

but this would not change:

<a href="">A test line</a>

As you can see the list of closing tags we want it to ignore is in the regex.

We have encountered a problem with a string that looks like:

<a href="">A test <em>line</em></a>

It's not recognising the closing </a> or </em> for that matter, so it's coming out as...

<a href="">A <strong>test</strong> <em>line</em></a>

Which we don't want. Can anyone see a fix for this? (And yes, I am aware of the "don't parse HTML with regex" rule, so posting links to that infamous post is not an answer ;-))

The following regex tries to match the keyword test when it is not enclosed by any of the a, b, i, u, em or strong tags.

Regex

/^.*?(?!<(a|b|i|u|em|strong).*?>.*?)\btest\b(?!.*?<\/\1>)/i

Test

A test line                          => MATCH
<a href="">A test line</a>           => NO MATCH
<a href="">A test <em>line</em></a>  => NO MATCH

Discussion

^.*?(?!<(a|b|i|u|em|strong).*?>.*?)   => The keyword `test' must not be preceded by
                                         any of the listed tags followed by any characters
\btest\b                              => The keyword we want to match
(?!.*?<\/\1>)                         => The keyword `test' must not be followed by
                                         the closing tag of the tag opened previously

Tip

You can extend the regex to multiple keywords (kw1, kw2, kw3 in the example below) like this:

/^.*?(?!<(a|b|i|u|em|strong).*?>.*?)\b(?:kw1|kw2|kw3)\b(?!.*?<\/\1>)/i
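
Since the question keeps its keywords in an array, the alternation can be generated rather than typed by hand. This is only a sketch: buildKeywordPattern is a hypothetical helper name, not something from the question, and it only builds the pattern string shown above. It does not make the pattern any more reliable.

```php
<?php
// Hypothetical helper: builds the multi-keyword pattern from an array of keywords.
function buildKeywordPattern(array $keywords): string
{
    // Escape regex metacharacters (and the "/" delimiter) in every keyword.
    $quoted = array_map(function ($keyword) {
        return preg_quote($keyword, '/');
    }, $keywords);

    // Same pattern as above, with the alternation generated from the array.
    return '/^.*?(?!<(a|b|i|u|em|strong).*?>.*?)\b(?:'
        . implode('|', $quoted)
        . ')\b(?!.*?<\/\1>)/i';
}
```

Note that a single alternation finds whichever keyword comes first, whereas the loop in the question replaces each keyword once, so the two are not drop-in equivalents.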

Warning

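This regex is reported to work on the test cases above, but it is not reliable in general. In particular, in PCRE a group captured inside a negative lookahead is not set once the lookahead has succeeded, so the \1 in the second lookahead cannot refer to a previously opened tag. Verify the pattern against your own input before relying on it.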
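
If a single lookahead-based pattern turns out to be too fragile, one alternative (a different technique from the regex above, still regex-based rather than a real HTML parser) is to split the string into tags and text, keep a nesting depth for the ignored tags, and only replace in text at depth zero. The sketch below assumes reasonably well-formed markup with no ">" inside attribute values; highlightKeywords is a hypothetical helper name.

```php
<?php
// Hypothetical helper (not from the original post): wraps the first
// occurrence of each keyword in <strong>, skipping any text nested at
// any depth inside the ignored tags.
function highlightKeywords(string $str, array $keywords): string
{
    // Matches an opening or closing tag from the ignore list;
    // group 1 is "/" for closing tags and "" for opening tags.
    $skipTag = '/^<(\/?)(?:a|b|i|u|em|strong)\b[^>]*>$/i';

    foreach ($keywords as $keyword) {
        $pattern = '/\b(' . preg_quote($keyword, '/') . ')\b/i';

        // Re-split for every keyword so <strong> tags added earlier count as tags.
        // Even indexes hold text, odd indexes hold the captured tags.
        $tokens = preg_split('/(<[^>]*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
        $depth = 0;

        foreach ($tokens as $i => $token) {
            if ($i % 2 === 1) {
                // Tag token: adjust the depth if it is one of the ignored tags.
                if (preg_match($skipTag, $token, $m)) {
                    $depth = ($m[1] === '/') ? max(0, $depth - 1) : $depth + 1;
                }
                continue;
            }
            if ($depth > 0) {
                continue; // text inside an ignored tag, leave untouched
            }
            $tokens[$i] = preg_replace($pattern, '<strong>$1</strong>', $token, 1, $count);
            if ($count > 0) {
                break; // one replacement per keyword, as in the original code
            }
        }
        $str = implode('', $tokens);
    }
    return $str;
}

echo highlightKeywords('<a href="">A test <em>line</em></a>', ['test']), "\n";
echo highlightKeywords('A test line', ['test']), "\n";
```

Because the depth counter only drops back to zero after the outer </a>, the nested <em> no longer hides the enclosing link. Comments, scripts and unbalanced tags are not handled here.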