I'm looking to create a PHP Regex script that can match and replace words within a string.
The regex needs to match only complete words, which I can easily accomplish with:
/\b(SEARCH_TERM)\b/
The problem I am having is that some of the strings contain HTML elements such as <a> and <img> tags, where the href and src attributes may sometimes contain the to-be-replaced word within their path. If the word is replaced inside these attributes, the link or image will no longer work.
For example, replacing the word 'test' with 'SEARCH_TERM' in the following string:
my test string <a href="http://www.google.com?q=my+test+string">link</a>
Would return:
my SEARCH_TERM string <a href="http://www.google.com?q=my+SEARCH_TERM+string">link</a>
Whereas I need it to ignore the href attribute text and return:
my SEARCH_TERM string <a href="http://www.google.com?q=my+test+string">link</a>
I've looked at using regex lookbehind assertions (as shown just below), but variable-length patterns such as .* are not allowed inside a lookbehind.
/(?<!(href|src)=.*)\b(SEARCH_TERM)\b/
Note: I specifically need to do this with Regex, and not a DOM parser.
As I mentioned, you should use an HTML parser for this.
But if you really want a regex, try:
/\btest\b(?=[^>]*(<|$))/s
The above regex matches only if there is a < or the end of the string (not the end of a line) somewhere ahead, with no > in between.
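As a rough sketch of how this could be wired into preg_replace, something like the following should work. The function name replace_outside_tags is made up for illustration and is not from any library, and preg_quote is only there so that a search word containing regex metacharacters does not break the pattern.

```php
<?php
// Hypothetical helper (name is an assumption): replace whole-word
// occurrences of $word unless the next angle bracket ahead is a ">",
// which would mean the word sits inside a tag.
function replace_outside_tags(string $word, string $replacement, string $html): string
{
    // preg_quote escapes regex metacharacters; '/' is the pattern delimiter.
    $pattern = '/\b' . preg_quote($word, '/') . '\b(?=[^>]*(<|$))/s';
    return preg_replace($pattern, $replacement, $html);
}

$input = 'my test string <a href="http://www.google.com?q=my+test+string">link</a>';
echo replace_outside_tags('test', 'SEARCH_TERM', $input), "\n";
// prints: my SEARCH_TERM string <a href="http://www.google.com?q=my+test+string">link</a>
```

The "test" inside the href is left alone because the first angle bracket after it is the > that closes the tag.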
NOTE: This will not work if your text itself contains a literal >. For example:
hello>world
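To make that limitation concrete, here is a small sketch using a made-up input string with a bare > in the text. The first "test" is wrongly skipped because the next angle bracket after it is a >, while the second one is followed only by the end of the string and is replaced.

```php
<?php
// Same pattern as above, applied to plain text that contains a literal ">".
$pattern = '/\btest\b(?=[^>]*(<|$))/s';

// Illustrative input only: no tags at all, just a stray ">".
$result = preg_replace($pattern, 'SEARCH_TERM', 'test hello>world test');

echo $result, "\n";
// prints: test hello>world SEARCH_TERM
```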
That is why you should use a parser.