I'm trying to wrap text that is between underscores in the <b> tag. This is the pattern I'm currently using (Link to online tester: TESTER):
[^\\]?_(([^_]*)[^\\])_
This is the result I want to get:
_test1_ _test2__test3_ \_test4\_ => <b>test1</b> <b>test2</b><b>test3</b> \_test4\_
Can anyone tell me what's wrong with my pattern?
You may use
(?<!\\)((?:\\{2})*)_([^_\\]*(?:\\.[^_\\]*)*)_
PHP declaration:
$pattern = '~(?<!\\\\)((?:\\\\{2})*)_([^_\\\\]*(?:\\\\.[^_\\\\]*)*)_~';
See the regex demo
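For a quick check against the sample from the question, here is a minimal sketch in JavaScript rather than PHP. It assumes an engine with lookbehind support (ES2018 or later, e.g. current Node.js). The names UNDERSCORE_SPAN and boldUnderscored are made up for illustration, and the replacement string $1<b>$2</b> is my assumption for this pattern: it puts back any escaped backslashes captured in group 1 before the tag.

```javascript
// Same pattern as above, written as a JavaScript regex literal (hypothetical name).
const UNDERSCORE_SPAN = /(?<!\\)((?:\\{2})*)_([^_\\]*(?:\\.[^_\\]*)*)_/g;

// Wraps every unescaped _..._ span in <b>...</b>.
// Group 1 holds any pairs of backslashes found before the opening underscore,
// group 2 holds the text between the underscores.
function boldUnderscored(text) {
  return text.replace(UNDERSCORE_SPAN, "$1<b>$2</b>");
}

// String.raw keeps the backslashes exactly as typed.
const input = String.raw`_test1_ _test2__test3_ \_test4\_`;
console.log(boldUnderscored(input));
// <b>test1</b> <b>test2</b><b>test3</b> \_test4\_
```

The escaped \_test4\_ is left alone because each of its underscores is preceded by a single backslash.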
Details:
- (?<!\\)((?:\\{2})*)_ - matches an unescaped _: any number of double \ symbols (see (?:\\{2})*, 0+ sequences of two consecutive \ symbols) that are not preceded by a \ (the (?<!\\) negative lookbehind performs this check)
- ([^_\\]*(?:\\.[^_\\]*)*)_ - matches any number of symbols other than _ or any number of escaped symbols, thus only matching up to the first unescaped _:
  - [^_\\]* - matches 0+ chars other than \ and _
  - (?:\\.[^_\\]*)* - 0+ sequences of:
    - \\. - any escaped char (if you use the s DOTALL modifier, even a line break char)
    - [^_\\]* - 0+ chars other than \ and _
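To see what each part described above actually captures, the following sketch lists the matches with their groups. It is written in JavaScript and assumes an engine with lookbehind support (ES2018 or later); inspectMatches and the field names are hypothetical, chosen only for this illustration.

```javascript
const pattern = /(?<!\\)((?:\\{2})*)_([^_\\]*(?:\\.[^_\\]*)*)_/g;

// Returns one record per match: where it starts, the pairs of backslashes
// captured before the opening underscore (group 1), and the text between
// the underscores (group 2).
function inspectMatches(text) {
  return Array.from(text.matchAll(pattern), (m) => ({
    index: m.index,
    backslashes: m[1],
    content: m[2],
  }));
}

console.log(inspectMatches(String.raw`\\_a_ and _b\_c_`));
// First match: backslashes is two backslashes, content is a
// Second match: backslashes is empty, content is b\_c
// (the escaped underscore stays inside the span)

console.log(inspectMatches(String.raw`\_a_`).length);
// 0: the first underscore is escaped, and the second has no closing partner
```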
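To use the same approach in regex engines that do not support a lookbehind (for example, JavaScript engines that predate ES2018), use a (^|[^\\]) group instead of (?<!\\):
(^|[^\\])((?:\\{2})*)_([^_\\]*(?:\\.[^_\\]*)*)_
And replace with $1$2<b>$3</b>. Note that this group consumes the character before the opening _, so a span that starts immediately after the previous match (like _test3_ in _test2__test3_) is not matched. See this regex demo.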
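A minimal sketch of the lookbehind-free variant in JavaScript, with hypothetical names NO_LOOKBEHIND and boldUnderscoredCompat. The second call shows one thing worth checking before relying on it: since the leading group has to consume a character, back-to-back spans are not both replaced.

```javascript
// Variant without lookbehind: group 1 is the start of the string or one
// non-backslash character, group 2 is any pairs of backslashes, group 3 is
// the text between the underscores.
const NO_LOOKBEHIND = /(^|[^\\])((?:\\{2})*)_([^_\\]*(?:\\.[^_\\]*)*)_/g;

function boldUnderscoredCompat(text) {
  // $1 and $2 restore the characters consumed before the opening underscore.
  return text.replace(NO_LOOKBEHIND, "$1$2<b>$3</b>");
}

console.log(boldUnderscoredCompat(String.raw`_test1_ _test2_ \_test4\_`));
// <b>test1</b> <b>test2</b> \_test4\_

console.log(boldUnderscoredCompat("_test2__test3_"));
// <b>test2</b>_test3_
```

If adjacent spans matter, the lookbehind version is the safer choice wherever the engine supports it.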