Matching a substring of a substring

Input:

Animal {cow.<span>moo</span>} <span>noises</span>

Output:

Animal {cow.moo} <span>noises</span>

How could I match only the <span> inside the braces for replacement? I have got as far as matching everything between the braces with

(?<=\{)(.*?)(?=\})

Any help would be greatly appreciated.

You can use preg_replace_callback to match each {...} substring with a basic regex like '~{[^}]+}~' and then remove what you need inside the callback function:

$s = 'Animal {cow.<span>moo</span>} <span>noises</span>';
echo preg_replace_callback('~{[^}]+}~', function($m) {
    return str_replace(["<span>", "</span>"], "", $m[0]);
}, $s);
// => Animal {cow.moo} <span>noises</span>

You may use a preg_replace inside the callback function if you need to replace with a regex.
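As a minimal sketch of that regex-based variant: the inner pattern '~</?span\b[^>]*>~' and the class="x" attribute in the sample string are assumptions added for illustration, to show a case that a plain str_replace would miss.

```php
<?php
// Sample input; the class attribute is hypothetical, added only for this sketch.
$s = 'Animal {cow.<span class="x">moo</span>} <span>noises</span>';

// The outer pattern isolates each {...} substring.
$out = preg_replace_callback('~{[^}]+}~', function ($m) {
    // The inner pattern removes opening tags (with any attributes) and closing tags.
    return preg_replace('~</?span\b[^>]*>~', '', $m[0]);
}, $s);

echo $out, PHP_EOL;
// => Animal {cow.moo} <span>noises</span>
```

The span outside the braces is untouched because the inner replacement only ever sees the text matched by the outer pattern.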

If you want to capture the moo inside the span for substitution you can use this regexp:

\{.*?\<span\>(.*)\<\/span\>\}

Here is an example: https://regex101.com/r/ryP3Y6/1
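A rough PHP sketch of using that capture, assuming the ~ delimiter and a made-up replacement word 'oink' (neither comes from the answer above). PREG_OFFSET_CAPTURE returns the position of group 1, so only the captured text is substituted:

```php
<?php
$s = 'Animal {cow.<span>moo</span>} <span>noises</span>';
$pattern = '~\{.*?\<span\>(.*)\<\/span\>\}~';

$replaced = $s;
if (preg_match($pattern, $s, $m, PREG_OFFSET_CAPTURE)) {
    // $m[1] is [captured text, byte offset in $s].
    [$captured, $offset] = $m[1];
    echo $captured, PHP_EOL; // => moo

    // Swap only the captured text; 'oink' is a hypothetical replacement.
    $replaced = substr_replace($s, 'oink', $offset, strlen($captured));
}

echo $replaced, PHP_EOL;
// => Animal {cow.<span>oink</span>} <span>noises</span>
```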

If you want to delete the <span> tags you can use:

(.*\{.*?)(?:\<[\/]?span\>)(.*)(?:<\/span\>)(\}.*)

Then use \1\2\3 as the substitution.

Check: https://regex101.com/r/qECuai/1
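In PHP this could look like the sketch below. The ~ delimiter is an assumption, and '$1$2$3' is the $n form of the same \1\2\3 references, which preg_replace also accepts:

```php
<?php
$s = 'Animal {cow.<span>moo</span>} <span>noises</span>';

// Group 1: text up to the opening tag, group 2: tag content, group 3: the rest.
$pattern = '~(.*\{.*?)(?:\<[\/]?span\>)(.*)(?:<\/span\>)(\}.*)~';

// Rebuild the string from the three captured groups, dropping the tags.
$out = preg_replace($pattern, '$1$2$3', $s);

echo $out, PHP_EOL;
// => Animal {cow.moo} <span>noises</span>
```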

With a \G based pattern:

$str = preg_replace('~(?:\G(?!\A)|{)[^}]*?\K</?span>~', '', $str);

This pattern starts with two possible branches:

  • The second branch is the one that succeeds first: it starts with a { to ensure you are inside curly brackets. Note that you can add (?=[^}]*}) after the { to ensure there is a closing bracket.
  • The first branch starts with \G, which matches at the position where the previous match ended. The (?!\A) lookahead stops it from also matching at the start of the string.

[^}]*? prevents the match from running past the closing curly bracket, and \K discards everything matched so far, so only the tag itself is replaced.

This design ensures that every series of matches starting from a { is contiguous and that the <span> tags found are between curly brackets.
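A short sketch comparing the pattern with and without the optional (?=[^}]*}) check. The variable names and the unclosed test string are made up for illustration:

```php
<?php
$basic   = '~(?:\G(?!\A)|{)[^}]*?\K</?span>~';
// Same pattern with the closing-bracket lookahead added after the {.
$checked = '~(?:\G(?!\A)|{(?=[^}]*}))[^}]*?\K</?span>~';

$s        = 'Animal {cow.<span>moo</span>} <span>noises</span>';
$unclosed = 'a {b<span>c'; // hypothetical input with no closing bracket

$r1 = preg_replace($basic, '', $s);
$r2 = preg_replace($basic, '', $unclosed);
$r3 = preg_replace($checked, '', $unclosed);
$r4 = preg_replace($checked, '', $s);

echo $r1, PHP_EOL; // => Animal {cow.moo} <span>noises</span>
echo $r2, PHP_EOL; // => a {bc
echo $r3, PHP_EOL; // => a {b<span>c
echo $r4, PHP_EOL; // => Animal {cow.moo} <span>noises</span>
```

Without the lookahead, a tag after an unclosed { is still removed. With it, the string is left alone unless a } follows.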


For small strings, and if you are sure the curly brackets are balanced and not nested, you can also use a simpler pattern:

$str = preg_replace('~</?span>(?=[^{}]*})~', '', $str);
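To illustrate why the balanced-brackets condition matters, here is a small sketch. The second input, with a stray }, is a made-up example:

```php
<?php
$pattern = '~</?span>(?=[^{}]*})~';

$balanced = 'Animal {cow.<span>moo</span>} <span>noises</span>';
$stray    = 'a <span>b</span> } c'; // hypothetical input: a } with no matching {

$r1 = preg_replace($pattern, '', $balanced);
$r2 = preg_replace($pattern, '', $stray);

echo $r1, PHP_EOL; // => Animal {cow.moo} <span>noises</span>
echo $r2, PHP_EOL; // => a b } c
```

The lookahead only checks that a } comes before any other bracket, so with unbalanced input it also strips tags that are not inside braces.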