Most reliable way to extract strings between two delimiters

I've tried several functions to extract whatever lies between two strings. The delimiters might contain special characters, which I suspect is why none of them worked for me.

My current function:

function between($str, $startTag, $endTag){
    $delimiter = '#';
    $regex = $delimiter . preg_quote($startTag, $delimiter) 
                        . '(.*?)' 
                        . preg_quote($endTag, $delimiter) 
                        . $delimiter 
                        . 's';
    preg_match($regex, $str, $matches);
    return $matches;
}

Example of string:

#{ST@RT}#
Text i want
#{END}#

#{ST@RT}#
Second text i want
#{END}#

How can I improve this function, or what other solution would:

  • Support any kind of character or new lines
  • Extract multiple strings if found

Current behavior: it only returns the first match, and it returns the match together with the surrounding tags, which is unwanted.

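Use the s modifier (PCRE_DOTALL) so that the . character also matches newlines. The m modifier does something different: it only changes where ^ and $ anchor. Your regex already appends s, so that part is fine:

preg_match('/foo.+bar/s', $str);
//                    ^--- this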

Use preg_match_all() to get your multiple strings:

preg_match_all($regex, $str, $matches);
return $matches[1]; // an array of the strings

Edit:

Your current code returns the match plus the surrounding tags because it does return $matches. That array has several elements: index 0 is always the entire text that matched the expression, and indexes 1 and higher are your capture groups. Your expression has only one capture group (the text between the tags), so use return $matches[1] instead of return $matches.
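Putting the pieces together, here is a minimal sketch of the function from the question with both changes applied. The name between_all is made up for illustration, and trimming the captured text is an assumption: without it, each result keeps the newlines that sit between the tags and the text.

```php
<?php
// Hypothetical helper: returns every substring found between the two tags.
function between_all(string $str, string $startTag, string $endTag): array
{
    $delimiter = '#';
    // preg_quote() escapes regex metacharacters and the chosen delimiter,
    // so tags such as "#{ST@RT}#" are matched literally.
    $regex = $delimiter
        . preg_quote($startTag, $delimiter)
        . '(.*?)'                       // lazy capture of the text in between
        . preg_quote($endTag, $delimiter)
        . $delimiter
        . 's';                          // s: let . match newlines too

    preg_match_all($regex, $str, $matches);

    // $matches[1] holds the first capture group of every match.
    // trim() is an assumption here; drop it to keep surrounding whitespace.
    return array_map('trim', $matches[1] ?? []);
}

$text = "#{ST@RT}#\nText i want\n#{END}#\n\n#{ST@RT}#\nSecond text i want\n#{END}#";
print_r(between_all($text, '#{ST@RT}#', '#{END}#'));
```

With the example string from the question, this should give an array containing "Text i want" and "Second text i want", without the tags.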

You can use preg_match_all to extract multiple strings. Beyond that, your code seems simple enough, and simpler is usually faster.
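If you would rather avoid regular expressions altogether, a plain string search also sidesteps the special-character problem, because strpos() treats the tags as literal text. This is only a sketch of that alternative, not something from the question; the name between_all_plain is made up, and I have not measured whether it is actually faster.

```php
<?php
// Hypothetical helper: same idea as preg_match_all, but with strpos()/substr().
function between_all_plain(string $str, string $startTag, string $endTag): array
{
    $results = [];
    // Guard against empty tags, which would make the search meaningless.
    if ($startTag === '' || $endTag === '') {
        return $results;
    }

    $offset = 0;
    while (($start = strpos($str, $startTag, $offset)) !== false) {
        $start += strlen($startTag);            // first character after the start tag
        $end = strpos($str, $endTag, $start);   // next end tag after that point
        if ($end === false) {
            break;                              // unterminated block: stop searching
        }
        $results[] = substr($str, $start, $end - $start);
        $offset = $end + strlen($endTag);       // continue after the end tag
    }

    return $results;
}

print_r(between_all_plain('a[x]b[y]', '[', ']'));
```

Like the lazy (.*?) in the regex version, this pairs each start tag with the nearest following end tag, and it returns the text untrimmed.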