I'm trying to explode a string by vertical bars. That's the easy part. However, I DON'T want the split to affect substrings that are surrounded by parentheses. That means I need a string such as:
Hello (sir|maam).|Hi there!
to explode into:
Array
(
[0] => Hello (sir|maam).
[1] => Hi there!
)
By using the normal explode function, I don't believe there is a way to tell it to ignore that bar surrounded by the parentheses. However, I have some ideas.
I know that it would be possible to do this by exploding the string normally, and then looping through the array and merging everything from a string that contains '(' up to the closing string that contains ')'. However, I have a feeling that there should be a more elegant way of achieving this.
Am I right? Is there a less code-intensive means of splitting a string into an array given these restrictions?
If you can guarantee the parentheses will be balanced and never nested (that is, if there will never be a 'Oops(!' or a '(nested stuff (like this)|oops)'), and there will never be a '||' outside of parentheses that you care to match as an empty string, then this ought to help:
preg_match_all('/(?:[^(|]|\([^)]*\))+/', $your_string, $matches);
$parts = $matches[0];
It'll match, as many times as possible (but at least once), either a single character that is neither '|' nor '(', or a '(' and ')' pair enclosing anything that's not a ')' (which includes '|'). Short version: it'll make a '|' between parentheses part of the match, rather than a separator.
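As a quick check, here is a minimal sketch that wraps that pattern in a helper and runs it on the string from the question. The name split_by_match is made up for illustration; only the pattern comes from the answer, and the same balanced, non-nested parentheses are assumed.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function split_by_match(string $s): array
{
    // Each match is a run of "ordinary character" or "( ... )" groups.
    preg_match_all('/(?:[^(|]|\([^)]*\))+/', $s, $matches);
    return $matches[0];
}

// Two elements: "Hello (sir|maam)." and "Hi there!"
print_r(split_by_match('Hello (sir|maam).|Hi there!'));

// Empty pieces are dropped: this gives "a" and "b", not "a", "", "b".
print_r(split_by_match('a||b'));
```

Because the pattern needs at least one character per match, an empty field between two bars simply produces no match, which is the limitation mentioned above.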
Another possibility that is slightly less cryptic:
$parts = preg_split('/\|(?![^(]*\))/', $your_string);
This uses a negative lookahead assertion to disqualify any '|' that's followed by a ')' with no '(' in between. It is still a bit unforgiving about parens, but it will return empty strings between two '|'s.
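To make the behaviour concrete, the sketch below wraps the preg_split call in a helper (split_by_lookahead is a hypothetical name) and tries three inputs, including a nested one chosen to show where the pattern gives up.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function split_by_lookahead(string $s): array
{
    // Split on any '|' that is NOT followed by ')' without a '(' in between.
    return preg_split('/\|(?![^(]*\))/', $s);
}

// Two elements: "Hello (sir|maam)." and "Hi there!"
print_r(split_by_lookahead('Hello (sir|maam).|Hi there!'));

// Empty fields survive: "a", "", "(b|c)"
print_r(split_by_lookahead('a||(b|c)'));

// Nested parentheses break it: the first bar is wrongly treated as a
// separator, giving "(a", "(b) c)", "d".
print_r(split_by_lookahead('(a|(b) c)|d'));
```

The last call is the "unforgiving" case: the lookahead only checks for the next ')' with no '(' before it, so it cannot track nesting depth.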
Until someone writes a regex-based solution, which I doubt is possible in a single pass, this should work. It is a straightforward translation of the requirements into code.
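<?php
// Split $str on '|', ignoring any '|' that sits inside parentheses.
function my_explode($str)
{
    $ret = array();
    $in_parenths = 0; // current parenthesis nesting depth
    $pos = 0;         // start offset of the piece being collected
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $c = $str[$i];
        if ($c == '|' && !$in_parenths) {
            // Top-level separator: emit the piece collected so far.
            $ret[] = substr($str, $pos, $i - $pos);
            $pos = $i + 1;
        } elseif ($c == '(') {
            $in_parenths++;
        } elseif ($c == ')') {
            $in_parenths--;
        }
    }
    // Always emit the remainder, so a string with no '|' comes back as a
    // one-element array, just as explode() would return it.
    $ret[] = substr($str, $pos);
    return $ret;
}

$str = "My|Hello (sir|maam).|Hi there!";
var_dump(my_explode($str));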