For example I have a string like this:
first:second:third"test:test":fourth
I want to count the ':' characters and later split on every ':' to get the substrings.
This is my regex:
/(.*):(.*)/iU
I don't know if this is the best solution, but it works. There is a difference between a ':' and a "[...] : [...]", so I need to separate them. I realized that my regex counts the : but also continues when the : is between quotes (").
I tried to solve this with this regex:
/(((.*)[^"]):((.*)[^"]))/iU
I thought this was the right way, but it isn't. I tried to learn the regex syntax, but I don't understand this problem.
As I understand it, this regex means: search for ':', where anything can be in front of and after it EXCEPT when a " is in front of it AND a " is after it.
Maybe you can help me.
Edit: I use my regex in PHP, in case that is important information.
How about using
$result = preg_split(
    '/:           # Match a colon
    (?=           # only if followed by
     (?:          # the following group:
      [^"]*"      # Any number of characters except ", followed by one "
      [^"]*"      # twice in a row (to ensure even number of "s)
     )*           # (repeated zero or more times)
     [^"]*        # followed by any number of non-quotes until...
     $            # the end of the string.
    )             # End of lookahead assertion
    /x',
    $subject);
which will give you the result
first
second
third"test:test"
fourth
directly?
This regex splits on a : only if it's followed by an even number of quotes. This means that it won't split on a : inside a quoted string.
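Since the question also asks for a count, the same lookahead can be reused with preg_match_all(), which returns the number of matches. This is only a minimal sketch, assuming the quotes are balanced and that only double quotes matter; the variable names are just for illustration.

```php
<?php
// Assumption: the input contains balanced double quotes only.
$subject = 'first:second:third"test:test":fourth';

// Same idea as the commented pattern, written on one line: a colon
// followed by an even number of double quotes up to the end of the string.
$pattern = '/:(?=(?:[^"]*"[^"]*")*[^"]*$)/';

// preg_match_all() returns how many times the pattern matched.
$count = preg_match_all($pattern, $subject);

// preg_split() with the same pattern yields the pieces.
$parts = preg_split($pattern, $subject);

echo $count, "\n";   // 3
print_r($parts);     // first, second, third"test:test", fourth
```

Keeping the pattern in one variable means the count and the split cannot drift apart.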
I love parsing text, so I wrote a parser for you.
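$sample = 'first:second:third"test:test":fourth';
$len = strlen($sample);
$c = 0;            // number of quote characters seen so far
$buffer = "";      // current field being built
$output = array();
$instr = false;    // true while inside a quoted string
for ($i = 0; $i < $len; $i++) {
    if ($sample[$i] == '"' || $sample[$i] == "'") {
        // An odd quote count means we are inside a string.
        $c++;
        $instr = ($c % 2 != 0);
        $buffer .= $sample[$i];
    } elseif (!$instr && $sample[$i] == ':') {
        // A colon outside quotes ends the current field.
        $output[] = $buffer;
        $buffer = "";
    } else {
        $buffer .= $sample[$i];
    }
}
// Compare against "" so that a final field of "0" is not dropped.
if ($buffer !== "") $output[] = $buffer;
print_r($output);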
See the code in action. Also note that regular expressions can perform poorly on very large strings.
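One thing to be aware of: the loop above treats ' and " as the same kind of quote, so an apostrophe inside a double-quoted part would flip the state. A possible variant, sketched here as a hypothetical function splitOutsideQuotes() (the name and the $delimiter parameter are my own, not from the code above), remembers which quote character opened the string:

```php
<?php
// Hypothetical helper: split on $delimiter, ignoring delimiters that sit
// inside a section opened by " or ' and closed by the same character.
function splitOutsideQuotes(string $input, string $delimiter = ':'): array
{
    $parts = [];
    $buffer = '';
    $openQuote = null; // quote character that opened the current section, or null

    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($openQuote === null && ($char === '"' || $char === "'")) {
            $openQuote = $char;      // entering a quoted section
            $buffer .= $char;
        } elseif ($char === $openQuote) {
            $openQuote = null;       // matching quote closes the section
            $buffer .= $char;
        } elseif ($openQuote === null && $char === $delimiter) {
            $parts[] = $buffer;      // delimiter outside quotes ends the field
            $buffer = '';
        } else {
            $buffer .= $char;
        }
    }
    // Like the loop above, an empty trailing field is not appended.
    if ($buffer !== '') {
        $parts[] = $buffer;
    }
    return $parts;
}

print_r(splitOutsideQuotes('first:second:third"test:test":fourth'));
print_r(splitOutsideQuotes('a:"it\'s:here":b'));
```

The first call should give the same four pieces as before; the second keeps "it's:here" together because the apostrophe does not close the double-quoted section.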
This regex should do it. If it matches your needs and you want additional explanation, just ask :)
(?<=:|^)(?<!"[^:][^"]+:)\w+?(?=:|"|$)
That's the test string I used
"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth
And these are the six resulting matches:
first
second
third
fourth
fifth
sixth
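A note for PHP users: PCRE generally requires lookbehinds to have a bounded length, so a lookbehind containing [^"]+ may not compile there as written. One way to get the same six matches in PHP, sketched under the assumption that quoted sections are balanced and should be ignored entirely, is to skip them with the PCRE verbs (*SKIP)(*FAIL). This is a different technique from the pattern above, not a translation of it:

```php
<?php
// Test string from the answer above.
$subject = '"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth';

// First alternative: consume a whole "..." section, then force a failure
// and skip past it. Second alternative: match a run of word characters.
$pattern = '/"[^"]*"(*SKIP)(*FAIL)|\w+/';

preg_match_all($pattern, $subject, $matches);

// $matches[0] holds the full matches, in order of appearance.
print_r($matches[0]);
```

With the test string this should print first, second, third, fourth, fifth and sixth.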