Following a protocol document, I can receive an array of parameters encoded as a single string, in which the parameters are separated from each other by a blank space. Spaces within a parameter are escaped with backslashes.
So, let's say we have the following strings within a parameter array:
array('Eli is beautiful', 'Fran is ugly', 'Oso is nice')
These will be encoded in just one string as follows:
Eli\ is\ beautiful Fran\ is\ ugly Oso\ is\ nice
Encoding is not a major problem, but I am facing problems with decoding.
I am trying to split the parameters with a regular expression that should split on spaces that do not follow a backslash. This is my code:
$params = preg_split('/[^\\\\]\s/', $str);
It splits the params as expected, but it also removes the last character of every parameter except the final one. This is the output of var_dump:
array(3) {
[0]=>
string(15) "Eli is beautifu"
[1]=>
string(11) "Fran is ugl"
[2]=>
string(11) "Oso is nice"
}
Does anyone know how to solve this?
TIA,
Simply use negative lookbehind:
$params = preg_split('/(?<!\\\\) /', $str);
The regex above matches every space that is not preceded by a backslash, which is exactly what you intend.
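For completeness, here is a minimal sketch of the whole decoding step: split with the lookbehind, then turn the escaped spaces back into plain ones. The helper name decodeParams is hypothetical, and the sketch assumes the protocol escapes only spaces (it does not handle an escaped backslash, which the question does not mention).

```php
<?php
// Hypothetical helper: decode a space-separated parameter string
// in which spaces inside a parameter are written as "\ ".
function decodeParams($str)
{
    // Split on every space that is NOT preceded by a backslash.
    $parts = preg_split('/(?<!\\\\) /', $str);

    // Replace each escaped space ("\ ") with a plain space.
    // str_replace accepts an array subject and returns an array.
    return str_replace('\\ ', ' ', $parts);
}

$str = 'Eli\\ is\\ beautiful Fran\\ is\\ ugly Oso\\ is\\ nice';
var_dump(decodeParams($str));
// array with "Eli is beautiful", "Fran is ugly", "Oso is nice"
```

Passing the whole array to str_replace avoids an explicit loop; a stricter version would also check whether preg_split returned false.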
Update: Your previous regex eats up letters because it matches the character preceding the space (as long as it's not a backslash); therefore that character is considered part of the delimiter and removed from the output along with the space.
The lookbehind version asserts that no backslash precedes the space, but does not match the character -- an important difference.
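To see the difference concretely, the following sketch uses preg_match_all to list what each pattern treats as a delimiter in the sample string from the question. The variable names are only illustrative.

```php
<?php
$str = 'Eli\\ is\\ beautiful Fran\\ is\\ ugly Oso\\ is\\ nice';

// Original pattern: each match includes the character before the space.
preg_match_all('/[^\\\\]\s/', $str, $consuming);

// Lookbehind pattern: each match is the space alone.
preg_match_all('/(?<!\\\\) /', $str, $lookbehind);

var_dump($consuming[0]);   // "l " and "y " (a letter plus the space)
var_dump($lookbehind[0]);  // " " and " "  (only the space)
```

Since preg_split discards whatever the pattern matches, the first pattern throws away the "l" and the "y" together with the spaces, while the second discards nothing but the spaces.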