I need to extract the numbers enclosed in curly braces from a complex string. I am parsing some data and need to pull out the IDs that are wrapped in curly braces.
Example string:
{1|{1078} {*|{1079}-}test{1|{4829}, test2 {4457}}} {*|{1078} {*|{1079}-}test3{1|{4829}, test4 {23232}}}
What I need is to extract only the numbers in curly braces that directly follow a pipe (|{4829}, |{1079}, |{1078}) and no other numbers, so my end result would be something like:
4829,1079,1078
or an array of these numbers; it does not matter. The values should be unique, but that is not a problem for me. My problem is creating a regex that extracts just those numbers. I have tried a lot of things today; the latest attempt is this:
public static function getAllAttributeIDsFromTheRule($attributeValues)
{
    preg_match_all('/{(.*?)}/', $attributeValues, $matches);
    preg_match_all('\|{\d*}', $attributeValues, $matches[1]);
    $attributeIDsWithPipe = (implode('', self::clean($matches[1])));
    $attributeIDs = explode('|', $attributeIDsWithPipe);
    var_dump($attributeIDs);
}
public static function clean($string)
{
    $string = str_replace(' ', '-', $string);
    return preg_replace('/[^A-Za-z0-9|\-]/', '', $string);
}
But I always end up with an extra character in the result; in some cases I get an extra number or something like that. Now it is time to ask for help, in case someone knows a better approach. Much appreciated.
You may use a regex to match |{, then 1+ digits, then }, capturing the digits into a capturing group, and then access the values from that group using $matches[1]:
if (preg_match_all('~\|\{(\d+)}~', $s, $matches)) {
    print_r($matches[1]);
}
Full example:
$s = '{1|{1078} {*|{1079}-}test{1|{4829}, test2 {4457}}} {*|{1078} {*|{1079}-}test3{1|{4829}, test4 {23232}}}';
if (preg_match_all('~\|\{(\d+)}~', $s, $matches)) {
    print_r(array_unique($matches[1]));
}
// => Array ( [0] => 1078 [1] => 1079 [2] => 4829 )
NOTE: array_unique will keep unique values only in the results.
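To fit the static method from the question, the same pattern could be wrapped in a small helper. This is only a sketch: the function name extractPipedIds is hypothetical (not from the question), and array_values is added here as an assumption that a zero-based, reindexed array is wanted, since array_unique preserves the original keys.

```php
<?php
/**
 * Hypothetical helper (name is illustrative): returns the unique IDs
 * that appear in the form |{digits} inside $rule.
 */
function extractPipedIds(string $rule): array
{
    // \| literal pipe, \{ literal brace, (\d+) captured digits, } closing brace
    if (!preg_match_all('~\|\{(\d+)}~', $rule, $matches)) {
        return []; // no match (or a regex error): nothing to extract
    }
    // array_unique keeps the original keys; array_values reindexes from 0
    return array_values(array_unique($matches[1]));
}

$s = '{1|{1078} {*|{1079}-}test{1|{4829}, test2 {4457}}} {*|{1078} {*|{1079}-}test3{1|{4829}, test4 {23232}}}';
echo implode(',', extractPipedIds($s)), PHP_EOL; // 1078,1079,4829
```

Note that the captured values are strings; map them through intval if integers are needed.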
You could use this pattern to match the digits after the pipe and the opening curly brace:
\|{\K\d+(?=})
\|{     Match a pipe and {
\K      Forget what was matched
\d+     Match 1+ digits
(?=})   Assert what is on the right is a closing }
That will give you all the values from your example data including duplicates. If you want the result specified in your question you could deduplicate the array with array_unique.
For example:
$re = '/\|{\K\d+(?=})/m';
$str = '{1|{1078} {*|{1079}-}test{1|{4829}, test2 {4457}}} {*|{1078} {*|{1079}-}test3{1|{4829}, test4 {23232}}}
';
preg_match_all($re, $str, $matches);
rsort($matches[0]);
echo implode(',', array_unique($matches[0]));
Result
4829,1079,1078
Another option, which uses a less efficient pattern but avoids calling array_unique afterwards, is to use 2 capturing groups with a backreference to group 1 inside a negative lookahead, so that only the last occurrence of each value is matched:
(\|{(\d+)})(?!.*\1)
$re = '/(\|{(\d+)})(?!.*\1)/';
$str = '{1|{1078} {*|{1079}-}test{1|{4829}, test2 {4457}}} {*|{1078} {*|{1079}-}test3{1|{4829}, test4 {23232}}}
';
preg_match_all($re, $str, $matches);
print_r($matches[2]);
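One caveat with the lookahead variant: by default . does not match a newline, so (?!.*\1) only looks for duplicates later on the same line. If the input can span several lines, adding the s modifier is one way to handle it. The following is a sketch with a made-up two-line input (not the data from the question):

```php
<?php
// Made-up two-line input: |{1078} occurs once on each line.
$str = "{1|{1078} a}\n{*|{1078} {*|{1079}-}b}";

// Without /s, "." stops at the newline, so the first |{1078}
// is not recognised as a duplicate of the one on the next line.
preg_match_all('/(\|{(\d+)})(?!.*\1)/', $str, $perLine);

// With /s, "." also matches newlines, so only the last
// occurrence of each ID in the whole string is kept.
preg_match_all('/(\|{(\d+)})(?!.*\1)/s', $str, $wholeString);

echo implode(',', $perLine[2]), PHP_EOL;     // 1078,1078,1079
echo implode(',', $wholeString[2]), PHP_EOL; // 1078,1079
```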