In PHP I have the following string:
$text = "test 1
{blabla:database{test}}
{blabla:testing}
{option:first{A}.Value}{blabla}{option:second{B}.Value}
{option:third{C}.Value}{option:fourth{D}}
{option:fifth}
test 2
";
I need to get all {option...} items out of this string (5 in total here). Some have multiple nested brackets in them, and some don't. Some are on the same line, some are not.
I already found this regex:
(\{(?>[^{}]+|(?1))*\})
so the following works fine:
preg_match_all('/(\{(?>[^{}]+|(?1))*\})/imsx', $text, $matches);
The text that's not inside curly brackets is filtered out, but the matches also include the blabla-items, which I don't need.
Is there any way this regex can be changed to only include the option-items?
I modified your initial expression to search for the literal string 'option:' (captured as a group) followed by any run of non-whitespace characters (\S*), all enclosed in curly braces '{}'.
\{(option:)\S*\}
Given your input text, the following entries are matched in regexpal:
test 1
{blabla:database{test}}
{blabla:testing}
{option:first{A}.Value} {option:second{B}.Value}
{option:third{C}.Value}
{option:fourth{D}}
{option:fifth}
test 2
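A quick way to check how this pattern behaves on the question's text exactly as posted is to run it through preg_match_all. This is only a sketch; the variable names are mine. Because \S* is greedy and there is no whitespace between items that share a line, each such line comes back as a single match (three matches in total rather than five), so the items would need to be separated by whitespace for this pattern to split them.

```php
<?php
// The question's sample text, unchanged.
$text = "test 1
{blabla:database{test}}
{blabla:testing}
{option:first{A}.Value}{blabla}{option:second{B}.Value}
{option:third{C}.Value}{option:fourth{D}}
{option:fifth}
test 2
";

// \S* runs to the end of the current non-whitespace run, then backs up
// to the last "}" in that run.
preg_match_all('/\{(option:)\S*\}/', $text, $matches);

// $matches[0] holds the full matches, one per line that starts with "{option:".
print_r($matches[0]);
```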
Try this regular expression - it was tested using .NET regular expressions, it may work with PHP as well:
\{option:.*?{\w}.*?}
Please note - I'm assuming that you have only 1 pair of brackets inside, and that inside that pair you have only 1 alphanumeric character.
If you don't have multiple pairs of brackets on the same level, this should work:
/(\{option:(([^{]*(\{(?>[^{}]+|(?4))*\})[^}]*)|([^{}]+))\})/imsx
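If several bracket pairs on the same level do need to be handled, one possible variation is to keep the recursive balanced-brace group from the question and simply require the literal {option: prefix in front of it. The pattern below is my own sketch under that assumption, not something taken from the answers here. It only recurses into the inner brace groups, and it would also pick up an {option:...} item nested inside some other item.

```php
<?php
$text = "test 1
{blabla:database{test}}
{blabla:testing}
{option:first{A}.Value}{blabla}{option:second{B}.Value}
{option:third{C}.Value}{option:fourth{D}}
{option:fifth}
test 2
";

// Group 1 is the question's balanced-brace group; (?1) recurses into it.
// The outer part demands the literal "{option:" prefix, then any mix of
// non-brace text and balanced brace groups, then the closing brace.
$pattern = '/\{option:(?>[^{}]+|(\{(?>[^{}]+|(?1))*\}))*\}/';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full "{option:...}" items.
print_r($matches[0]);
```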
This problem is far better suited to a proper parser. However, you can do it with regex if you really want to.
This should work as long as you're not embedding options inside other options.
preg_match_all(
'/{option:((?:(?!{option:).)*)}/',
$text,
$matches,
PREG_SET_ORDER
);
Quick explanation.
{option: // literal "{option:"
( // begin capturing group
(?: // don't capture the next bit
(?!{option:). // any single character, as long as a literal "{option:" does not start here
)* // zero or more times
) // end capture group
} // literal closing brace
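The var_dumped output with your sample input looks like the following. (This holds when nothing sits between the first two options; with {blabla} directly after the first option on the same line, the first match would extend to {option:first{A}.Value}{blabla}.)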
array(5) {
[0]=>
array(2) {
[0]=>
string(23) "{option:first{A}.Value}"
[1]=>
string(14) "first{A}.Value"
}
[1]=>
array(2) {
[0]=>
string(24) "{option:second{B}.Value}"
[1]=>
string(15) "second{B}.Value"
}
[2]=>
array(2) {
[0]=>
string(23) "{option:third{C}.Value}"
[1]=>
string(14) "third{C}.Value"
}
[3]=>
array(2) {
[0]=>
string(18) "{option:fourth{D}}"
[1]=>
string(9) "fourth{D}"
}
[4]=>
array(2) {
[0]=>
string(14) "{option:fifth}"
[1]=>
string(5) "fifth"
}
}
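For comparison, the "proper parser" route mentioned at the top of this answer can be sketched as a small scanner that finds each {option: prefix and then counts brace depth until the item closes. The function name extractOptions is hypothetical, and the sketch assumes options are not nested inside other options (an inner one would be returned only as part of the outer item).

```php
<?php
/**
 * Hypothetical helper: returns every balanced "{option:...}" item in $text.
 */
function extractOptions(string $text): array
{
    $prefix = '{option:';
    $items  = [];
    $length = strlen($text);
    $offset = 0;

    while (($start = strpos($text, $prefix, $offset)) !== false) {
        $depth = 0;
        $end   = null;

        // Walk forward from the opening brace, tracking nesting depth.
        for ($i = $start; $i < $length; $i++) {
            if ($text[$i] === '{') {
                $depth++;
            } elseif ($text[$i] === '}') {
                $depth--;
                if ($depth === 0) {
                    $end = $i; // this brace closes the item
                    break;
                }
            }
        }

        if ($end === null) {
            // No matching close for this item: skip its prefix and keep looking.
            $offset = $start + strlen($prefix);
            continue;
        }

        $items[] = substr($text, $start, $end - $start + 1);
        $offset  = $end + 1;
    }

    return $items;
}

// Unlike the lookahead regex, an unrelated item between two options is skipped.
print_r(extractOptions('{option:first{A}.Value}{blabla}{option:second{B}.Value}'));
```

Unlike the regex versions, this does not care about line breaks or about how many bracket pairs sit on the same level.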