This is close, but is failing to match successive "attributes":
$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
preg_match_all('/\[((\w+)((\s(\w+)="([^"]+)"))*)\]/', $string, $matches, PREG_SET_ORDER);
print '<pre>' . print_r($matches, TRUE) . '</pre>';
Gives back the following:
Array
(
[0] => Array
(
[0] => [include file="bob.txt"]
[1] => include file="bob.txt"
[2] => include
[3] => file="bob.txt"
[4] => file="bob.txt"
[5] => file
[6] => bob.txt
)
[1] => Array
(
[0] => [another prop="val" attr="one"]
[1] => another prop="val" attr="one"
[2] => another
[3] => attr="one"
[4] => attr="one"
[5] => attr
[6] => one
)
[2] => Array
(
[0] => [tag]
[1] => tag
[2] => tag
)
)
Where [2] is the tag name, [5] is the attribute name and [6] is the attribute value.
The failure is on the second node - it catches attr="one" but not prop="val".
TYIA.
(this is only meant for limited, controlled use - not broad distribution - so I don't need to worry about single quotes or escaped double quotes)
Unfortunately, there is no way to capture every repetition of a group like that: a repeated capture group keeps only its last match. Personally, I would use preg_match_all to match the tags themselves (i.e. make the inner parentheses non-capturing), then extract the attributes from each match in a loop. Something like this:
$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
preg_match_all('/\[\w+(?:\s\w+="[^"]+")*\]/', $string, $matches);
foreach($matches[0] as $m) {
// $m starts with "[", so match the bracket and capture the name that follows it
preg_match('/^\[(\w+)/', $m, $tagname); $tagname = $tagname[1];
preg_match_all('/\s(\w+)="([^"]+)"/', $m, $attrs, PREG_SET_ORDER);
// do something with $tagname and $attrs
}
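As a rough sketch of what "do something with $tagname and $attrs" might look like, the two-step approach above can be wrapped in a helper that returns one entry per tag. The function name parse_tags and the shape of the returned array are assumptions made for illustration, not part of the original answer. Since each match begins with "[", the tag-name pattern is anchored on the opening bracket.

```php
<?php
// Hypothetical helper (name and return shape are assumptions for illustration).
function parse_tags(string $string): array
{
    $tags = [];
    // Step 1: find whole tags; the repeated attribute group is non-capturing.
    preg_match_all('/\[\w+(?:\s\w+="[^"]+")*\]/', $string, $matches);
    foreach ($matches[0] as $m) {
        // Step 2: the tag name directly follows the opening bracket.
        preg_match('/^\[(\w+)/', $m, $name);
        // Step 3: collect every name="value" pair inside this one tag.
        preg_match_all('/\s(\w+)="([^"]+)"/', $m, $pairs, PREG_SET_ORDER);
        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[$pair[1]] = $pair[2]; // attribute name => attribute value
        }
        $tags[] = ['name' => $name[1], 'attrs' => $attrs];
    }
    return $tags;
}

$string = 'single attribute [include file="bob.txt"] multiple attributes [another prop="val" attr="one"] no attributes [tag] etc';
print_r(parse_tags($string));
```

With this layout, the second tag yields both prop and attr, and a tag without attributes yields an empty attrs array.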
Note that if you intend to replace the tag with some content, you should use preg_replace_callback like so:
$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
// Argument order is pattern, callback, subject
$output = preg_replace_callback('/\[\w+(?:\s\w+="[^"]+")*\]/', function($match) {
$m = $match[0]; // the full tag text, e.g. [include file="bob.txt"]
preg_match('/^\[(\w+)/', $m, $tagname); $tagname = $tagname[1];
preg_match_all('/\s(\w+)="([^"]+)"/', $m, $attrs, PREG_SET_ORDER);
$result = ''; // build the replacement from $tagname and $attrs
return $result;
}, $string);
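For a concrete picture of the replacement case, here is a minimal sketch that rewrites each tag into a flattened form. The function render_tags and the {name attr=value} output format are invented purely for illustration; a real callback would return whatever content the tag should expand to. Note that preg_replace_callback takes its arguments in the order pattern, callback, subject, and that the callback receives an array whose element 0 is the full matched tag.

```php
<?php
// Hypothetical renderer: rewrites [name a="b"] as {name a=b}. Illustrative only.
function render_tags(string $string): string
{
    return preg_replace_callback(
        '/\[\w+(?:\s\w+="[^"]+")*\]/',
        function (array $match): string {
            $m = $match[0]; // full tag text, including the brackets
            preg_match('/^\[(\w+)/', $m, $name);
            preg_match_all('/\s(\w+)="([^"]+)"/', $m, $pairs, PREG_SET_ORDER);
            $parts = [$name[1]];
            foreach ($pairs as $pair) {
                $parts[] = $pair[1] . '=' . $pair[2];
            }
            return '{' . implode(' ', $parts) . '}';
        },
        $string
    );
}

echo render_tags('x [another prop="val" attr="one"] y [tag] z'), "\n";
// prints: x {another prop=val attr=one} y {tag} z
```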