I want to use preg_match_all to extract the class and data-id attribute values from HTML tags.
The pattern below matches, but for each tag it returns only the class value or only the data-id value, never both.
I want a pattern that captures both the class and the data-id content, in the order they appear in the tag.
Which regex pattern should I use?
HTML content:
<!-- I want to: $matches[1] == test_class | $matches[2] == null -->
<div class="test_class">
<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" data-id="1">
<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div id="test_id" class="test_class" data-id="1">
<!-- I want to: $matches[1] == test_class test_class2 | $matches[2] == 1 -->
<div class="test_class test_class2" id="test_id" data-id="1">
<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div data-id="1" class="test_class test_class2" id="test_id" >
<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div id="test_id" data-id="1" class="test_class test_class2">
<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" id="test_id" data-id="1">
The regex that does not work the way I want:
$pattern = '/<(div|i)\s.*(class|data-id)="([^"]+)"[^>]*>/i';
preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);
Thanks in advance.
Why not use a DOM parser instead?
You could use an XPath expression like //div[@class or @data-id]
to locate the elements, then extract their attribute values:
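$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Select every <div> that has a class or a data-id attribute
$divs = $xpath->query('//div[@class or @data-id]');
foreach ($divs as $div) {
    // getAttribute() returns an empty string when the attribute is missing
    $matches = [$div->getAttribute('class'), $div->getAttribute('data-id')];
    print_r($matches);
}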
Demo ~ https://eval.in/1046227
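If you also need the <i> tags from the question's pattern, and null rather than an empty string for a missing attribute, the same approach could be extended roughly as below. This is only a sketch: the function name extractClassAndDataId and the sample markup are made up for illustration, and the result is keyed by attribute name rather than by order of appearance in the tag.

```php
<?php
// Hypothetical helper (not part of the answer above): collects class and
// data-id for every <div> and <i> element, using null when one is absent.
function extractClassAndDataId(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them,
    // since fragments and loose markup tend to trigger them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    $query = '//div[@class or @data-id] | //i[@class or @data-id]';
    foreach ($xpath->query($query) as $el) {
        $rows[] = [
            'class'   => $el->hasAttribute('class') ? $el->getAttribute('class') : null,
            'data-id' => $el->hasAttribute('data-id') ? $el->getAttribute('data-id') : null,
        ];
    }
    return $rows;
}

// Illustrative input only
$rows = extractClassAndDataId(
    '<div class="test_class"></div><div data-id="1" class="a b" id="x"></div>'
);
print_r($rows);
```

Using hasAttribute() is what lets you tell a missing attribute apart from one that is present but empty.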
I second Phil's answer; I think an HTML parser is the way to go. It is safer and can handle much more complicated markup.
Having said that, if you want to try regex in your example, it would be something like this:
<(?:div|i)(?:.*?(?:class|data-id)="([^"]+)")?(?:.*?(?:class|data-id)="([^"]+)")?[^>]*>
Example: https://regex101.com/r/Gb82lF/1/
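For completeness, here is roughly how that pattern could be plugged into preg_match_all. Treat it as a sketch: the "/" delimiters, the "i" flag, the sample markup and the variable names are my assumptions, not something tested beyond these two lines. Also be aware that <(?:div|i) will match any tag whose name starts with "i" (such as <img>), and that .*? can run past the end of a tag when several tags sit on the same line.

```php
<?php
// The pattern from above, wrapped in "/" delimiters with the "i" flag.
$pattern = '/<(?:div|i)(?:.*?(?:class|data-id)="([^"]+)")?(?:.*?(?:class|data-id)="([^"]+)")?[^>]*>/i';

// Illustrative input only: one tag per line, so "." cannot cross into the next tag.
$content = <<<HTML
<div class="test_class">
<div id="test_id" data-id="1" class="test_class test_class2">
HTML;

preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // Groups are filled in order of appearance within the tag;
    // a group that did not take part in the match is treated as null here.
    $first  = $m[1] ?? null;
    $second = $m[2] ?? null;
    var_dump($first, $second);
}
```

For the second tag, group 1 holds the data-id value and group 2 the class value, because data-id comes first in that tag.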