How to scrape HTML tags from a div using PHP Simple HTML DOM or cURL

Here is an example of what I want to do:

<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>

From the above example I would like to scrape the data and the tags into arrays. In the result I would like one array containing the tags: arr = [h1, p, h2]; and another array containing the text: arr2 = [This is a h1, This is a Paragraph, This is h2]

$str = <<<EOF
<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
EOF;

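// str_get_html() comes from the PHP Simple HTML DOM Parser library, which must be included first
$html = str_get_html($str);

$arr = array();  // tag names
$arr2 = array(); // text content
// '.room *' matches every element inside the element with class "room"
foreach ($html->find('.room *') as $el) {
  $arr[] = $el->tag;
  $arr2[] = $el->text();
}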

Try this:

$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

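// Split the markup into lines (assumes one tag per line)
$arr = explode(PHP_EOL, $str);

$res = array();
foreach ($arr as $row) {
    // Skip the opening and closing div lines
    if (strpos($row, "div") === false) {
        // Key: the tag name between "<" and ">"; value: the text without tags
        $res[substr($row, 1, strpos($row, ">") - 1)] = strip_tags($row);
    }
}

var_dump($res);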

https://3v4l.org/8TkIT

It loops through the string one line at a time and builds an array keyed by tag name.

Edit: if there is more than one room, you can make the array multidimensional like this:
https://3v4l.org/DdXVd

$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
<div class='room2'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

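// Split the markup into lines (assumes one tag per line)
$arr = explode(PHP_EOL, $str);

$res = array();
foreach ($arr as $row) {
    if (strpos($row, "div") !== false) {
        // Opening div line: take the class name between the single quotes.
        // A closing </div> line also lands here, but the next opening div resets $room.
        $pos1 = strpos($row, "'") + 1;
        $room = substr($row, $pos1, strpos($row, "'", $pos1) - $pos1);
    } else {
        // Child line: key is the tag name between "<" and ">", value is the text
        $pos1 = strpos($row, "<") + 1;
        $res[$room][substr($row, $pos1, strpos($row, ">") - $pos1)] = trim(strip_tags($row));
    }
}

var_dump($res);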
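
The line-based approach above depends on every tag sitting on its own line. As a rough sketch of the same multidimensional result built with PHP's bundled DOMDocument instead (the function name rooms_to_array and the shortened sample markup are made up for illustration, not taken from the answer):

```php
<?php
// Hypothetical helper: group the child elements of every <div> by the div's class.
function rooms_to_array(string $html): array
{
    $doc = new DOMDocument();
    // The libxml flags only silence parser notices about the HTML fragment.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    $res = [];
    foreach ($doc->getElementsByTagName('div') as $div) {
        $room = $div->getAttribute('class');
        foreach ($div->childNodes as $child) {
            // Skip whitespace text nodes between the tags.
            if ($child->nodeType !== XML_ELEMENT_NODE) {
                continue;
            }
            // As in the answer above, a repeated tag overwrites the earlier entry.
            $res[$room][$child->nodeName] = trim($child->textContent);
        }
    }
    return $res;
}

// Assumed sample input, deliberately written on a single line.
$roomsHtml = "<div class='room'><h1>This is a h1</h1><p>This is a Paragraph</p></div>"
           . "<div class='room2'><h2>This is h2</h2></div>";

$rooms = rooms_to_array($roomsHtml);
var_dump($rooms);
```

Like the string version, this keeps only one value per tag name in each room, so it fits markup where each tag appears at most once per div.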

Assuming the elements are known in advance, you could use DOMDocument's getElementsByTagName like this:

$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";
$doc = new DOMDocument();
$doc->loadHTML($html);
$elements = array(); // tag names that were found
$content = array();  // text of every matching element
function iterate_elements($array, $doc){
     global $elements, $content;
     foreach($array as $element){
          $the_element = $doc->getElementsByTagName($element);
          foreach($the_element as $target){
               $content[] = $target->textContent;
               //$target->tagName;
          }
          // Record the tag only if at least one such element exists
          if(!empty($the_element->length)) {
               $elements[] = $element;
          }
     }
}
iterate_elements(array('h1','p', 'h2'), $doc);
print_r($elements);
print_r($content);

Demo: https://eval.in/825860
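
If the tags are not known in advance, one possible variation is to let DOMXPath select every direct child element of the div. This is only a sketch: the function name scrape_room is hypothetical, and the XPath assumes the class attribute is exactly "room".

```php
<?php
// Hypothetical helper: return [tag names, texts] for the children of <div class="room">.
function scrape_room(string $html): array
{
    $doc = new DOMDocument();
    // The libxml flags only silence parser notices about the HTML fragment.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    $xpath = new DOMXPath($doc);
    $tags = [];
    $texts = [];
    // "*" matches element nodes only, so whitespace between tags is ignored.
    foreach ($xpath->query("//div[@class='room']/*") as $node) {
        $tags[] = $node->nodeName;
        $texts[] = trim($node->textContent);
    }
    return [$tags, $texts];
}

$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

[$tags, $texts] = scrape_room($html);
print_r($tags);
print_r($texts);
```

The two arrays stay in document order and are index-aligned, so $tags[0] describes $texts[0].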

Try the code below.

$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$dom = new SimpleXMLElement( $html );

$values = array_filter( array_values( (array) $dom ), function ( $i ) { return ! is_array( $i ); } );
$keys = array_filter( array_keys( (array) $dom ), function ( $i ) { return $i != '@attributes'; } );

print_r( $values ); // This is a h1, This is a Paragraph, This is h2
print_r( $keys ); // h1, p, h2

I used array_filter to remove the div's @attributes entry from the result.
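
One caveat with the array cast: as far as I can tell, when a tag occurs more than once the cast groups the repeats into a nested array, which the is_array filter then drops. A sketch that iterates the children directly avoids this. The second paragraph in the sample markup is added purely for illustration, and the input still has to be well-formed XML.

```php
<?php
// Assumed sample input with a repeated <p> tag.
$xml = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<p>Another Paragraph</p>
<h2>This is h2</h2>
</div>";

$root = new SimpleXMLElement($xml);

$names = [];
$strings = [];
// children() yields each child element in document order, repeats included.
foreach ($root->children() as $child) {
    $names[] = $child->getName();
    $strings[] = trim((string) $child);
}

print_r($names);
print_r($strings);
```

Here attributes never show up among the children, so no filtering step is needed.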