How do I parse HTML content in PHP code? [duplicate]

How can I parse the content of each heading in the code below separately?

$str="<html>
          <body>
          <h1>Java</h1>
          <p>Java is a platform independent object oriented programming language</p>
          <h2>Html</h2>
          <p>HTML is a markup language for describing web documents</p>
          <h3>Php</h3>
          <p>Php is simple</p>
          </body>
          </html>";

This is quite straightforward with DOMDocument and DOMXPath.

$str="
<html>
    <body>
        <h1>Java</h1>
        <p>Java is a platform independent object oriented programming language</p>
        <h2>Html</h2>
        <p>HTML is a markup language for describing web documents</p>
        <h3>Php</h3>
        <p>Php is simple</p>
    </body>
</html>";

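$dom = new DOMDocument;
$dom->loadHTML( $str );
$xp = new DOMXPath( $dom );
// Select every h1, h2 and h3 element, in document order.
$col = $xp->query('//h1|//h2|//h3');
// Print each heading's text on its own line.
foreach( $col as $node ) echo $node->nodeValue, PHP_EOL;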
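
The query above returns only the heading text. If "the content of each header" also means the paragraph that follows each heading, one possible approach is to walk the siblings after each heading until the next heading appears. The following is a minimal sketch under that assumption; the function name sectionsByHeading is hypothetical and not part of the original answer, and it assumes headings and paragraphs are siblings, as they are in the sample markup.

```php
<?php
// Hypothetical helper (an illustration, not from the original answer):
// maps each heading's text to the texts of the <p> elements that follow it,
// stopping at the next h1/h2/h3 sibling.
function sectionsByHeading(string $html): array
{
    // Collect libxml warnings internally instead of printing them,
    // then restore the previous setting afterwards.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    $sections = [];
    foreach ($xp->query('//h1|//h2|//h3') as $heading) {
        $paragraphs = [];
        // Walk forward through the siblings that follow this heading.
        for ($n = $heading->nextSibling; $n !== null; $n = $n->nextSibling) {
            if ($n->nodeType !== XML_ELEMENT_NODE) {
                continue; // skip whitespace text nodes
            }
            if (in_array($n->nodeName, ['h1', 'h2', 'h3'], true)) {
                break; // the next section starts here
            }
            if ($n->nodeName === 'p') {
                $paragraphs[] = trim($n->textContent);
            }
        }
        $sections[trim($heading->textContent)] = $paragraphs;
    }
    return $sections;
}

$str = "<html><body>
<h1>Java</h1>
<p>Java is a platform independent object oriented programming language</p>
<h2>Html</h2>
<p>HTML is a markup language for describing web documents</p>
<h3>Php</h3>
<p>Php is simple</p>
</body></html>";

// Print each heading followed by its paragraph text.
foreach (sectionsByHeading($str) as $title => $paragraphs) {
    echo $title, ': ', implode(' ', $paragraphs), PHP_EOL;
}
```

Note that the array is keyed by heading text, so two headings with identical text would overwrite each other; a list of [heading, paragraphs] pairs avoids that if duplicates are possible.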