I am currently using the PHP DOM to get the BODY tag from HTML.
$doc = new DOMDocument();
$doc->loadHTML($HTML);
$body = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $HTML);
The code above gives me everything inside the body tag of a given HTML document.
Can I get the HTML tags inside $body as an array?
If possible, I would use DOM - it will make your solution a lot more reliable and cleaner to use.
This should get you headed in the right direction (I'm not writing the solution for you, sorry):
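$html = file_get_contents("http://google.com");
$dom = new DOMDocument();
// Real-world HTML is rarely valid; @ suppresses the parser warnings.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// "//*" matches every element in the document.
$elements = $xpath->query("//*");
foreach ($elements as $element) {
    echo "<h1>" . $element->nodeName . "</h1>";
    // Child nodes include text nodes (#text) as well as elements.
    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
        echo "<h2>" . $node->nodeName . "</h2>";
        echo $node->nodeValue . "\n";
    }
}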
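To collect the tags into an array rather than echo them, the same XPath approach can be narrowed to the body. This is only a rough sketch: the sample $html string and the $tags variable are made up for illustration, and it assumes you want the element names in document order.

```php
<?php
// Hypothetical sample input; any HTML string could be used here.
$html = '<html><body><div><p>Hello</p><a href="#">link</a></div></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// "//body//*" matches every element nested anywhere inside <body>.
$tags = [];
foreach ($xpath->query('//body//*') as $element) {
    $tags[] = $element->nodeName;
}

print_r($tags);
```

If you need the markup of each element rather than just its name, $dom->saveHTML($element) returns it as a string, so you could push that into the array instead.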