Given the following code:
$html = "<h1>foo</h1><h2>bar</h2>";
$document = new DOMDocument();
$document->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($document);
$h1Nodes = $xpath->query('//h1');
foreach ($h1Nodes as $h1Node) {
    var_dump($h1Node->nodeValue);
}
The h1 tag contains only a text node with the text 'foo'. The text 'bar' is in a sibling heading node (h2), so I would expect the output to be 'foo'.
However, the output is 'foobar'.
Why?
Thank you for your comment, hardik solanki.
It led me to the answer: valid markup must have a root element.
The markup I provided doesn't have one, and the flags I used prevent the library from adding one implicitly. So the first tag, h1, is treated as the root element and the h2 that follows ends up inside it, which is why its nodeValue is 'foobar'.
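One way to check this is to serialize the document right after loading it and to look at which node became the document element. The following is only a small diagnostic sketch built on the code from the question; the exact serialized output may depend on the libxml version, so none is shown here.

```php
<?php
$html = "<h1>foo</h1><h2>bar</h2>";

$document = new DOMDocument();
// Same flags as in the question: no implied <html>/<body>, no default doctype
$document->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Print the tree as the parser actually built it
echo $document->saveHTML();

// With no implied wrapper, the first tag of the snippet becomes the root
echo $document->documentElement->nodeName, "\n";
```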
Dropping those flags fixes this issue, but I am using them for a purpose: I want to manipulate just a snippet of HTML, not a whole document, and get that snippet back after the transformations by calling DOMDocument::saveHTML(), without the doctype/<html>/<body> tags.
I've ended up doing this:
1. Add <html>/<body> tags to the HTML snippet I want to manipulate, so that I temporarily have a valid document.
2. Do the transformations and call DOMDocument::saveHTML().
3. Strip the <html>/<body> tags from the resulting markup.
It works.
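Here is a minimal sketch of that approach. The helper name transformSnippet and its callback parameter are made up for illustration. Note one deliberate difference: instead of stripping the wrapper tags from the output string, this variant serializes only the children of <body> by passing each node to DOMDocument::saveHTML(), which should give the same result without string manipulation.

```php
<?php
// Hypothetical helper: wraps a snippet, lets a callback modify the DOM,
// and returns the snippet again without the <html>/<body> wrapper.
function transformSnippet(string $snippet, callable $transform): string
{
    $document = new DOMDocument();
    // Temporarily wrap the snippet so the document has a single root element
    $document->loadHTML(
        '<html><body>' . $snippet . '</body></html>',
        LIBXML_HTML_NODEFDTD
    );

    // Let the caller query or modify the document
    $transform($document);

    // Serialize only the children of <body>, leaving the wrapper out
    $body = $document->getElementsByTagName('body')->item(0);
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $document->saveHTML($child);
    }
    return $result;
}

$values = [];
$output = transformSnippet(
    "<h1>foo</h1><h2>bar</h2>",
    function (DOMDocument $document) use (&$values) {
        $xpath = new DOMXPath($document);
        foreach ($xpath->query('//h1') as $h1Node) {
            // Collect the text of each h1; h2 is now a sibling, not a child
            $values[] = $h1Node->nodeValue;
        }
    }
);

var_dump($values);
echo $output, "\n";
```

The sketch assumes an ASCII snippet; loadHTML() treats input without a declared charset as ISO-8859-1, so non-ASCII content would need extra handling.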