Remove a paragraph by id using PHP DOM

I'm trying to find paragraphs with the id "test" and remove them from an HTML string. I've tried using PHP's DOMDocument, but the HTML I'm searching is badly formed and I get errors:

$caption = "blah blah<p id ='test'>Test message</p>";
$doc = new DOMDocument();
$doc->loadHTMLFile($caption);
$xmessage = $doc->getElementById('test');

returns Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Unexpected end tag : br i

Is there a way to suppress the warnings? Thanks

You can use the following code to remove the paragraph with id='test':

$caption = "blah blah<p id='test'>Test message</p><p id='foo'>Foo Bar</p>";
$doc = new DOMDocument();
$doc->loadHTML($caption);  // parse an HTML string (not a file)
$xpath = new DOMXPath($doc);
// select every <p> whose id attribute is 'test'
$nlist = $xpath->query("//p[@id='test']");
$node = $nlist->item(0);
echo "Para: [" . $node->nodeValue . "]\n";
// detach the node from its parent to delete it
$node->parentNode->removeChild($node);
echo "Remaining: [" . $doc->saveHTML() . "]\n";

OUTPUT:

Para: [Test message]
Remaining: [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<p>blah blah</p>
<p id="foo">Foo Bar</p>
</body></html>
]
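The output above includes the DOCTYPE, html and body wrappers that the parser adds around the fragment. If only the fragment itself is wanted, one option is to serialise the children of body one by one. This is a minimal sketch under that assumption; the variable name $fragment is just for illustration, and the exact markup the parser produces for the loose "blah blah" text may vary between libxml versions.

```php
<?php
$caption = "blah blah<p id='test'>Test message</p><p id='foo'>Foo Bar</p>";

$doc = new DOMDocument();
$doc->loadHTML($caption);

// Remove the first <p id="test">, if there is one.
$xpath = new DOMXPath($doc);
$node = $xpath->query("//p[@id='test']")->item(0);
if ($node !== null) {
    $node->parentNode->removeChild($node);
}

// Serialise only the children of <body>, skipping the wrappers
// (DOCTYPE, <html>, <body>) that loadHTML() added.
$body = $doc->getElementsByTagName('body')->item(0);
$fragment = '';
if ($body !== null) {
    foreach ($body->childNodes as $child) {
        $fragment .= $doc->saveHTML($child);
    }
}

echo $fragment, "\n";
```

Passing a node to saveHTML() serialises just that node, which is why the loop yields the fragment without the document wrapper.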

Don't use loadHTMLFile(); use loadHTML().

The latter expects an HTML string, which is what you are providing, whereas loadHTMLFile() expects the path to a file. Switching should correct the warning.

There's more than one paragraph with the same ID? Surely not...

It's generally bad practice (the warnings are there for a reason), but you can suppress warnings with the @ operator. It applies to any expression, including method calls on an object like these:

$caption = "blah blah<p id ='test'>Test message</p>";
$doc = new DOMDocument();
@$doc->loadHTML($caption);  // loadHTML() takes a string; loadHTMLFile() takes a file path
$xmessage = @$doc->getElementById('test');
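An alternative to @ that is sometimes preferred is to ask libxml to collect parse errors internally instead of emitting PHP warnings, using libxml_use_internal_errors(). The sketch below assumes a stray closing tag in the sample string to provoke a parser complaint; that input and the variable names are illustrative, and which errors are reported depends on the libxml version.

```php
<?php
// Sample input with a stray closing tag (an assumption for illustration).
$caption = "blah blah<p id ='test'>Test message</p></br>";

// Collect libxml errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($caption);

// Fetch whatever the parser complained about, then reset the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Optionally inspect the collected errors rather than discarding them.
foreach ($errors as $error) {
    echo trim($error->message), " (line ", $error->line, ")\n";
}
```

Unlike @, this keeps the parser's messages available for logging while still keeping them out of the page output.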

getElementById() requires the document to be validated before it will work. See this StackOverflow answer for more info.

$caption = "blah blah<p id ='test'>Test message</p>";
$doc = new DOMDocument;
$doc->validateOnParse = true;  // validate HTML
$doc->loadHTML($caption);  // This loads an HTML string
$xmessage = $doc->getElementById('test');

(NOTE: You need to use loadHTML, not loadHTMLFile).

This still may not work, as the HTML may not be valid.

If this doesn't work, I suggest using DOMXPath.

$caption = "blah blah<p id ='test'>Test message</p>";
$doc = new DOMDocument;
$doc->loadHTML($caption);  // load from a string, not a file
$xpath = new DOMXPath($doc);
$xmessage = $xpath->query("//p[@id='test']")->item(0);
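The snippet above only locates the node, and item(0) returns null when nothing matches. A minimal sketch of the removal step that copes with zero or several matches might look like this; the function name removeParagraphsById is hypothetical, and the id is compared in PHP rather than interpolated into the XPath expression so that quotes in the id cannot break the query.

```php
<?php
// Hypothetical helper: removes every <p> whose id equals $id and
// returns how many nodes were removed.
function removeParagraphsById(DOMDocument $doc, string $id): int
{
    $xpath = new DOMXPath($doc);
    $removed = 0;

    // Copy the matches into an array first so that removing nodes
    // cannot disturb the iteration.
    foreach (iterator_to_array($xpath->query('//p[@id]')) as $node) {
        if ($node->getAttribute('id') === $id) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }

    return $removed;
}

$caption = "blah blah<p id ='test'>Test message</p>";
$doc = new DOMDocument();
$doc->loadHTML($caption);

$first = removeParagraphsById($doc, 'test');   // one match in this input
$second = removeParagraphsById($doc, 'test');  // nothing left to remove

echo "Removed: $first, then $second\n";
```

Because the loop simply does nothing when there are no matches, no separate null check is needed.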