Been scratching my head at this one for way too long now...
$dom = new DOMDocument();
$dom->loadHTML( $content );
$links = $dom->getElementsByTagName( 'a' )->item( 0 );
foreach ( $links->attributes as $attribute ) {
    $name = $attribute->nodeName;
    $value = str_replace( '"', '', stripslashes( $attribute->nodeValue ) );
    echo "$name: $value<br />";
}
That's my code, which I eventually adapted from the question "php dom get all attributes of a node". I've also tried other approaches, such as calling getAttribute() for a single attribute to see if that would work, but got the same result.
The HTML I am attempting to go through is simply:
<a id="testid" title="testtitle" name="this is a testname" href="http://example.com/">link!</a>
I'm getting the following error:
Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: error parsing attribute name in Entity, line: 1
My script is outputting:
id: testid
title: testtitle
name: this
is:
a:
testname:
href: http://example.com/
I should add that the output works fine if the 'name' attribute is one word.
So obviously, it must be using explode() or something stupid on spaces. Is there a way to get around this without converting all spaces to %20 or something (I have plenty of other content beyond the links and wouldn't want to convert a whole block of content)?
As noted in the comments, the name attribute shares the same name space as the id attribute, which is defined as a "NAME token". NAME tokens must begin with a letter and are otherwise restricted to letters, digits, hyphens, underscores, periods and colons.
You'll note there are no spaces permitted in that list.
Some versions of the DOMDocument parser that PHP uses are super-strict about HTML compliance, and will whine and regularly do wrong things when confronted with spec violations. This may be one of those cases. Remove the spaces from your name attribute and see if you continue to see the problem.
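If you want to check your attribute values against that rule before handing the markup to the parser, something along these lines might help. It is only a rough sketch: is_name_token() is a hypothetical helper I made up for illustration, not part of DOMDocument, and the pattern is my reading of the HTML 4 NAME token rule described above.

```php
<?php
// Hypothetical helper (not a DOMDocument API): returns true when $value looks
// like an HTML 4 NAME token, i.e. it starts with a letter and continues with
// letters, digits, hyphens, underscores, colons or periods only.
function is_name_token( string $value ): bool {
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match( '/^[A-Za-z][A-Za-z0-9_:.-]*$/D', $value ) === 1;
}

var_dump( is_name_token( 'this is a testname' ) ); // bool(false): contains spaces
var_dump( is_name_token( 'this-is-a-testname' ) ); // bool(true)
```

Any value that fails the check is a candidate for the kind of trouble described in the question, so replacing the spaces with hyphens or underscores would be the first thing to try.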