I'm using Go's encoding/xml package to parse an XML file. When parsing the file, I get this error:
XML syntax error on line 16: invalid character entity ü
But the XML file references a DTD:
<!DOCTYPE dblp SYSTEM "dblp.dtd">
And that DTD itself contains the definition of that entity:
<!ENTITY uuml "ü" ><!-- small u, dieresis or umlaut mark -->
Is there a way to force Go's XML parser to parse DTDs? Did I miss something, or am I doomed to use a third-party XML parser?
Probably not the answer you would like to hear...
You could use the Entity field of Decoder (http://golang.org/pkg/encoding/xml/#Decoder), which maps entity names to their replacement text. Unfortunately I do not know of an automatic way to generate such an entity map from a DTD, but the entity declarations should be straightforward to extract from it. If the DTD doesn't change, this might be a nice task for go generate.
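As a rough illustration of the extraction step, here is a minimal sketch that pulls simple entity declarations out of DTD text with a regular expression. The helper name extractEntities and the pattern are my own assumptions, not part of encoding/xml. It only handles double-quoted internal entities like the one shown in the question. If the values in the DTD are themselves character references (for example `&#252;`), they would need an extra decoding step.

```go
package main

import (
	"fmt"
	"regexp"
)

// entityDecl matches simple internal entity declarations such as
//
//	<!ENTITY uuml "ü" >
//
// It does not handle single-quoted values, parameter entities,
// or external entities (assumption: the DTD uses none of these).
var entityDecl = regexp.MustCompile(`<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>`)

// extractEntities is a hypothetical helper that builds a map from
// entity name to replacement text, suitable for xml.Decoder.Entity.
func extractEntities(dtd string) map[string]string {
	entities := make(map[string]string)
	for _, m := range entityDecl.FindAllStringSubmatch(dtd, -1) {
		entities[m[1]] = m[2]
	}
	return entities
}

func main() {
	// Sample DTD fragment; the first line is taken from the question,
	// the second is made up for illustration.
	dtd := `<!ENTITY uuml "ü" ><!-- small u, dieresis or umlaut mark -->
<!ENTITY auml "ä" >`
	for name, value := range extractEntities(dtd) {
		fmt.Printf("%q: %q\n", name, value)
	}
}
```

A go generate step could run something like this over dblp.dtd and write the resulting map out as Go source.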
If the list of entities is fixed (and small enough), I would hardcode the entity map.
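A minimal sketch of the hardcoded approach could look like this. The Article struct, the parseAuthor helper, and the sample document are made up for illustration. Only the uuml entity comes from the question, and a real map would need one entry for every entity the file uses.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Article is a hypothetical target struct for this example.
type Article struct {
	Author string `xml:"author"`
}

// dblpEntities is a hardcoded entity map; extend it with every
// entity the documents actually use.
var dblpEntities = map[string]string{
	"uuml": "ü",
}

// parseAuthor decodes src with the given entity map installed on the
// decoder, so named entities like &uuml; no longer cause a syntax error.
func parseAuthor(src string, entities map[string]string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	dec.Entity = entities
	var a Article
	if err := dec.Decode(&a); err != nil {
		return "", err
	}
	return a.Author, nil
}

func main() {
	src := `<!DOCTYPE dblp SYSTEM "dblp.dtd">
<article><author>M&uuml;ller</author></article>`
	author, err := parseAuthor(src, dblpEntities)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(author)
}
```

Note that the decoder still does not read the DTD itself. The DOCTYPE line is skipped, and entity resolution relies entirely on the map.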