I'm trying to unmarshal an XML feed containing German characters (e.g. ß, ä, Ö, ü, etc) into a struct, which results in the error: xml: encoding "utf-16" declared but Decoder.CharsetReader is nil unmarshal successful
Basically this is what I'm doing (omitted error checking for the parts that work):
resp, _ := http.Get(url)
defer resp.Body.Close()
bodyBytes, _ := ioutil.ReadAll(resp.Body)
err = xml.Unmarshal(bodyBytes, &target)
if err != nil {
	fmt.Println(err)
}
I've tried converting the XML to JSON using github.com/basgys/goxml2json, I've tried converting to a string and back to []byte before unmarshalling, and I've tried various decoders posted in other SO answers (since it says the charset reader is nil), such as:
reader := bytes.NewReader(bodyBytes)
decoder := xml.NewDecoder(reader)
decoder.CharsetReader = charset.NewReader
err = decoder.Decode(&target)
if err != nil {
	fmt.Println(err)
}
No matter what I try, it fails to unmarshal/decode the XML feed into the struct. In some cases it ends up converting all the text to Chinese rather than German.
If the charset.NewReader you're using in the second example is from https://godoc.org/golang.org/x/net/html/charset then the code shouldn't even compile, since the CharsetReader field has a different signature from NewReader.
To fix the error you can provide an identity charset reader, that is, one that returns the input unchanged.
func identReader(encoding string, input io.Reader) (io.Reader, error) {
	return input, nil
}
// ...
decoder.CharsetReader = identReader
https://play.golang.org/p/BiU4T2qz1Z1
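For reference, here is a minimal self-contained sketch of the same idea. The Feed struct, the decodeFeed helper, and the sample document are made up for illustration and are not taken from the question's feed. It assumes the payload is really UTF-8 bytes that merely declare utf-16 in the XML prolog.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

// Feed is a hypothetical target struct used only for this illustration.
type Feed struct {
	Title string `xml:"title"`
}

// identReader ignores the declared encoding and returns the input unchanged.
func identReader(encoding string, input io.Reader) (io.Reader, error) {
	return input, nil
}

// decodeFeed decodes data with the identity charset reader installed.
func decodeFeed(data []byte) (Feed, error) {
	var f Feed
	dec := xml.NewDecoder(bytes.NewReader(data))
	dec.CharsetReader = identReader
	err := dec.Decode(&f)
	return f, err
}

func main() {
	// The prolog says utf-16, but a Go string literal holds UTF-8 bytes.
	data := []byte(`<?xml version="1.0" encoding="utf-16"?><feed><title>Straße, Äpfel, Öl, über</title></feed>`)
	f, err := decodeFeed(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(f.Title)
}
```

Without the CharsetReader assignment, the same input should produce the error from the question.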
NOTE: the above solution works for the example characters from the question, but it may very well fail for other UTF-16 strings. In such a case, a custom charset reader that can convert UTF-16 to UTF-8 should be provided instead of the identReader.
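If the body really is UTF-16 encoded, one possible approach is sketched below using only the standard library. It is a rough sketch, not a drop-in fix: utf16ToUTF8 and decodeUTF16Feed are hypothetical helper names, and little-endian is assumed when no BOM is present. As far as I can tell the decoder has to read the XML declaration before it ever consults CharsetReader, so this sketch transcodes the bytes up front and then still installs an identity reader to get past the utf-16 declaration. (Reading UTF-8 bytes as UTF-16 would also explain the Chinese-looking output mentioned in the question.)

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"unicode/utf16"
)

// utf16ToUTF8 converts UTF-16 bytes to UTF-8. A leading BOM selects the
// byte order; without one, little-endian is assumed (sketch assumption).
func utf16ToUTF8(b []byte) ([]byte, error) {
	if len(b)%2 != 0 {
		return nil, errors.New("utf16ToUTF8: odd number of bytes")
	}
	var order binary.ByteOrder = binary.LittleEndian
	if len(b) >= 2 {
		switch {
		case b[0] == 0xFE && b[1] == 0xFF:
			order, b = binary.BigEndian, b[2:]
		case b[0] == 0xFF && b[1] == 0xFE:
			b = b[2:]
		}
	}
	units := make([]uint16, len(b)/2)
	for i := range units {
		units[i] = order.Uint16(b[2*i:])
	}
	return []byte(string(utf16.Decode(units))), nil
}

// decodeUTF16Feed transcodes raw to UTF-8, then decodes it into v.
func decodeUTF16Feed(raw []byte, v interface{}) error {
	utf8Bytes, err := utf16ToUTF8(raw)
	if err != nil {
		return err
	}
	dec := xml.NewDecoder(bytes.NewReader(utf8Bytes))
	// The data is already UTF-8 here, so the declared encoding is ignored.
	dec.CharsetReader = func(label string, input io.Reader) (io.Reader, error) {
		return input, nil
	}
	return dec.Decode(v)
}

func main() {
	// Build a made-up UTF-16LE sample document with a BOM.
	doc := `<?xml version="1.0" encoding="utf-16"?><feed><title>Grüße</title></feed>`
	raw := []byte{0xFF, 0xFE}
	for _, u := range utf16.Encode([]rune(doc)) {
		raw = append(raw, byte(u), byte(u>>8))
	}

	var feed struct {
		Title string `xml:"title"`
	}
	if err := decodeUTF16Feed(raw, &feed); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(feed.Title)
}
```

In the question's code, this would mean passing bodyBytes through the conversion before creating the decoder. The golang.org/x/text/encoding/unicode package offers streaming decoders that may be a better fit for large feeds.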