From http://golang.org/pkg/encoding/xml/#Unmarshal
- If the XML element contains a sub-element that hasn't matched any of the above rules and the struct has a field with tag ",any", unmarshal maps the sub-element to that struct field.
I'm having trouble getting the remainder of an XML envelope into my struct (to show that I have an incomplete mapping)
http://play.golang.org/p/mnFqAcguJQ
I know you can use exactly this method with bson.M from the mgo packages using ,inline - but it looks like map[string]interface{} isn't the answer here.
EDIT: After some additional playing, I've found what I believe to be some additional unexpected behavior.
Switching to []string as a type starts to accept input, but no key/value pairs: http://play.golang.org/p/wCAJeeQa4m
I also planned on adapting encode/xml in order to parse html. I do not see in the documentation that if an element exists more than once, it will save the last instance of it, rather than erroring out: http://play.golang.org/p/0MY__R-Xi3
Here: http://play.golang.org/p/iY8YlxYym0
Since c
is something concrete, it shouldn't use ",any"
, hence it should have a struct definition. C
itself contains a list of arbitrary tags, hence it should contain an []Tag xml:'",any"'
... now to capture the Tag
itself, you need xml.Name to get the tag name and something with ",innerxml".
Finally the result is this:
const xmlString = `<foo><a>1</a><b>2</b><c><c1>3</c1><c2>4</c2></c></foo>`
type Foo struct {
A int `xml:"a"`
B int `xml:"b"`
C Extra `xml:"c"`
}
type Extra struct {
Items []Tag `xml:",any"`
}
type Tag struct {
XMLName xml.Name
Content string `xml:",innerxml"`
}
Or the shorter version:
type Foo struct {
A int `xml:"a"`
B int `xml:"b"`
C struct {
Items []struct {
XMLName xml.Name
Content string `xml:",innerxml"`
} `xml:",any"`
} `xml:"c"`
}
For HTML there is go.net/html. Using xml parser for html will be complicated.