I have an XML file that may include other XML files, referenced relative to the original file through source attributes. The UnmarshalXML method therefore needs access to the location of the original XML file. I extended the XML decoder with a directory field; structs that implement the ExtendedUnmarshaler interface have access to that directory.
The code below shows what I am trying to do. It only works when the root tag carries the source attribute, because DecodeElement parses the whole element, children included, in one call, and I lose control over the child tags.
package main

import (
	"bytes"
	"encoding/xml"
	"io/ioutil"
	"path/filepath"
)

type ExtendedDecoder struct {
	*xml.Decoder
	cwd string // directory of the file being decoded
}
func Unmarshal(data []byte, v interface{}, cwd string) error {
	// DecodeElement has a pointer receiver, so the decoder must be addressable.
	d := &ExtendedDecoder{xml.NewDecoder(bytes.NewReader(data)), cwd}
	return d.DecodeElement(v, nil)
}

func (d *ExtendedDecoder) DecodeElement(v interface{}, start *xml.StartElement) error {
	if v2, ok := v.(ExtendedUnmarshaler); ok {
		// A nil start cannot be dereferenced, so advance to the first start element.
		for start == nil {
			tok, err := d.Decoder.Token()
			if err != nil {
				return err
			}
			if se, ok := tok.(xml.StartElement); ok {
				start = &se
			}
		}
		return v2.ExtendedUnmarshalXML(d, *start)
	}
	// Here lies the problem. I need to parse the element with DecodeElement but by
	// doing so I lose control over all the sub elements.
	return d.Decoder.DecodeElement(v, start)
}
type ExtendedUnmarshaler interface {
	ExtendedUnmarshalXML(d *ExtendedDecoder, start xml.StartElement) error
}

type Tag struct {
	Source string `xml:"source,attr"` // source is an attribute, not a child element
}
func (t *Tag) ExtendedUnmarshalXML(d *ExtendedDecoder, start xml.StartElement) error {
	if err := d.Decoder.DecodeElement(t, &start); err != nil {
		return err
	}
	if t.Source != "" {
		sourcePath := filepath.Join(d.cwd, t.Source)
		_ = sourcePath // Read file at sourcePath
	}
	return nil
}
func main() {
	path := "path/to/file.xml"
	data, _ := ioutil.ReadFile(path)
	t := Tag{}
	// Pass a pointer so the decoder can fill in the struct.
	Unmarshal(data, &t, filepath.Dir(path))
}
Most error handling is omitted to keep the code short.
Is there any way to make this code work for tags other than the root tag?
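To make the limitation concrete, here is a minimal sketch of the standard library behaviour described above. The types `Root` and `Child`, the helper `decodeRoot`, and the sample document are hypothetical and exist only for this illustration. It shows that a single `DecodeElement` call on the root consumes every child element, so no tokens are left for a wrapper to inspect afterwards.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Child and Root are hypothetical types used only for this illustration.
type Child struct {
	Source string `xml:"source,attr"`
}

type Root struct {
	Source   string  `xml:"source,attr"`
	Children []Child `xml:"child"`
}

// decodeRoot decodes the first element of doc and reports whether the
// decoder has reached the end of the input afterwards.
func decodeRoot(doc string) (Root, bool, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var r Root
	if err := d.DecodeElement(&r, nil); err != nil {
		return r, false, err
	}
	// If the children had not been consumed, Token would return one of them.
	_, err := d.Token()
	return r, err == io.EOF, nil
}

func main() {
	doc := `<root source="a.xml"><child source="b.xml"/><child/></root>`
	r, drained, err := decodeRoot(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(r.Source, len(r.Children), r.Children[0].Source, drained)
}
```

The child elements are filled in by the xml package itself, which is why a wrapper method such as ExtendedUnmarshalXML is never consulted for them.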
I would rather parse the source file the moment it is encountered, because scanning for tags with a source attribute in a separate pass does not seem very future-proof: every time a new tag with a source attribute is added, the function that finds the source attributes must be updated.
Update: Looking at the source code, something like this could solve the problem:
func (d *ExtendedDecoder) DecodeElement(v interface{}, start *xml.StartElement) error {
	if v2, ok := v.(ExtendedUnmarshaler); ok {
		return v2.ExtendedUnmarshalXML(d, *start)
	}
	for {
		t, err := d.Decoder.Token()
		if err != nil {
			return err
		}
		switch se := t.(type) {
		case xml.StartElement:
			err := d.DecodeElement(saveAny, &se)
			if err != nil {
				return err
			}
		case xml.EndElement:
			return d.Decoder.DecodeElement(v, start)
		}
	}
}
Here "saveAny" is the struct corresponding to the child tag. Since almost all of the useful functions in the xml package are unexported, I have no idea how to obtain the value of saveAny.