I have a bunch of JSON files, each containing a very large array of complex data. The JSON files look something like:
ids.json:
{
"ids": [1,2,3]
}
names.json:
{
"names": ["Tyrion","Jaime","Cersei"]
}
and so on. (In reality, the array elements are complex struct objects with tens of fields.)
I want to extract just the tag (the top-level key) that specifies what kind of array the file contains. Currently I'm using encoding/json to unmarshal the whole file into a map[string]interface{} and iterating through the map, but that is too costly an operation.
Is there a faster way of doing this, preferably without unmarshaling the entire data?
You can advance the reader to just past the opening curly brace, then use json.Decoder to decode only the first value (the key string) from the reader.
Something along these lines:
sr := strings.NewReader(`{
"ids": [1,2,3]
}`)
for {
b, err := sr.ReadByte()
if err != nil {
fmt.Println(err)
return
}
if b == '{' {
break
}
}
d := json.NewDecoder(sr)
var key string
err := d.Decode(&key)
if err != nil {
fmt.Println(err)
return
}
fmt.Println(key)
https://play.golang.org/p/xJJEqj0tFk9
Additionally, you may wrap the io.Reader you obtained from opening the file with a bufio.Reader to avoid many single-byte reads on the underlying file.
This solution assumes the contents are a valid JSON object. Not that you could avoid that assumption anyway.
I had a play around with Decoder.Token(), reading one token at a time (see this example, line 87), and this works to extract your array label:
const jsonStream = `{
"ids": [1,2,3]
}`
dec := json.NewDecoder(strings.NewReader(jsonStream))
t, err := dec.Token()
if err != nil {
log.Fatal(err)
}
fmt.Printf("First token: %v\n", t)
t, err = dec.Token()
if err != nil {
log.Fatal(err)
}
fmt.Printf("Second token (array label): %v\n", t)