为什么在“热身”后解组JSON的速度更快?

I've noticed that when I time JSON unmarshalling in the context of a Go HTTP server, it takes 30,000+ nanoseconds even for small objects. This seemed large to me so I ran some isolated benchmarks, which surprisingly showed an average time of ~500 nanos per unmarshal. To dig into this more, I wrote a program that just does a series of unmarshals on the same small object, which showed that the first unmarshal is slow, and subsequent ones are much faster:

package main

import (
    "time"
    "fmt"
    "encoding/json"
)

var b []byte

type Dog struct {
    Age int `json:"Age"`
}

func unmarshalDog() {
    dogCopy := Dog{}

    start := time.Now().UnixNano()
    json.Unmarshal(b, &dogCopy)
    end := time.Now().UnixNano()
    fmt.Printf("Time to marshal/unmarshal: %d
", end-start)
}

func main() {
    // Marshal an object into a byte array which we will repeatedly unmarshal from
    d := Dog {
        Age: 5,
    }

    var err error
    b, err = json.Marshal(d)
    if err != nil {
        panic(err)
    }

    for i := 0; i < 20; i++ {
        unmarshalDog()
    }

    // Allow the goroutines to finish before terminating execution.
    time.Sleep(3 * time.Second)
}

Output:

$ go run testJSONPerformance.go
Time to marshal/unmarshal: 34127
Time to marshal/unmarshal: 1465
Time to marshal/unmarshal: 979
Time to marshal/unmarshal: 892
Time to marshal/unmarshal: 849
Time to marshal/unmarshal: 814
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 815
Time to marshal/unmarshal: 829
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 819

More interestingly, when I ran the same program but ran each call to unmarshalDog() in a separate goroutine via go unmarshalDog(), the warming up phenomenon disappeared and the average unmarshal time was much higher:

Time to marshal/unmarshal: 36540
Time to marshal/unmarshal: 4652
Time to marshal/unmarshal: 56959
Time to marshal/unmarshal: 3887
Time to marshal/unmarshal: 57068
Time to marshal/unmarshal: 3519
Time to marshal/unmarshal: 37160

I also tried a version where instead of unmarshaling the same byte array, I marshaled and unmarshaled a different object each time (in case there was some kind of runtime caching going on). The results were the same in this case.

I'm very curious to understand what's going on with this apparent "warming up." In the context of an HTTP server, each request gets a different goroutine so each unmarshal is on average quite slow. This seems suboptimal given that apparently Go has the potential to do the unmarshal in 1/50th of the time in a certain context. All intuition appreciated!

The JSON unmarshaller caches information on first unmarshal of any type. Subsequent unmarshal for the type use this cached information.

The cache is shared across goroutines, but there's no code to ensure that only one goroutine attempts to create the first cached value. Once one goroutine stores a value to the cache, all new comers will use that cached value.

The code for the cache is here.