Why do I see reads slow down for certain map sizes when using range?

On my computer I am seeing a drop in reads per second when I hit certain map sizes, but it does not degrade in a linear fashion. In fact, the performance drops off immediately and then recovers slowly as the size increases:

$ go run map.go 425984 1 425985
    273578 wps ::   18488800 rps
    227909 wps ::    1790311 rps

$ go run map.go 400000 10000 500000
    271355 wps ::   18060069 rps
    254804 wps ::   18404288 rps
    267067 wps ::   18673778 rps
    216442 wps ::    1984859 rps
    246724 wps ::    2461281 rps
    282316 wps ::    3634125 rps
    216615 wps ::    4989007 rps
    276769 wps ::    6972233 rps
    212019 wps ::    9756720 rps
    286027 wps ::   14488593 rps
    227073 wps ::   17309822 rps

I expected the writes to slow down occasionally (as the underlying data structure was resized), but the reads being sensitive to size (and by an order of magnitude) surprised me.

Here is the code I am using to test this:

package main

import (
    "bytes"
    "fmt"
    "math/rand"
    "os"
    "strconv"
    "time"
)

func main() {
    start, _ := strconv.ParseInt(os.Args[1], 10, 64)
    step, _ := strconv.ParseInt(os.Args[2], 10, 64)
    end, _ := strconv.ParseInt(os.Args[3], 10, 64)
    for n := start; n <= end; n += step {
        runNTimes(n)
    }
}

func randomString() string {
    var b bytes.Buffer

    for i := 0; i < 16; i++ {
        b.WriteByte(byte(0x61 + rand.Intn(26)))
    }

    return b.String()
}

func perSecond(end time.Time, start time.Time, n int64) float64 {
    return float64(n) / end.Sub(start).Seconds()
}

func runNTimes(n int64) {
    m := make(map[string]int64)

    startAdd := time.Now()
    for i := int64(0); i < n; i++ {
        m[randomString()]++
    }
    endAdd := time.Now()

    totalInMap := int64(0)
    startRead := time.Now()
    for _, v := range m {
        //get around unused variable error,
        //v should always be > 0
        if v != 0 {
            totalInMap++
        } 
    }
    endRead := time.Now()

    fmt.Printf("%10.0f wps :: %10.0f rps\n",
        perSecond(endAdd, startAdd, n),
        perSecond(endRead, startRead, totalInMap),
    )
}

Your code does not measure map performance per se. It measures a strange mix of random number generation (not guaranteed to be a constant-time operation), ranging over a map (not guaranteed to be a constant-time operation, and not related in any predictable way to plain map access performance), and perhaps even interference from "stop-the-world" garbage collection.

  • Don't write your own benchmarking functionality; use what package "testing" offers, it's much better. For example, it never relies on a sample size of 1 (as your code wrongly does).
  • Also, generate all the test keys outside of the measured time.
  • Then, to minimize GC effects, call runtime.GC().
  • Now, finally, use B.StartTimer and execute the measured code path.
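The steps above can be sketched using testing.Benchmark, which runs a benchmark function programmatically and handles the iteration count for you. This is just one possible shape, not the only way to do it; the map sizes and 16-byte keys are taken from the question, and the helper name benchMapRead is made up for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"testing"
)

// benchMapRead measures plain map lookups for a map of n entries.
// The keys are generated outside the timed region, and a GC is
// forced before the timer starts, per the points above.
func benchMapRead(n int) testing.BenchmarkResult {
	keys := make([]string, n)
	m := make(map[string]int64, n)
	for i := range keys {
		buf := make([]byte, 16)
		for j := range buf {
			buf[j] = byte('a' + rand.Intn(26))
		}
		keys[i] = string(buf)
		m[keys[i]]++
	}

	return testing.Benchmark(func(b *testing.B) {
		runtime.GC() // minimize interference from earlier allocations
		b.ResetTimer()
		var total int64
		for i := 0; i < b.N; i++ {
			// Plain lookups, not range, so we measure map access itself.
			total += m[keys[i%n]]
		}
		_ = total // keep the loop from being optimized away
	})
}

func main() {
	for _, n := range []int{425984, 425985} {
		fmt.Printf("n=%d: %s\n", n, benchMapRead(n))
	}
}
```

Note that testing.Benchmark decides b.N itself, re-running the measured function until the timing is stable, which is exactly the "sample size" problem the single-pass loop in the question cannot avoid.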

Anyway, whatever you properly measure won't be very useful either. The map code is an implementation detail of the Go runtime and can change at any time. AFAIK, the current implementation is completely different from what was in Go 1.

And finally: yes, a well-tuned map implementation is expected to be potentially sensitive to many things, including the number and/or size and/or type of the items in it; even the architecture, CPU stepping, and cache size can play a role.