I discovered very strange behaviour with go maps recently. The use case is to create a group of integers and have O(1) check for IsMember(id int).
The current implementation is :
func convertToMap(v []int64) map[int64]void {
out := make(map[int64]void, len(v))
for _, i := range v {
out[i] = void{}
}
return out
}
type Group struct {
members map[int64]void
}
type void struct{}
func (g *Group) IsMember(input string) (ok bool) {
memberID, _ := strconv.ParseInt(input, 10, 64)
_, ok = g.members[memberID]
return
}
When i benchmark the IsMember method, until 6 million members, everything looks fine. But above that the map look up is taking 1 second for each lookup!!
The benchmark test:
func BenchmarkIsMember(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
g := &Group{}
g.members = convertToMap(benchmarkV)
for N := 0; N < b.N && N < sizeOfGroup; N++ {
g.IsMember(benchmarkKVString[N])
}
}
var benchmarkV, benchmarkKVString = func(size int) ([]int64, []string{
v := make([]int64, size)
s := make([]string, size)
for i := range v {
val := rand.Int63()
v[i] = val
s[i] = strconv.FormatInt(val, 10)
}
return v, s
}(sizeOfGroup)
Benchmark numbers:
const sizeOfGroup = 6000000
BenchmarkIsMember-8 2000000 568 ns/op 50 B/op 0 allocs/op
const sizeOfGroup = 6830000
BenchmarkIsMember-8 1 1051725455 ns/op 178767208 B/op 25 allocs/op
Anything above group size of 6.8 million gives the same result.
Can someone help me to explain why this is happening, and can anything be done to make this performant while still using maps?
Also, i dont understand why so much memory is being allocated? Even if the time taken is due to collision and then linked list traversal, there shouldn't be any mem allocation, is my thought process wrong?
No need to measure extra allocation for converting slice to map because we just want to measure the lookup operation.
I've slightly modify the benchmark:
func BenchmarkIsMember(b *testing.B) {
fn := func(size int) ([]int64, []string) {
v := make([]int64, size)
s := make([]string, size)
for i := range v {
val := rand.Int63()
v[i] = val
s[i] = strconv.FormatInt(val, 10)
}
return v, s
}
for _, size := range []int{
6000000,
6800000,
6830000,
60000000,
} {
b.Run(fmt.Sprintf("size=%d", size), func(b *testing.B) {
var benchmarkV, benchmarkKVString = fn(size)
g := &deltaGroup{}
g.members = convertToMap(benchmarkV)
b.ReportAllocs()
b.ResetTimer()
for N := 0; N < b.N && N < size; N++ {
g.IsMember(benchmarkKVString[N])
}
})
}
}
And got the following results:
go test ./... -bench=. -benchtime=10s -cpu=1
goos: linux
goarch: amd64
pkg: trash
BenchmarkIsMember/size=6000000 2000000000 0.55 ns/op 0 B/op 0 allocs/op
BenchmarkIsMember/size=6800000 1000000000 1.27 ns/op 0 B/op 0 allocs/op
BenchmarkIsMember/size=6830000 1000000000 1.23 ns/op 0 B/op 0 allocs/op
BenchmarkIsMember/size=60000000 100000000 136 ns/op 0 B/op 0 allocs/op
PASS
ok trash 167.578s
Degradation isn't so significant as in your example.