My use case is to transfer a group of members (integers) over the network. We use delta encoding, and on the receiving end we decode the list and store it as a map, map[string]struct{}, so that membership checks are O(1).
The problem I am facing is that the members themselves take only about 15 MB for 2 million integers, but the map occupies more than 100 MB on the heap. It seems that Go's built-in map implementation is not well suited to maps this large. Since this is a client-side SDK, I do not want to reduce the usable memory much, and several such groups may need to stay in memory for long periods, around one week.
Is there a better alternative data structure in Go for this?
type void struct{}

func ToMap(v []int64) map[string]void {
	out := map[string]void{}
	for _, i := range v {
		out[strconv.Itoa(int(i))] = void{}
	}
	return out
}
This is a more memory efficient form of the map:
type void struct{}

func ToMap(v []int64) map[int64]void {
	m := make(map[int64]void, len(v))
	for _, i := range v {
		m[i] = void{}
	}
	return m
}
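Go maps are optimized for integer keys. Pass the exact number of elements as the size hint to make so that the map can allocate its storage up front instead of growing repeatedly while it is filled.
A string key has an implicit pointer to its bytes, so the garbage collector (GC) has to follow that pointer every time it scans the map. An int64 key contains no pointers.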
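To make the key-size difference concrete, here is a minimal sketch that compares the fixed-size part of each key type. The helper name keySlotBytes is hypothetical, and the figures cover only the key values themselves: they ignore the separately allocated string bytes and the map's own bookkeeping, so real heap usage will be higher for both maps.

```go
package main

import (
	"fmt"
	"unsafe"
)

// keySlotBytes (hypothetical helper) returns the size of one key value for
// string keys and for int64 keys. For a string this is only the header
// (pointer plus length), not the bytes the pointer refers to.
func keySlotBytes() (stringKey, intKey uintptr) {
	var s string
	var i int64
	return unsafe.Sizeof(s), unsafe.Sizeof(i)
}

func main() {
	const members = 2000000 // group size taken from the question

	s, i := keySlotBytes()
	fmt.Printf("string key: %d bytes, int64 key: %d bytes\n", s, i)
	fmt.Printf("keys for %d members: string %d bytes, int64 %d bytes\n",
		members, members*s, members*i)
}
```

On a 64-bit platform a string header is 16 bytes against 8 bytes for an int64, so the string keys cost twice as much before a single digit of the formatted number has been stored.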
Here is a Go benchmark for 2 million pseudorandom integers:
package main

import (
	"math/rand"
	"strconv"
	"testing"
)

type void struct{}

func ToMap1(v []int64) map[string]void {
	out := map[string]void{}
	for _, i := range v {
		out[strconv.Itoa(int(i))] = void{}
	}
	return out
}

func ToMap2(v []int64) map[int64]void {
	m := make(map[int64]void, len(v))
	for _, i := range v {
		m[i] = void{}
	}
	return m
}

var benchmarkV = func() []int64 {
	v := make([]int64, 2000000)
	for i := range v {
		v[i] = rand.Int63()
	}
	return v
}()

func BenchmarkToMap1(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()
	for N := 0; N < b.N; N++ {
		ToMap1(benchmarkV)
	}
}

func BenchmarkToMap2(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()
	for N := 0; N < b.N; N++ {
		ToMap2(benchmarkV)
	}
}
Output:
$ go test tomap_test.go -bench=.
BenchmarkToMap1-4 2 973358894 ns/op 235475280 B/op 2076779 allocs/op
BenchmarkToMap2-4 10 188489170 ns/op 44852584 B/op 23 allocs/op
$
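If roughly 45 MB per group is still too much, one alternative worth considering is a sorted slice searched with binary search. This is a sketch under stated assumptions, not part of the benchmark above: the names SortedSet, NewSortedSet and Contains are hypothetical, the set is assumed to be read-only after construction, and lookups become O(log n) instead of O(1). The payoff is that 2 million int64 values occupy 16 MB in a single allocation that contains no pointers.

```go
package main

import (
	"fmt"
	"sort"
)

// SortedSet is a hypothetical read-only set backed by a sorted slice.
// Duplicates in the input are kept; they do not affect lookups.
type SortedSet []int64

// NewSortedSet copies v and sorts the copy, leaving the caller's slice untouched.
func NewSortedSet(v []int64) SortedSet {
	s := make(SortedSet, len(v))
	copy(s, v)
	sort.Slice(s, func(a, b int) bool { return s[a] < s[b] })
	return s
}

// Contains reports whether x is a member, using binary search.
func (s SortedSet) Contains(x int64) bool {
	i := sort.Search(len(s), func(i int) bool { return s[i] >= x })
	return i < len(s) && s[i] == x
}

func main() {
	set := NewSortedSet([]int64{42, 7, 19, 7})
	fmt.Println(set.Contains(19), set.Contains(8)) // prints "true false"
}
```

If the decoded list already arrives in ascending order, which is common for delta-encoded data, the sort step could be skipped and the decoded slice used directly.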