In code where a global map with an expensive to generate value structure may be modified by multiple concurrent threads, which pattern is correct?
// equivalent to map[string]*activity where activity is a
// fairly heavyweight structure
var ipActivity sync.Map
// version 1: not safe with multiple threads, I think
func incrementIP(ip string) {
val, ok := ipActivity.Load(ip)
if !ok {
val = buildComplexActivityObject()
ipActivity.Store(ip, val)
}
updateTheActivityObject(val.(*activity), ip)
}
// version 2: inefficient, I think, because a complex object is built
// every time even through it's only needed the first time
func incrementIP(ip string) {
tmp := buildComplexActivityObject()
val, _ := ipActivity.LoadOrStore(ip, tmp)
updateTheActivity(val.(*activity), ip)
}
// version 3: more complex but technically correct?
func incrementIP(ip string) {
val, found := ipActivity.Load(ip)
if !found {
tmp := buildComplexActivityObject()
// using load or store incase the mapping was already made in
// another store
val, _ = ipActivity.LoadOrStore(ip, tmp)
}
updateTheActivity(val.(*activity), ip)
}
Is version three the correct pattern given Go's concurrency model?
Option 1 obviously can be called by multiple goroutines with a new ip
concurrently, and only the last one in the if
block would get stored. This possibility is greatly increased the longer buildComplexActivityObject
takes, as there is more time in the critical section.
Option 2 works, but calls buildComplexActivityObject
every time, which you state is not what you want.
Given that you want to call buildComplexActivityObject
as infrequently as possible, the third option is the only one that makes sense.
The sync.Map
however cannot protect the actual activity
values referenced by the stored pointers. You also need synchronization there when updating the activity
value.