I'm wondering whether there is potential for corruption when the same value is written to a global variable at the same time. My brain tells me there is nothing wrong with this because it's just a location in memory, but I figure I should double-check this assumption.
I have concurrent processes writing to a global map, var linksToVisit map[string]bool. The map tracks which links on a website still need to be crawled.
However, concurrent processes may find the same link on their respective pages, so each will mark that same link as true concurrently. There's nothing wrong with NOT using locks in this case, right? NOTE: I never change the value back to false, so either the key exists and its value is true, or it doesn't exist.
I.e.
var linksToVisit = map[string]bool{}
...
// somewhere later a goroutine finds a link and marks it as true
// it is never marked as false anywhere
linksToVisit[someLink] = true
What happens if concurrent processes write to a global variable the same value?
The results of a data race are undefined.
Run the Go data race detector.
References:
Benign Data Races: What Could Possibly Go Wrong?
The Go Blog: Introducing the Go Race Detector
In Go 1.6, the runtime added lightweight, best-effort detection of concurrent misuse of maps. This release improves that detector with support for detecting programs that concurrently write to and iterate over a map.
As always, if one goroutine is writing to a map, no other goroutine should be reading (which includes iterating) or writing the map concurrently. If the runtime detects this condition, it prints a diagnosis and crashes the program. The best way to find out more about the problem is to run the program under the race detector, which will more reliably identify the race and give more detail.
For example,
package main

import "time"

var linksToVisit = map[string]bool{}

func main() {
	someLink := "someLink"
	go func() {
		for {
			linksToVisit[someLink] = true
		}
	}()
	go func() {
		for {
			linksToVisit[someLink] = true
		}
	}()
	time.Sleep(100 * time.Millisecond)
}
Output:
$ go run racer.go
fatal error: concurrent map writes
$

$ go run -race racer.go
==================
WARNING: DATA RACE
Write at 0x00c000078060 by goroutine 6:
  runtime.mapassign_faststr()
      /home/peter/go/src/runtime/map_faststr.go:190 +0x0
  main.main.func2()
      /home/peter/gopath/src/racer.go:16 +0x6a

Previous write at 0x00c000078060 by goroutine 5:
  runtime.mapassign_faststr()
      /home/peter/go/src/runtime/map_faststr.go:190 +0x0
  main.main.func1()
      /home/peter/gopath/src/racer.go:11 +0x6a

Goroutine 6 (running) created at:
  main.main()
      /home/peter/gopath/src/racer.go:14 +0x88

Goroutine 5 (running) created at:
  main.main()
      /home/peter/gopath/src/racer.go:9 +0x5b
==================
fatal error: concurrent map writes
$
Concurrent map writes are not OK: you will most likely get a fatal error, so a lock should be used.
It is better to use locks when multiple goroutines change the same value concurrently. A mutex protects a value from being accessed while another function is modifying it, much like guarding a database table that is being written to while it is also being read.
As for writing to a map under different keys concurrently, that is not safe in Go either, because:
The typical use of maps did not require safe access from multiple goroutines, and in those cases where it did, the map was probably part of some larger data structure or computation that was already synchronized. Therefore requiring that all map operations grab a mutex would slow down most programs and add safety to few.
Map access is unsafe only when updates are occurring. As long as all goroutines are only reading—looking up elements in the map, including iterating through it using a for range loop—and not changing the map by assigning to elements or doing deletions, it is safe for them to access the map concurrently without synchronization.
So concurrent access is not recommended while the map is being updated. For more information, see the Go FAQ entry on why map operations are not defined to be atomic.
If you really do need concurrent updates, the accesses have to be synchronized:
Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.
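As a rough illustration of the sync.RWMutex approach from the quote above, here is a sketch applied to the question's linksToVisit map. The helper names markLink and isMarked are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu           sync.RWMutex // guards linksToVisit
	linksToVisit = map[string]bool{}
)

// markLink (hypothetical helper) takes the write lock, so no other
// goroutine can read or write the map during the assignment.
func markLink(link string) {
	mu.Lock()
	linksToVisit[link] = true
	mu.Unlock()
}

// isMarked (hypothetical helper) takes the read lock; multiple readers
// may hold it at once, but never while a writer holds the write lock.
func isMarked(link string) bool {
	mu.RLock()
	defer mu.RUnlock()
	return linksToVisit[link]
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); markLink("someLink") }()
		go func() { defer wg.Done(); _ = isMarked("someLink") }()
	}
	wg.Wait()
	fmt.Println(isMarked("someLink"), isMarked("otherLink")) // prints: true false
}
```

An RWMutex mainly pays off when reads are far more frequent than writes; for a write-heavy crawler a plain sync.Mutex may do just as well.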
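As of Go 1.6, the runtime detects simultaneous map writes on a best-effort basis and crashes the program with a fatal error (unlike an ordinary panic, it cannot be recovered). Use a sync.Map (available since Go 1.9) or a mutex to synchronize access.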
See the map value assign implementation: https://github.com/golang/go/blob/fe8a0d12b14108cbe2408b417afcaab722b0727c/src/runtime/hashmap.go#L519
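For the sync.Map suggestion, a minimal sketch might look like the following. It assumes Go 1.9 or later; markLink and countLinks are hypothetical helpers, not part of the question's code. LoadOrStore also reports whether the key was already present, which lets a crawler tell which goroutine saw a link first.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// linksToVisit replaces the plain map; the zero value is ready to use.
var linksToVisit sync.Map

// markLink (hypothetical helper) marks link as true and reports whether
// this call was the first one to record it.
func markLink(link string) bool {
	_, loaded := linksToVisit.LoadOrStore(link, true)
	return !loaded
}

// countLinks (hypothetical helper) counts the stored keys by iterating
// with Range.
func countLinks() int {
	n := 0
	linksToVisit.Range(func(_, _ interface{}) bool {
		n++
		return true // keep iterating
	})
	return n
}

func main() {
	var firsts int32
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if markLink("someLink") {
				atomic.AddInt32(&firsts, 1)
			}
		}()
	}
	wg.Wait()
	// Exactly one goroutine stored the key; the map holds one entry.
	fmt.Println(atomic.LoadInt32(&firsts), countLinks()) // prints: 1 1
}
```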