I am trying to reproduce an issue and have reduced it to a minimal case with the following code. If I close all the channels (bypassing the i != 0 test), things work as expected: the WaitGroup counter drops to zero, done is signalled, and main exits fine. When I skip closing one of these channels (on purpose), I expect the main routine to wait, since the WaitGroup semaphore will block indefinitely in this case. Instead, I get an error: "fatal error: all goroutines are asleep - deadlock!". Why is that? Have I missed something fundamental, or is this the runtime being overzealous?
package main

import (
	"fmt"
	"sync"
)

const N int = 4

func main() {
	done := make(chan struct{})
	defer close(done)
	fmt.Println("Beginning...")
	chans := make([]chan int, N)
	var wg sync.WaitGroup
	for i := 0; i < N; i++ {
		wg.Add(1)
		chans[i] = make(chan int)
		go func(i int) { // p0
			defer wg.Done()
			for m := range chans[i] {
				fmt.Println("Received ", m)
			}
			fmt.Println("Ending p", i)
		}(i)
	}
	go func() {
		wg.Wait()
		done <- struct{}{} // signal main that we are done
	}()
	for i := 0; i < N; i++ {
		fmt.Println("Closing c", i)
		if i != 0 { // Skip #0 so wg doesn't reach '0'
			close(chans[i])
		}
	}
	<-done // wait to receive signal from anonymous join function
	fmt.Println("Ending.")
}
UPDATE: I edited the code to avoid the race condition. I am still getting this error.
The if i != 0 is there intentionally: I want wg.Wait to block forever (with its semaphore never reaching 0). Why can't I do that? It seems the same as using <-done without a matching done <- struct{}{} somewhere else. Would the compiler complain in that case too?
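Here's what's going on:
- The first go func(i int) { goroutine does not exit because chans[0] is not closed.
- wg.Done is not called, because that goroutine never returns.
- wg.Wait() blocks forever because of the previous point.
- Nothing is ever sent on done, so main blocks forever on <-done.
At that point every goroutine is blocked and nothing is left that could wake any of them, so the Go runtime reports "all goroutines are asleep - deadlock!" and aborts. This is a runtime check, not a compile-time one.
You can fix the deadlock by removing the if i != 0 {, but there is another issue. There is a race on the wait group. It's possible that wg.Done() is called before wg.Add(1) is called. Call wg.Add() before starting the goroutine to avoid the race.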
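To make the fix concrete, here is a minimal sketch with both changes applied: every channel is closed, and wg.Add(1) runs before each goroutine is started. The helper name runWorkers and the finished counter are my own additions for illustration and are not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers is a hypothetical helper: it starts n workers that each range
// over their own channel, closes every channel, waits for the workers, and
// returns how many of them exited.
func runWorkers(n int) int {
	chans := make([]chan int, n)
	var wg sync.WaitGroup
	var finished int32 // illustrative counter, not in the original code

	for i := 0; i < n; i++ {
		chans[i] = make(chan int)
		wg.Add(1) // Add before the goroutine starts, so Done cannot run first.
		go func(i int) {
			defer wg.Done()
			for m := range chans[i] {
				fmt.Println("Received", m)
			}
			atomic.AddInt32(&finished, 1)
		}(i)
	}

	for i := 0; i < n; i++ {
		close(chans[i]) // close every channel, including chans[0]
	}

	wg.Wait() // returns because every worker's range loop has ended
	return int(atomic.LoadInt32(&finished))
}

func main() {
	fmt.Println("workers finished:", runWorkers(4))
}
```

In this sketch main calls wg.Wait() directly, so the extra done channel and the joining goroutine are not needed.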
The if statement in your for loop doesn't let the first channel (chans[0]) close, so its goroutine is left waiting for something to happen on chans[0]. That keeps the deferred wg.Done() from ever running, which in turn never lets wg.Wait() finish, which then never lets done <- struct{}{} get sent.
So in short, the if statement in your loop leaves one channel open, and that causes a deadlock because no goroutine can make any progress.
As @CodingPickle pointed out, move your wg.Add(1) to the beginning of your for loop to prevent any race conditions.