I have a gRPC server, and I have implemented graceful shutdown for it roughly like this:
func main() {
	// Some code
	term := make(chan os.Signal)
	go func() {
		if err := grpcServer.Serve(lis); err != nil {
			term <- syscall.SIGINT
		}
	}()
	signal.Notify(term, syscall.SIGTERM, syscall.SIGINT)
	<-term
	server.GracefulStop()
	closeDbConnections()
}
This works fine. If I instead call grpcServer.Serve() in the main goroutine and move the shutdown handler logic into another goroutine, the statements after server.GracefulStop() usually do not execute. If closeDbConnections() is executed at all, only some of the DB connections are closed.
server.GracefulStop() is a blocking call. grpcServer.Serve() definitely finishes before server.GracefulStop() completes. So, how long does the main goroutine take to stop after this call returns?
The problematic code:
func main() {
	term := make(chan os.Signal)
	go func() {
		signal.Notify(term, syscall.SIGTERM, syscall.SIGINT)
		<-term
		server.GracefulStop()
		closeDbConnections()
	}()
	if err := grpcServer.Serve(lis); err != nil {
		term <- syscall.SIGINT
	}
}
This case does not work as expected. After server.GracefulStop() is done, closeDbConnections() may or may not run (it usually does not run to completion). I tested the latter case by sending SIGINT with Ctrl-C from my terminal.
Can someone please explain this behavior?
I'm not sure I understand your question (please clarify it), but I would suggest refactoring your main like this:
func main() {
	// ...
	// buffered, so the Serve goroutine can always deliver its error and exit
	errChan := make(chan error, 1)
	// buffered, because signal.Notify does not block when sending to the channel
	stopChan := make(chan os.Signal, 1)
	// bind OS events to the signal channel
	signal.Notify(stopChan, syscall.SIGTERM, syscall.SIGINT)
	// run blocking call in a separate goroutine, report errors via channel
	go func() {
		if err := grpcServer.Serve(lis); err != nil {
			errChan <- err
		}
	}()
	// terminate your environment gracefully before leaving main function
	defer func() {
		server.GracefulStop()
		closeDbConnections()
	}()
	// block until either OS signal, or server fatal error
	select {
	case err := <-errChan:
		log.Printf("Fatal error: %v\n", err)
	case <-stopChan:
	}
}
I don't think it's a good idea to mix system events and server errors the way your example does: if Serve fails, you ignore the error and emit a system event that never actually happened. Try an approach with two transports (channels) for the two different kinds of events that cause process termination.