As part of scaling pods in Kubernetes I want to make sure existing HTTP connections are served gracefully before a pod shuts down. To that end I have implemented the following in Go:
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"os/signal"
	"syscall"

	"github.com/braintree/manners"
)

func main() {
	shutdown := make(chan int)
	// create a notification channel for shutdown signals
	sigChan := make(chan os.Signal, 1)

	// start the http server
	http.HandleFunc("/", hello)
	server := manners.NewWithServer(&http.Server{Addr: ":80", Handler: nil})
	go func() {
		server.ListenAndServe()
		shutdown <- 1
	}()

	// register for interrupt (Ctrl+C) and SIGTERM (docker)
	signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
	go func() {
		<-sigChan
		fmt.Println("Shutting down...")
		server.Close()
	}()

	<-shutdown
}

func hello(w http.ResponseWriter, r *http.Request) {
	// time.Sleep(3000 * time.Millisecond)
	io.WriteString(w, "Hello world!")
}
This looks out for the Docker SIGTERM and gracefully shuts down after existing requests have been served. When I run this container in Kubernetes with 10 instances I can scale up and down without incident, as long as I don't scale down to a single instance. When I scale to a single instance I see a short burst of HTTP errors, then everything looks fine again.
I find this strange: I would have assumed that when scaling down, the proxy is updated first and only then are the containers shut down, so the code above would let in-flight requests be served out.
In my current setup I am running 2 nodes. Maybe the issue appears when the scale drops below the number of nodes and there is some sort of timing issue with etcd updates? Any insight into what is going on here would be really useful.
You should use a readiness check (http://kubernetes.io/v1.0/docs/user-guide/production-pods.html#liveness-and-readiness-probes-aka-health-checks)
that transitions the Pod to "not ready" after you receive a SIGTERM.
Once that happens, the Service will remove the Pod from serving, prior to the delete.
(Without a readiness check, the Service simply doesn't know that the Pod is shutting down until it is actually deleted.)
You may also want to use a PreStop hook that sets readiness to false and then drains all existing requests. PreStop hooks are called synchronously before a Pod is deleted; they are described in the Kubernetes documentation.
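As a rough sketch of how both suggestions could attach to the code in the question: the /readyz and /drain paths, the atomic flag, and the idea that the pod spec's readinessProbe (and, optionally, an httpGet preStop hook) would point at these endpoints are assumptions for illustration, not something prescribed by the answer.

package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"

	"github.com/braintree/manners"
)

// ready reflects whether this pod should receive traffic; it is flipped to 0
// when shutdown starts so that the readiness probe begins to fail.
var ready int32 = 1

func readyz(w http.ResponseWriter, r *http.Request) {
	if atomic.LoadInt32(&ready) == 1 {
		io.WriteString(w, "ok")
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
	io.WriteString(w, "shutting down")
}

func hello(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "Hello world!")
}

func main() {
	shutdown := make(chan int)
	sigChan := make(chan os.Signal, 1)

	server := manners.NewWithServer(&http.Server{Addr: ":80", Handler: nil})

	http.HandleFunc("/", hello)
	http.HandleFunc("/readyz", readyz) // the pod spec's readinessProbe would point here (assumed path)

	// Hypothetical endpoint an httpGet preStop hook could call: fail readiness
	// first, then let manners finish in-flight requests in the background.
	http.HandleFunc("/drain", func(w http.ResponseWriter, r *http.Request) {
		atomic.StoreInt32(&ready, 0)
		go server.Close()
		io.WriteString(w, "draining")
	})

	go func() {
		server.ListenAndServe()
		shutdown <- 1
	}()

	signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
	go func() {
		<-sigChan
		atomic.StoreInt32(&ready, 0) // fail readiness before closing so the Service stops routing here
		fmt.Println("Not ready, draining...")
		server.Close()
	}()

	<-shutdown
}

Whether the drain endpoint or the SIGTERM handler triggers the close is a matter of taste; the important part is that readiness starts failing before the listener stops accepting work, giving the endpoint removal a moment to propagate.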
There is a small window during which a pod that is being removed, but is still alive, remains part of the load-balancing set. As Brendan just said (he beat me by seconds), a readiness check puts this completely under your control and should fix it for you.