Golang: Why doesn't os.Exit work inside goroutines?

I have a research program with a very simple algorithm. When it succeeds, the goroutine should end the program via os.Exit(0). I have been waiting one day, two days... What? :)

Here is the simple code:

package main

import "os"

func main() {
    for {
        go func() { os.Exit(0) }()
    }
}

And my questions:

  1. Why doesn't os.Exit terminate the goroutine?
  2. What is the correct way to terminate (stop) a goroutine?

Playground: http://play.golang.org/p/GAeOI-1Ksc

You terminate a goroutine by returning from its function. If you need to ensure that the goroutine runs to completion, you have to wait for it. This is usually done with a sync.WaitGroup, or by synchronizing the goroutines via channels.

In your example, you first need to ensure that there is no possibility of spawning an infinite number of goroutines. Because there are no synchronization points between main and the new goroutines, there's no guarantee that any of them will execute the os.Exit call while the main loop is running.

The usual way of waiting for any number of goroutines to complete is to use a sync.WaitGroup, which will ensure they have all executed before main exits.

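var wg sync.WaitGroup
for i := 0; i < 10000; i++ {
    wg.Add(1) // register the goroutine before starting it
    go func() { defer wg.Done() }() // Done runs when the goroutine returns
}

wg.Wait() // block until every goroutine has called Done
fmt.Println("done")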
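To tie this back to the question, here is a minimal sketch of the "return to stop, let main exit" pattern as a complete program. The function name firstSuccess and the rule that only the worker whose id equals target succeeds are made up for the demo; your real algorithm would go where the comparison is.

```go
package main

import (
	"fmt"
	"sync"
)

// firstSuccess starts n workers and returns the id of the worker that
// reports success, or -1 if none does. The success rule (id == target)
// is an invented stand-in for the real research algorithm.
func firstSuccess(n, target int) int {
	var wg sync.WaitGroup
	found := make(chan int, 1) // buffered so the successful worker never blocks

	for id := 0; id < n; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if id == target {
				found <- id // report success
			}
			// Returning from the function is what ends the goroutine.
		}(id)
	}

	wg.Wait()    // every worker has returned
	close(found) // safe: no sender is left
	id, ok := <-found
	if !ok {
		return -1 // channel was empty: nobody succeeded
	}
	return id
}

func main() {
	fmt.Println("winner:", firstSuccess(100, 42))
	// Returning from main ends the whole process with exit status 0,
	// so no goroutine needs to call os.Exit.
}
```

The number of goroutines is bounded, and main decides when the process ends instead of hoping that one of the workers gets scheduled.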

You've run into a sticky corner of the Go scheduler. The answer is that os.Exit does cause the entire process to exit, but the way you had it, the goroutines were never running.

What probably happened was that the for loop kept adding new goroutines to the list of available goroutines, but since the entire process was only running in one OS thread, the Go scheduler never got around to actually scheduling a different goroutine, and just kept running that for loop without ever running any of the goroutines you'd spawned. Try this instead:

package main

import "os"

func main() {
    for {
        go func() { os.Exit(0) }()
        func() {}()
    }
}

If you run it on the Go Playground, it should work.

OK, the fact that the above code works while yours doesn't should be pretty mysterious. The reason this works is that the Go scheduler was, at the time of writing, non-preemptive (cooperative): unless a running goroutine voluntarily gives the scheduler the option to run something else, nothing else will run. (Go 1.14 and later preempt goroutines asynchronously, so on those versions a tight loop no longer starves other goroutines.)

Now obviously you've never written code that includes commands to give the scheduler a chance to run. What happens is that when your code is compiled, the Go compiler automatically inserts these into your code. And here's the key to why the above code works: one of the times that a goroutine might decide to run the scheduler is when a function is called. So by adding the func() {}() call (which obviously does nothing), we've allowed the compiler to add in a call to the scheduler, giving this code the opportunity to schedule different goroutines. Thus, one of the spawned goroutines runs, calls os.Exit, and the process exits.

EDIT: The function call itself may not be sufficient in the event that the compiler inlines the call (or, in this case, removes it entirely since it does nothing). runtime.Gosched(), on the other hand, is guaranteed to work.
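A minimal sketch of the runtime.Gosched() variant, written so that it can be run and checked without killing the process. The helper name spawnAndYield is made up for this example; it only illustrates the explicit yield and is not taken from the question's code.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// spawnAndYield is a hypothetical helper: it starts fn in a new goroutine
// and spins until fn has finished. Each iteration calls runtime.Gosched(),
// which yields the processor so that other goroutines can run.
func spawnAndYield(fn func()) {
	var finished int32
	go func() {
		fn()
		atomic.StoreInt32(&finished, 1) // signal completion
	}()
	for atomic.LoadInt32(&finished) == 0 {
		runtime.Gosched() // explicit yield to the scheduler
	}
}

func main() {
	spawnAndYield(func() { fmt.Println("goroutine ran") })
}
```

In the loop from the question, the equivalent change is to replace the func() {}() call with runtime.Gosched(); a goroutine that calls os.Exit(0) then gets its chance to run and ends the process.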

Implement a deadhand or kill switch

package main

import (
        "fmt"
        "os"
        "time"
)

const maxNoTickle = 50          // will bail out after this many no tickles
const maxWorking = 20           // pretendWork() will tickle this many times
const deadTicks = 250           // milliseconds for deadHand() to check for tickles
const reportTickles = 4         // consecutive tickles or no tickles to print something

var (
        tickleMe bool           // tell deadHand() we're still alive
        countNoTickle int       // consecutive no tickles
        countGotTickle int      // consecutive tickles
)

/**
*       deadHand() - callback to kill program if nobody checks in after some period
*/
func deadHand() {
        if !tickleMe {
                countNoTickle++
                countGotTickle = 0
                if countNoTickle > maxNoTickle {
                        fmt.Println("No tickle max time reached. Bailing out!")
                        // panic("No real panic. Just checking stack")
                        os.Exit(0)
                }
                if countNoTickle % reportTickles == 0 {
                        // print dot for consecutive no tickles
                        fmt.Printf(".")
                }
        } else {
                countNoTickle = 0
                countGotTickle++
                tickleMe = false        // FIXME: might have race condition here
                if countGotTickle % reportTickles == 0 {
                        // print tilde for consecutive tickles
                        fmt.Printf("~")
                }
        }
        // call ourselves again
        time.AfterFunc(deadTicks * time.Millisecond, deadHand)
}

/**
*       init() - required to start deadHand
*/
func init() {
        time.AfterFunc(deadTicks * time.Millisecond, deadHand)
        tickleMe = true
}

/**
*       pretendWork() - your stuff that does its thing
*/
func pretendWork() {
        for count := 0; count < maxWorking; count++ {
                tickleMe = true // FIXME: might have race condition here
                // print W pretending to be busy
                fmt.Printf("W")
                time.Sleep(100 * time.Millisecond)
        }
}

func main() {
        go pretendWork()        // the worker defined above; workTillDone was never declared
        for {
                // oops, program went loop-d-loopy
        }
}
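
The two FIXME comments above point at a real data race on tickleMe. One possible way around it, sketched here under my own assumptions, is to keep the flag in an int32 that is only touched through sync/atomic and to report expiry on a channel. The names watchdog, newWatchdog, Tickle, Expired and demo are invented for this sketch, and the intervals are shortened so it finishes quickly.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// watchdog is a hypothetical race-free variant of the dead-hand idea.
type watchdog struct {
	tickled   int32         // 1 if work checked in since the last tick; atomic access only
	maxMisses int           // consecutive silent ticks tolerated
	expired   chan struct{} // closed when the limit is exceeded
}

func newWatchdog(interval time.Duration, maxMisses int) *watchdog {
	w := &watchdog{maxMisses: maxMisses, expired: make(chan struct{})}
	go w.run(interval)
	return w
}

// Tickle tells the watchdog that the program is still alive.
func (w *watchdog) Tickle() { atomic.StoreInt32(&w.tickled, 1) }

// Expired returns a channel that is closed once the watchdog gives up.
func (w *watchdog) Expired() <-chan struct{} { return w.expired }

func (w *watchdog) run(interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	misses := 0
	for range ticker.C {
		// Swap reads and clears the flag in one atomic step.
		if atomic.SwapInt32(&w.tickled, 0) == 1 {
			misses = 0
			continue
		}
		misses++
		if misses > w.maxMisses {
			close(w.expired)
			return
		}
	}
}

// demo tickles a few times, then goes silent and reports whether the
// watchdog fired within a generous timeout.
func demo() bool {
	w := newWatchdog(5*time.Millisecond, 3)
	for i := 0; i < 5; i++ {
		w.Tickle()
		time.Sleep(2 * time.Millisecond)
	}
	select {
	case <-w.Expired():
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	if demo() {
		fmt.Println("No tickle max time reached. Bailing out!")
		// A real program would call os.Exit here, as deadHand() does.
	}
}
```

Unlike the version above, this one blocks on a channel instead of spinning in an empty for loop, so it does not depend on the scheduler preempting a busy main goroutine.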