Understanding the Go scheduler when running infinite-loop goroutines with GOMAXPROCS = 256

I'm playing with Go, running Go 1.7.3 on my 2015 8-core MacBook Pro.

I'm trying to make sense of how the Go scheduler works when runtime.GOMAXPROCS is set to its maximum value (256) and the same number of goroutines are started, each running an infinite loop.

My assumption was that the Go runtime would spawn runtime.GOMAXPROCS OS threads (i.e. 256 threads) and run my goroutines on those threads.

I was expecting the following code to print 256 1's:

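package main

import (
    "fmt"
    "runtime"
)

func main() {
    procs := 256
    runtime.GOMAXPROCS(procs) // request 256 Ps
    for i := 0; i < procs; i++ {
        go func() {
            fmt.Print("1")
            for {} // spin forever without ever yielding
        }()
    }
    for {} // keep main alive, also without yielding
}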

This code prints a different number of 1's each time it runs. Most of the time it prints 142 1's.

Now, there is runtime.Gosched(), which manually invokes the Go scheduler. I was playing with it and found that I get 256 1's printed only if I call runtime.Gosched() in both the goroutines and the main function:

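package main

import (
    "fmt"
    "runtime"
)

func main() {
    procs := 256
    runtime.GOMAXPROCS(procs) // request 256 Ps
    for i := 0; i < procs; i++ {
        go func() {
            fmt.Print("1")
            for { runtime.Gosched() } // yield on every iteration
        }()
    }
    for { runtime.Gosched() } // main yields as well
}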

Can someone explain why 256 1's are not printed by default, and why I need runtime.Gosched()? Shouldn't we get 256 OS threads to run those 256 goroutines? And why do we need to call runtime.Gosched() in both places?

I think it's because there is a limit on the number of threads you can create on your machine. runtime.GOMAXPROCS(256) creates 256 logical processors (called P, for processor) and tries to run them. Each P has its own run queue of goroutines (called G), and those goroutines are executed by an OS thread (called M, for machine).
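As a small sketch of the P side of that model, using only the standard runtime package: runtime.GOMAXPROCS(n) sets the number of Ps and returns the previous setting, and calling it with 0 only queries the current value. The helper name setProcs is made up for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs changes the number of Ps and returns the previous and the
// resulting setting. The name is hypothetical, not part of the runtime.
func setProcs(n int) (prev, cur int) {
	prev = runtime.GOMAXPROCS(n) // returns the previous setting
	cur = runtime.GOMAXPROCS(0)  // an argument of 0 only queries
	return prev, cur
}

func main() {
	prev, cur := setProcs(256)
	fmt.Println("CPUs visible to the runtime:", runtime.NumCPU())
	fmt.Println("previous GOMAXPROCS:", prev)
	fmt.Println("current GOMAXPROCS:", cur)
}
```

Note that this only changes how many Ps exist. It does not by itself create OS threads; the runtime starts Ms as it needs them.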

So what happens is that each of your 256 Ps tries to get an OS thread (M) to execute a G, but your computer doesn't have the resources to run 256 OS threads, which is why you get only 143 ones.

For me, running this program produces 143 ones, and the number of running threads for the process is about 150.
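To check the thread count from inside the program rather than with an external tool, one option is the threadcreate profile in runtime/pprof. A rough sketch follows; the function name threadsCreated is hypothetical, and the number it reports is OS threads created by the runtime so far, which is not necessarily the number running at this instant.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// threadsCreated reports how many OS threads the runtime has created so far,
// according to the built-in "threadcreate" profile.
func threadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

func main() {
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
	fmt.Println("OS threads created so far:", threadsCreated())
}
```

Printing this value periodically from a goroutine that does yield would show how far the runtime actually gets toward 256 threads.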

Because the Go scheduler in this version does not preempt a goroutine that spins in a loop without making function calls, only around 150 goroutines will be running and all the others will be starved of OS threads. Here is an issue describing this behaviour. Calling runtime.Gosched() yields the processor (P), allowing other goroutines to run.

To inspect what the scheduler is doing, you can set the GODEBUG environment variable:
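The effect of yielding can be sketched with a variant that counts started goroutines instead of printing, so the result does not depend on terminal output. This is an illustration under assumptions, not the original program: the name startSpinners and the stop flag are additions that let the function return.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// startSpinners launches n goroutines that each record that they started and
// then spin, yielding on every iteration. It returns the number of goroutines
// that got to run, once all n have started.
func startSpinners(n int) int64 {
	var started int64
	var stop int32
	for i := 0; i < n; i++ {
		go func() {
			atomic.AddInt64(&started, 1)
			for atomic.LoadInt32(&stop) == 0 {
				runtime.Gosched() // hand the P back so other goroutines can run
			}
		}()
	}
	for atomic.LoadInt64(&started) < int64(n) {
		runtime.Gosched() // main must yield too, or it can hog its own P
	}
	atomic.StoreInt32(&stop, 1) // let the spinners exit
	return atomic.LoadInt64(&started)
}

func main() {
	fmt.Println(startSpinners(256)) // prints 256
}
```

Because every loop yields, each goroutine eventually gets a turn regardless of how many OS threads are available.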

env GODEBUG=scheddetail=1,schedtrace=1000 ./cpu3
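If you would rather launch the traced binary from Go than from the shell, the same setting can be passed through the child's environment. A minimal sketch, assuming the binary is the ./cpu3 from the command above; the helper schedTraceEnv is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// schedTraceEnv returns a copy of env with a GODEBUG entry that enables
// schedtrace at the given interval, optionally with scheddetail=1.
func schedTraceEnv(env []string, intervalMs int, detail bool) []string {
	v := fmt.Sprintf("GODEBUG=schedtrace=%d", intervalMs)
	if detail {
		v += ",scheddetail=1"
	}
	return append(append([]string{}, env...), v)
}

func main() {
	cmd := exec.Command("./cpu3") // binary name taken from the command above
	cmd.Env = schedTraceEnv(os.Environ(), 1000, true)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr // the runtime writes the trace to standard error
	// Start rather than Run: the traced program loops forever.
	if err := cmd.Start(); err != nil {
		fmt.Fprintln(os.Stderr, "start failed:", err)
		return
	}
	fmt.Println("started pid", cmd.Process.Pid)
}
```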

Here is the output (note that it uses the P, M, G terminology):

SCHED 0ms: gomaxprocs=8 idleprocs=5 threads=5 spinningthreads=1 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=1 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=0 syscalltick=0 m=3 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=1 syscalltick=0 m=2 runqsize=0 gfreecnt=0
  P2: status=1 schedtick=0 syscalltick=0 m=4 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P4: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M4: p=2 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=true blocked=false lockedg=-1
  M3: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M2: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=true blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=1
  G1: status=1(chan receive) m=-1 lockedm=0
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111SCHED 1004ms: gomaxprocs=256 idleprocs=0 threads=150 spinningthreads=0 idlethreads=5 runqueue=0 gcwaiting=1 nmidlelocked=0 stopwait=143 sysmonwait=0
  P0: status=1 schedtick=1 syscalltick=3 m=0 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=3 syscalltick=1 m=2 runqsize=0 gfreecnt=0
  ...
  P141: status=1 schedtick=3 syscalltick=1 m=143 runqsize=0 gfreecnt=0
  P142: status=1 schedtick=2 syscalltick=3 m=144 runqsize=0 gfreecnt=0
  P143: status=3 schedtick=1 syscalltick=38 m=-1 runqsize=0 gfreecnt=0
  ...
  P255: status=3 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M149: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=-1
  ...
  M144: p=142 curg=181 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M143: p=141 curg=177 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M142: p=140 curg=179 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  ...
  M112: p=110 curg=186 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  ...
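To pull the interesting counters out of a summary line like the ones above, a small parser is enough. This is a sketch under assumptions: parseSchedLine and pStatusNames are hypothetical names, and the status table reflects my reading of the P states in the Go 1.7 runtime source, so treat it as version-specific. Read that way, the status=3 on P143 through P255 would be the gcstop state, which fits the gcwaiting=1 and stopwait=143 in the same snapshot.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSchedLine extracts the numeric key=value counters from a schedtrace
// summary line. Anything before "SCHED" (such as program output sharing the
// line) and any non-numeric field is ignored.
func parseSchedLine(line string) map[string]int {
	fields := map[string]int{}
	i := strings.Index(line, "SCHED")
	if i < 0 {
		return fields
	}
	for _, tok := range strings.Fields(line[i:]) {
		kv := strings.SplitN(tok, "=", 2)
		if len(kv) != 2 {
			continue
		}
		if n, err := strconv.Atoi(kv[1]); err == nil {
			fields[kv[0]] = n
		}
	}
	return fields
}

// pStatusNames maps the numeric P status in the trace to a name.
// Assumption: values as in the Go 1.7 runtime; other versions may differ.
var pStatusNames = []string{"idle", "running", "syscall", "gcstop", "dead"}

func main() {
	line := "1111SCHED 1004ms: gomaxprocs=256 idleprocs=0 threads=150 " +
		"spinningthreads=0 idlethreads=5 runqueue=0 gcwaiting=1 " +
		"nmidlelocked=0 stopwait=143 sysmonwait=0"
	f := parseSchedLine(line)
	fmt.Println(f["gomaxprocs"], f["threads"], f["stopwait"]) // prints: 256 150 143
	fmt.Println("P status 3:", pStatusNames[3])
}
```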