What does runtime.adjustdefers mean in pprof output?

We are running a Go program that spent most of the time doing GC. We took a memory profile and opened it with 'go tool pprof -alloc_objects'. Running 'top5' in the pprof console shows the following:

My question is, what does runtime.adjustdefers mean?

(pprof) top5
4576708929 of 7330217181 total (62.44%)
Dropped 765 nodes (cum <= 36651085)
Showing top 5 nodes out of 88 (cum >= 970919101)
      flat  flat%   sum%        cum   cum%
2035058528 27.76% 27.76% 2035058528 27.76%  runtime.adjustdefers
 996366409 13.59% 41.36% 1284278077 17.52%  github.com/pelletier/go-buffruneio.init
 627682563  8.56% 49.92%  916069310 12.50%  github.com/prometheus/common/expfmt.MetricFamilyToText
 509166106  6.95% 56.86%  509166106  6.95%  encoding/csv.(*Reader).ReadAll
 408435323  5.57% 62.44%  970919101 13.25%  golang.org/x/net/html.init

The Go Programming Language Specification

Defer statements

A "defer" statement invokes a function whose execution is deferred to the moment the surrounding function returns, either because the surrounding function executed a return statement, reached the end of its function body, or because the corresponding goroutine is panicking.


go/src/runtime/stack.go:

func adjustdefers(gp *g, adjinfo *adjustinfo) {
    // Adjust defer argument blocks the same way we adjust active stack frames.
    tracebackdefers(gp, adjustframe, noescape(unsafe.Pointer(adjinfo)))

    // Adjust pointers in the Defer structs.
    // Defer structs themselves are never on the stack.
    for d := gp._defer; d != nil; d = d.link {
        adjustpointer(adjinfo, unsafe.Pointer(&d.fn))
        adjustpointer(adjinfo, unsafe.Pointer(&d.sp))
        adjustpointer(adjinfo, unsafe.Pointer(&d._panic))
    }
}

go/src/runtime/stack.go:

// Copies gp's stack to a new stack of a different size.
// Caller must have changed gp status to Gcopystack.
//
// If sync is true, this is a self-triggered stack growth and, in
// particular, no other G may be writing to gp's stack (e.g., via a
// channel operation). If sync is false, copystack protects against
// concurrent channel operations.
func copystack(gp *g, newsize uintptr, sync bool) {
    // . . .
    // allocate new stack
    new := stackalloc(uint32(newsize))
    if stackPoisonCopy != 0 {
        fillstack(new, 0xfd)
    }
    // . . .
    // Compute adjustment.
    var adjinfo adjustinfo
    adjinfo.old = old
    adjinfo.delta = new.hi - old.hi
    // . . .
    // Adjust remaining structures that have pointers into stacks.
    // We have to do most of these before we traceback the new
    // stack because gentraceback uses them.
    adjustctxt(gp, &adjinfo)
    adjustdefers(gp, &adjinfo)
    adjustpanics(gp, &adjinfo)
    if adjinfo.sghi != 0 {
        adjinfo.sghi += adjinfo.delta
    }
    // . . .
}

From my reading of the code, when a goroutine's stack is resized, copystack copies it to a new stack and calls adjustdefers to update the pointers held by the goroutine's deferred calls so that they point into the new stack.


You say that you are "running a Go program that spent most of the time doing GC." The second-highest entry in your profile comes from the package github.com/pelletier/go-buffruneio, and its code looks inefficient. Here is a simple benchmark that compares it with bufio for reading runes.

package main

import (
    "bufio"
    "bytes"
    "io"
    "testing"

    "github.com/pelletier/go-buffruneio"
)

var buf = make([]byte, 64*1024)

func BenchmarkBuffruneio(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        r := buffruneio.NewReader(bytes.NewBuffer(buf[:cap(buf)]))
        for {
            // Name the result ch so it does not shadow the built-in rune type.
            ch, _, err := r.ReadRune()
            if err == io.EOF || ch == buffruneio.EOF {
                break
            }
        }
    }
}

func BenchmarkBufio(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        r := bufio.NewReader(bytes.NewBuffer(buf[:cap(buf)]))
        for {
            _, _, err := r.ReadRune()
            if err == io.EOF {
                break
            }
        }
    }
}

Output:

$ go test -v -bench=.
goos: linux
goarch: amd64
pkg: so/runes
BenchmarkBuffruneio-2        200    9395482 ns/op    4198721 B/op    131078 allocs/op
BenchmarkBufio-2            3000     333731 ns/op       4208 B/op         2 allocs/op
PASS
ok      so/runes    3.878s
$