Slice preallocation is slower than make

Extracted from: https://github.com/golang/go/wiki/Performance

A special case of allocation combining is slice array preallocation. If you know a typical size of the slice, you can preallocate a backing array for it as follows:

type X struct {
    buf      []byte
    bufArray [16]byte // buf usually does not grow beyond 16 bytes.
}

func MakeX() *X {
    x := &X{}
    // Preinitialize buf with the backing array.
    x.buf = x.bufArray[:0]
    return x
}
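
To see what the wiki snippet buys you, here is a minimal sketch built on the same X and MakeX. The helper sharesArray is a hypothetical name added only for illustration; it checks whether buf still points into the embedded array after appends.

```go
package main

import "fmt"

type X struct {
	buf      []byte
	bufArray [16]byte // buf usually does not grow beyond 16 bytes.
}

func MakeX() *X {
	x := &X{}
	// Preinitialize buf with the backing array.
	x.buf = x.bufArray[:0]
	return x
}

// sharesArray (hypothetical helper) reports whether buf is non-empty and
// its first element is the first element of the embedded array.
func sharesArray(x *X) bool {
	return len(x.buf) > 0 && &x.buf[0] == &x.bufArray[0]
}

func main() {
	x := MakeX()
	fmt.Println(len(x.buf), cap(x.buf)) // length 0, capacity 16

	// Appending up to the capacity reuses bufArray.
	x.buf = append(x.buf, make([]byte, 16)...)
	fmt.Println(sharesArray(x))

	// One more byte exceeds the capacity, so append moves to a new array.
	x.buf = append(x.buf, 0)
	fmt.Println(sharesArray(x))
}
```

As long as the slice stays within 16 bytes, append writes into the struct's own array; only growth past that size makes append allocate a separate backing array.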

Testing the difference:

package main

import "testing"

type buf struct {
        b []byte
}
type buf2 struct {
        b  []byte
        bt [2]byte
}

func B() *buf2 {
        b := &buf2{}
        b.b = b.bt[:0]
        return b
}

func BenchmarkMake(b *testing.B) {
        for i := 0; i < b.N; i++ {
                b := &buf{}
                b.b = make([]byte, 2, 2)
        }
}

func BenchmarkPreallocation(b *testing.B) {
        for i := 0; i < b.N; i++ {
                b := B()
                _ = b
        }
}

Output:

[user@SYSTEM temp]$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: temp
BenchmarkMake-8                 100000000               12.7 ns/op             2 B/op          1 allocs/op
BenchmarkPreallocation-8        50000000                40.4 ns/op            32 B/op          1 allocs/op

Why is preallocation slow?
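
One thing worth checking is whether the two benchmarks do the same work. In BenchmarkMake the &buf{} value never leaves the loop body, while B() returns a pointer to its struct. The following is a rough sketch, not a definitive measurement: it assumes two hypothetical package-level variables (sinkBuf, sinkBuf2) so that both structs have to live on the heap, and it counts allocations with testing.AllocsPerRun instead of timing them.

```go
package main

import (
	"fmt"
	"testing"
)

type buf struct {
	b []byte
}

type buf2 struct {
	b  []byte
	bt [2]byte
}

// Hypothetical sinks: storing the results in package-level variables
// keeps the compiler from placing the structs on the stack.
var (
	sinkBuf  *buf
	sinkBuf2 *buf2
)

// newBuf builds the struct and a separate backing array with make.
func newBuf() *buf {
	b := &buf{}
	b.b = make([]byte, 2, 2)
	return b
}

// newBuf2 builds the struct and points the slice at its embedded array.
func newBuf2() *buf2 {
	b := &buf2{}
	b.b = b.bt[:0]
	return b
}

// allocsMake returns the average number of allocations per newBuf call.
func allocsMake() float64 {
	return testing.AllocsPerRun(1000, func() { sinkBuf = newBuf() })
}

// allocsPrealloc returns the average number of allocations per newBuf2 call.
func allocsPrealloc() float64 {
	return testing.AllocsPerRun(1000, func() { sinkBuf2 = newBuf2() })
}

func main() {
	fmt.Println("make:         ", allocsMake(), "allocs/op")
	fmt.Println("preallocation:", allocsPrealloc(), "allocs/op")
}
```

If the make variant reports more allocations than the preallocated one here, the original BenchmarkMake figure likely reflects only the 2-byte array, with the struct itself never reaching the heap.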

Why does the slice expression [:0], which only points buf at the array, appear to use 32 bytes?
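
A quick way to investigate the 32 B/op figure is to print the sizes of the two structs from the benchmark. This is a minimal sketch; the helpers wordSize, bufSize and buf2Size are hypothetical names added for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

type buf struct {
	b []byte
}

type buf2 struct {
	b  []byte
	bt [2]byte
}

// wordSize is the size of a pointer-sized integer on this platform.
func wordSize() uintptr { return unsafe.Sizeof(uintptr(0)) }

// bufSize is the size of a struct holding only the slice header.
func bufSize() uintptr { return unsafe.Sizeof(buf{}) }

// buf2Size is the size of the slice header plus the embedded array,
// including any padding the compiler adds for alignment.
func buf2Size() uintptr { return unsafe.Sizeof(buf2{}) }

func main() {
	fmt.Println("word:", wordSize())
	fmt.Println("buf: ", bufSize())
	fmt.Println("buf2:", buf2Size())
}
```

On a 64-bit platform such as the linux/amd64 machine in the output above, a slice header is three words (24 bytes), and the 2-byte array plus padding to 8-byte alignment brings buf2 to 32 bytes. That matches the 32 B/op reported for BenchmarkPreallocation, which suggests the bytes come from heap-allocating the whole buf2 value rather than from the [:0] expression itself.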