Optimizing away heap allocations

When I say Go here, I mean the gc compiler implementation.

As far as I know, Go performs escape analysis. The following idiom is seen pretty often in Go code:

func NewFoo() *Foo

Escape analysis would notice that Foo escapes NewFoo and allocate Foo on the heap.

This function could also be written as:

func NewFoo(f *Foo)

and would be used like

var f Foo
NewFoo(&f)

In this case, as long as f doesn't escape anywhere else, f could be allocated on the stack.
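To make the two forms concrete, here is a minimal sketch of both side by side. The struct fields and the name InitFoo are assumptions for illustration only (Go does not allow two functions called NewFoo in one package, so the second form needs a different name).

```go
package main

import "fmt"

// Foo is a placeholder struct; its fields are assumed for illustration.
type Foo struct {
	i, j, k int
}

// NewFoo is the common constructor form: it creates a Foo and returns a pointer to it.
func NewFoo() *Foo {
	return &Foo{i: 42}
}

// InitFoo is a hypothetical name for the second form: the caller owns the
// storage and the function only fills it in.
func InitFoo(f *Foo) {
	*f = Foo{i: 42}
}

func main() {
	a := NewFoo()

	var b Foo
	InitFoo(&b)

	// Both forms produce an identically initialized value.
	fmt.Println(a.i, b.i)
}
```

Where each Foo actually ends up depends on the compiler version and on inlining; go build -gcflags="-m" prints the decision for each allocation site.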

Now to my actual question.

Would it be possible for the compiler to optimize every foo() *Foo into foo(f *Foo), even possibly over multiple levels where Foo is returned in each?

If not, in what kind of cases does this approach fail?

Thank you in advance.

(Not quite an answer but too big for a comment.)

From the comments it seems you might be interested in this small example:

package main

type Foo struct {
    i, j, k int
}

func NewFoo() *Foo {
    return &Foo{i: 42}
}

func F1() {
    f := NewFoo()
    f.i++
}

func main() {
    F1()
}

On Go 1.5, running go build -gcflags="-m" gives:

./new.go:7: can inline NewFoo
./new.go:11: can inline F1
./new.go:12: inlining call to NewFoo
./new.go:16: can inline main
./new.go:17: inlining call to F1
./new.go:17: inlining call to NewFoo
./new.go:8: &Foo literal escapes to heap
./new.go:12: F1 &Foo literal does not escape
./new.go:17: main &Foo literal does not escape

So it inlines NewFoo into F1 and F1 into main (and reports that main itself could be inlined if something were to call it). Although it says that &Foo escapes in NewFoo itself, the value does not escape once NewFoo has been inlined.

The output from go build -gcflags="-m -S" confirms this with a main initializing the object on the stack and not doing any function calls.

Of course this is a very simple example, and any complication (e.g. calling fmt.Print with f) could easily cause it to escape. In general, you shouldn't worry about this too much unless profiling has shown you a problem area and you are trying to optimize a specific section of code. Idiomatic and readable code should trump optimization.
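As a sketch of one such complication, the variant below stores the pointer in a package-level variable. The names sink, Local, and Leak are hypothetical, chosen only for this illustration. A pointer stored in a global has to outlive the call, so the Foo in Leak must live on the heap even after inlining, whereas Local only uses the pointer within its own frame.

```go
package main

import "fmt"

type Foo struct {
	i, j, k int
}

func NewFoo() *Foo {
	return &Foo{i: 42}
}

// sink is a hypothetical package-level variable used to force an escape.
var sink *Foo

// Local only uses the pointer inside its own frame, so once NewFoo is
// inlined the compiler is free to keep the Foo on the stack.
func Local() int {
	f := NewFoo()
	f.i++
	return f.i
}

// Leak publishes the pointer through a package-level variable, so the Foo
// must outlive the call and cannot be stack allocated.
func Leak() int {
	f := NewFoo()
	f.i++
	sink = f
	return f.i
}

func main() {
	fmt.Println(Local(), Leak())
}
```

Running go build -gcflags="-m" on this file should show the difference between the two call sites; the exact wording of the diagnostics varies between Go versions.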

Also note that go test -bench . -benchmem (the -bench flag needs a pattern argument; . matches every benchmark) or, preferably, testing.B's ReportAllocs method can report on the allocations made by benchmarked code, which can help identify something doing more allocations than expected or desired.
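A minimal sketch of such a benchmark follows. Normally BenchmarkNewFoo would live in a _test.go file and be run with go test -bench .; here it is driven through testing.Benchmark so the example is a single runnable program. The sink variable is a hypothetical device that keeps the result alive, which means this particular benchmark deliberately measures the escaping case.

```go
package main

import (
	"fmt"
	"testing"
)

type Foo struct {
	i, j, k int
}

func NewFoo() *Foo {
	return &Foo{i: 42}
}

// sink is a hypothetical package-level variable; storing the result here
// forces each Foo to be heap allocated.
var sink *Foo

// BenchmarkNewFoo asks the testing package to report allocation statistics
// for the loop body.
func BenchmarkNewFoo(b *testing.B) {
	b.ReportAllocs()
	for n := 0; n < b.N; n++ {
		sink = NewFoo()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(BenchmarkNewFoo)
	fmt.Println(res.AllocsPerOp(), "allocs/op,", res.AllocedBytesPerOp(), "B/op")
}
```

Removing the assignment to sink and using the result locally instead would be one way to compare against the non-escaping case, although the compiler may then optimize the loop body away entirely.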

After doing some more research I found what I was looking for.

What I was describing is apparently called "return value optimization" and is entirely feasible, which pretty much answers my question about whether this is possible in Go as well.

Further information about this can be found in the question "What are copy elision and return value optimization?"