How do I structure Go programs effectively for the best garbage collector runs?

Optimizing code for the Go GC seems to have become more important recently, now that GC runs are so strongly time-optimized. I was recently told that how much the GC accomplishes in a run "depends on your pattern of heap memory usage," but I'm not sure what that means or entails from the perspective of a programmer in the language. Or is that not something that can easily be controlled?

I have read through the recent book "The Go Programming Language" by Alan A. A. Donovan and Brian W. Kernighan, but there is nothing about this topic in it. And all the information on this topic on the internet is from years ago, so it doesn't really apply.

Some things I currently do include:

  • Making sure pointers/objects are only ever stored/remembered where they need to be
  • Allocating objects with the capacity I expect them to need, or at least a sane one
  • Not duplicating data
  • When possible, streaming data through functions instead of putting all of it on the heap up front
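
To make the capacity point in the list above concrete, here is a minimal sketch. The function name `squares` is made up for illustration; the idea is only that a slice whose final size is known can be allocated once instead of being regrown by `append`.

```go
package main

import "fmt"

// squares (a hypothetical example) returns the first n squares.
// Because n is known up front, the slice is allocated once with the
// right capacity instead of being grown and recopied by repeated appends.
func squares(n int) []int {
	out := make([]int, 0, n) // one allocation, capacity n
	for i := 0; i < n; i++ {
		out = append(out, i*i) // stays within capacity, so no reallocation
	}
	return out
}

func main() {
	s := squares(1000)
	fmt.Println(len(s), cap(s))
}
```

Without the capacity argument, the same loop would allocate a new, larger backing array several times as the slice grows, and each discarded array becomes garbage.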

I am also a bit annoyed that strings and byte slices are always copied when converting from one to the other (because strings are immutable). So when I am going from one to the other, and it's a safe operation, I just recast their pointers to the other type using unsafe.
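
For readers unfamiliar with the trick, a pointer recast of this kind might look roughly like the sketch below. The helper name `bytesToString` is an assumption for illustration, and the conversion is only valid if the byte slice is never modified afterwards. Go 1.20 and later also provide `unsafe.String` and `unsafe.SliceData` for the same purpose.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString (a hypothetical helper) reinterprets b as a string
// without copying the bytes.
// ASSUMPTION: the caller never modifies b afterwards; doing so would
// change a value the rest of the program treats as immutable.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	b := []byte("hello")
	s := bytesToString(b) // shares memory with b, no allocation
	fmt.Println(s)
}
```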

Are all of these practices worth it to help the GC run faster and clear more? Is there anything else I could do?

If it were a simple matter of a list of do's and don'ts we could simply write a program to optimize memory usage.

The first step is to write correct, well-designed, maintainable, and easy-to-read code.

Next, using Go's testing package, benchmark critical functions. For example, a real case:

BenchmarkOriginal      30000     44349 ns/op       52792 B/op      569 allocs/op
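
A benchmark that produces a line like the one above could be sketched as follows. This is not the code that was actually measured: `triangle` and `BenchmarkTriangle` are stand-ins that build a triangular array the straightforward way. In a `_test.go` file the benchmark would be run with `go test -bench=. -benchmem`; here `testing.Benchmark` is used so the sketch runs as an ordinary program.

```go
package main

import (
	"fmt"
	"testing"
)

// triangle (hypothetical) builds an n-row triangular array the obvious way:
// one allocation for the outer slice plus one allocation per row.
func triangle(n int) [][]int {
	rows := make([][]int, n)
	for i := range rows {
		rows[i] = make([]int, i+1)
	}
	return rows
}

// sink keeps the result reachable so the work is not optimized away.
var sink [][]int

// BenchmarkTriangle has the standard benchmark signature.
func BenchmarkTriangle(b *testing.B) {
	b.ReportAllocs() // report B/op and allocs/op, like -benchmem
	for i := 0; i < b.N; i++ {
		sink = triangle(64)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside "go test".
	r := testing.Benchmark(BenchmarkTriangle)
	fmt.Println(r.String(), r.MemString())
}
```

The allocs/op column is the number to watch: every allocation is something the garbage collector later has to find and free.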

Use Go's profiling tools. Read the source and executable code to see what is going on.
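
As one possible way to collect a profile, the sketch below writes a heap profile with the standard `runtime/pprof` package. The `work` function and the `retained` variable are placeholders for real program code. In practice the profile would be written to a file and inspected with `go tool pprof`; a buffer is used here only to keep the example self-contained.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// retained keeps allocations alive so they show up in the heap profile.
var retained [][]byte

// work (a placeholder) stands in for the code being profiled.
func work() {
	for i := 0; i < 1000; i++ {
		retained = append(retained, make([]byte, 1024))
	}
}

// heapProfile runs work and returns a heap profile in pprof format.
func heapProfile() ([]byte, error) {
	work()
	runtime.GC() // make the profile reflect up-to-date statistics
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	p, err := heapProfile()
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Printf("heap profile: %d bytes\n", len(p))
}
```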

Implement strategies, such as a single underlying array and full slice expressions, to reduce GC memory allocations and CPU time. Run a final benchmark.

BenchmarkOptimized    100000     13198 ns/op       32992 B/op        3 allocs/op

In this case, 569 allocations of the elements of a triangular array were reduced to 3 allocations, a 99% reduction in allocations with a corresponding 70% reduction in CPU time. There's a lot less for the garbage collector (GC) to do too.

Of course, this made the code harder to read (more complex and more obscure) and thus harder to maintain.

Don't over-optimize. Do you really have nothing better to do? The best optimization is often buying a bigger computer.