I would expect that doing a range iteration over the elements of an array would not carry any runtime overhead, but it appears to be 8x slower than raw array access:
func BenchmarkSumRange(b *testing.B) {
nums := [5]int{0, 1, 2, 3, 4}
for n := 0; n < b.N; n++ {
sum := 0
for i, _ := range nums {
sum += nums[i]
}
}
}
func BenchmarkSumManual(b *testing.B) {
nums := [5]int{0, 1, 2, 3, 4}
for n := 0; n < b.N; n++ {
sum := 0
sum += nums[0]
sum += nums[1]
sum += nums[2]
sum += nums[3]
sum += nums[4]
}
}
Benchmark output:
BenchmarkSumRange-8 1000000000 2.18 ns/op
BenchmarkSumManual-8 2000000000 0.28 ns/op
This might make sense if it were a slice whose length were not known at compile-time rather than an array, in which case the runtime code would have to involve a loop with bounds checks. But in the case of an array whose size is known at compile-time, the compiler could just swap out the range iteration for manual access given that the overhead is substantial.
Note: I also tried the more idiomatic range loop over elements:
sum := 0
for _, el := range nums {
sum += el
}
This is even slower (4 ns/op).
A side question: is this overhead present in other languages like Rust? It seems to be a violation of zero-cost abstraction, and fairly annoying in performance-sensitive contexts if there is no fast alternative to writing out the array accesses manually.
First, observe what actually happens in a for
loop:
for i := range sums {
// your code goes here
}
On every iteration, you are incrementing i, which is clearly an overhead.
Why does the compiler not replace it with raw accesses for each iteration you may ask? This would be completely unreasonable, as your binary size would increase drastically.
Consider looping over a normal range. It stores the value in another variable and then accesses it elsewhere.
Actually go's for
loops are the fastest of many languages, I'm not sure about the exact reason for it but you can get more information in this post.
I have checked the for
loop performance in several other languages like java, python and rust, and all of them were slower than go's implementation.