I wrote a quick and dirty test to check the performance of Go vs C# in the area of concurrent lookup access and was surprised by the results.
It's a very trivial example and I'm no Go expert, but the test simply performs 1,000,000 lock/check/add/unlock operations on a map. It's single-threaded because I'm only checking these functions:
package main

import (
    "fmt"
    "sync"
    "time"
)

var mu sync.Mutex

func main() {
    cache := make(map[int]int, 1000000)
    start := time.Now()
    for i := 0; i < 1000000; i++ {
        mu.Lock()
        if _, ok := cache[i]; !ok {
            cache[i] = i
        }
        mu.Unlock()
    }
    end := time.Since(start)
    fmt.Println(end)

    var sum int64
    for _, v := range cache {
        sum += int64(v)
    }
    fmt.Println(sum)
}
And the same thing in C# (via LINQPad):
void Main()
{
    var cache = new Dictionary<int, int>(1000000);
    var sw = Stopwatch.StartNew();
    for (var i = 0; i < 1000000; i++)
    {
        lock (cache)
        {
            int d;
            if (cache.TryGetValue(i, out d) == false)
            {
                cache.Add(i, i);
            }
        }
    }
    $"{sw.ElapsedMilliseconds:N0}ms".Dump();

    var sum = 0L;
    foreach (var kvp in cache)
    {
        sum += kvp.Value;
    }
    sum.Dump();
}
I sum the elements of both collections to ensure they match up (499,999,500,000) and print the time taken. In my runs the Go version is consistently slower than the C# one.
I've checked that it's not possible to initialise the size of a map, only its capacity, so I'm wondering if there's anything I could do to improve the performance of the Go map?
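(A quick way to see how much that capacity hint buys, if anything, is to time the raw inserts with and without it, roughly like this inside main above, with "fmt" and "time" already imported:)

// Time 1,000,000 raw inserts with and without a capacity hint.
for _, hint := range []int{0, 1000000} {
    m := make(map[int]int, hint)
    start := time.Now()
    for i := 0; i < 1000000; i++ {
        m[i] = i
    }
    fmt.Printf("hint=%d: %v\n", hint, time.Since(start))
}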
It takes Go 32ms to perform 1,000,000 lock/unlock operations without the map access.
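That baseline is just the same loop with the map access stripped out:

// Mutex-only baseline: same loop, no map operations.
for i := 0; i < 1000000; i++ {
    mu.Lock()
    mu.Unlock()
}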
[S]o I'm wondering if there's anything I could do to improve the performance of the Go map?
No, there is not. Go has basically no performance knobs.
(Note that Go's map type is a very general and robust hash map which uses strong hashing (hardware-assisted where possible) to prevent collision attacks, and which randomizes iteration order. It is "totally general purpose" and not just "a fast dictionary".)
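You can see the randomized iteration order for yourself; this toy program usually prints its keys in a different order on every run:

package main

import "fmt"

func main() {
    // Go deliberately randomizes map iteration order between runs,
    // so this typically prints a different ordering each time.
    m := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4}
    for k := range m {
        fmt.Print(k, " ")
    }
    fmt.Println()
}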
Just to be totally correct: there is the environment variable GOGC to "tune" the GC.
I compiled your C# example using Mono, and ran it on OS X, just to neutralize any "magic" Microsoft might have added to its Windows implementation of Dictionary.
It appears that C# is indeed faster than Go for this particular test, unless there is some Go performance trick we are overlooking:
dict.cs
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class DictionaryTest
{
    public static void Main()
    {
        var cache = new Dictionary<int, int>(1000000);
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < 1000000; i++)
        {
            lock (cache)
            {
                int d;
                if (cache.TryGetValue(i, out d) == false)
                {
                    cache.Add(i, i);
                }
            }
        }
        sw.Stop();
        Console.WriteLine(string.Format("{0}ms", sw.ElapsedMilliseconds));

        var sum = 0L;
        foreach (var kvp in cache)
        {
            sum += kvp.Value;
        }
        Console.WriteLine("Sum: " + sum);
    }
}
If you have the Mono SDK installed, you can compile the above with mcs dict.cs and execute it with mono dict.exe.
I ran it several times, and it takes an average of 47ms compared to my average 149ms for the Go version.
There may be one thing being overlooked here that turns the whole exercise into apples and oranges: synchronization. On the Go side you use sync.Mutex, which can end up in a kernel call when the lock is contended. On the C# side you use lock(){}, which spins first and only falls back to kernel calls when needed. Since your tests run in a single thread anyway, C# likely never even goes near the kernel.
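One way to test that theory is to swap the sync.Mutex in the Go version for a hand-rolled user-space spinlock and re-run. A rough sketch (spinLock here is just an illustration, not production code):

// Needs "runtime" and "sync/atomic" imported.
type spinLock struct{ state int32 }

func (s *spinLock) Lock() {
    // Spin in user space instead of entering the kernel,
    // loosely mimicking the fast path of C#'s lock(){}.
    for !atomic.CompareAndSwapInt32(&s.state, 0, 1) {
        runtime.Gosched()
    }
}

func (s *spinLock) Unlock() {
    atomic.StoreInt32(&s.state, 0)
}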
Use of Mutex is discouraged in Go and channels should be used for synchronization instead.
A couple of suggestions: 1. If you want to benchmark the map/dictionary implementations by themselves, remove the synchronization. 2. If you want to benchmark concurrent performance, write the tests using each language's idiomatic constructs and paradigms; a channel-based sketch of what that might look like in Go follows below.
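Something along these lines, where a single goroutine owns the map and callers talk to it over a channel (a rough sketch only):

package main

import "fmt"

// One goroutine owns the map; callers serialize access through a
// channel instead of a mutex.
type request struct {
    key  int
    done chan bool
}

func main() {
    cache := make(map[int]int, 1000000)
    reqs := make(chan request)

    // Owner goroutine: the only code that touches the map.
    go func() {
        for r := range reqs {
            if _, ok := cache[r.key]; !ok {
                cache[r.key] = r.key
            }
            r.done <- true
        }
    }()

    done := make(chan bool)
    for i := 0; i < 1000000; i++ {
        reqs <- request{i, done}
        <-done
    }
    close(reqs)
    fmt.Println("done inserting 1000000 keys")
}

Bear in mind that two channel operations per key usually cost more than an uncontended mutex, so this measures something different again; it's about idiomatic structure, not a speed trick.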
Cheers!
I found that if I shrink 1,000,000 down to 100,000, the Go version goes from 151.0087ms to 10.0005ms (a 15.1x difference), while the C# version goes from 65ms to 9ms (a 7.2x difference). Does that mean Go's hash map has difficulty handling large maps?
I wrote a simple Go benchmark like this:
func BenchmarkIntMapGet100(b *testing.B) {
    count := 100
    setupIntMap(b, count)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _, _ = intMap[i%count]
    }
}
and I got these results:
BenchmarkIntMapGet10-4 100000000 15.6 ns/op
BenchmarkIntMapGet100-4 100000000 17.1 ns/op
BenchmarkIntMapGet1000-4 50000000 25.7 ns/op
BenchmarkIntMapGet10000-4 50000000 32.3 ns/op
BenchmarkIntMapGet100000-4 30000000 39.2 ns/op
BenchmarkIntMapGet1000000-4 20000000 67.2 ns/op
BenchmarkIntMapGet10000000-4 20000000 82.3 ns/op
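(The benchmark relies on a package-level intMap and a setupIntMap helper that weren't shown above; definitions along these lines would complete it:)

// Assumed definitions for the benchmark above; these live in a
// _test.go file alongside it, with "testing" imported.
var intMap map[int]int

func setupIntMap(b *testing.B, count int) {
    intMap = make(map[int]int, count)
    for i := 0; i < count; i++ {
        intMap[i] = i
    }
}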