I'm writing a arbitrary JSON parser for golang, the project nearly finished. But I found a confusing problem about the performance:
I want to test the performance about parse a big(100Mb) JSON string, I use the test file itself to init the JSON struct in memory and write the marshaled JSON string to a file, and then read from it, if the file exists already, will not init it in memory, just read from file directly. The performance is totally different: cost about double time to parse when read from file directly. At the same time, I'm testing the performance about parse normal(1Kb) JSON string and deep(2Mb) JSON string, both of these are almost unaffected.
Why? is CPU cache? or GC? or any else?
The code at https://github.com/acrazing/cheapjson, and I opened a issue about this problem at https://github.com/acrazing/cheapjson/issues/1. You can get more detailed information about the benchmark from here.
I am unable to reproduce your issue: https://github.com/acrazing/cheapjson/issues/1. My benchmark results:
$ go version
go version devel +b817359 Sat Jul 22 01:29:58 2017 +0000 linux/amd64
$ rm -rf data
$ ls -la data
ls: cannot access 'data': No such file or directory
$ go test -v -run=! -bench=. -benchmem parser_test.go
2017/07/23 02:42:09 big input size: 117265882, normal input size: 763, deep input size: 2134832
goos: linux
goarch: amd64
BenchmarkUnmarshalBigInput-4 1 1107362634 ns/op 278192256 B/op 8280859 allocs/op
BenchmarkSimpleJsonBigInput-4 1 2680878194 ns/op 569595680 B/op 7575252 allocs/op
BenchmarkUnmarshalNormalInput-4 300000 5685 ns/op 4622 B/op 129 allocs/op
BenchmarkSimpleJsonNormalInput-4 200000 9565 ns/op 5602 B/op 87 allocs/op
BenchmarkUnmarshalDeepInput-4 1000 2085186 ns/op 372922 B/op 5134 allocs/op
BenchmarkSimpleJsonDeepInput-4 100 12311435 ns/op 8911102 B/op 6117 allocs/op
PASS
ok command-line-arguments 15.067s
$ ls -la datatotal 116620
drwxr-xr-x 2 peter peter 4096 Jul 23 02:42 .
drwxr-xr-x 5 peter peter 4096 Jul 23 02:42 ..
-rwxr-xr-x 1 peter peter 117265882 Jul 23 02:42 big.json
-rwxr-xr-x 1 peter peter 2134832 Jul 23 02:42 deep.json
-rwxr-xr-x 1 peter peter 763 Jul 23 02:42 normal.json
$ go test -v -run=! -bench=. -benchmem parser_test.go
2017/07/23 02:42:31 big input size: 117265882, normal input size: 763, deep input size: 2134832
goos: linux
goarch: amd64
BenchmarkUnmarshalBigInput-4 1 1140498937 ns/op 278220704 B/op 8280995 allocs/op
BenchmarkSimpleJsonBigInput-4 1 2685285322 ns/op 569592608 B/op 7575242 allocs/op
BenchmarkUnmarshalNormalInput-4 300000 5685 ns/op 4622 B/op 129 allocs/op
BenchmarkSimpleJsonNormalInput-4 200000 9633 ns/op 5601 B/op 87 allocs/op
BenchmarkUnmarshalDeepInput-4 1000 2086891 ns/op 372927 B/op 5134 allocs/op
BenchmarkSimpleJsonDeepInput-4 100 12387413 ns/op 8911084 B/op 6117 allocs/op
PASS
ok command-line-arguments 11.903s
$ ls -la data
total 116624
drwxr-xr-x 2 peter peter 4096 Jul 23 02:42 .
drwxr-xr-x 5 peter peter 4096 Jul 23 02:42 ..
-rwxr-xr-x 1 peter peter 117265882 Jul 23 02:42 big.json
-rwxr-xr-x 1 peter peter 2134832 Jul 23 02:42 deep.json
-rwxr-xr-x 1 peter peter 763 Jul 23 02:42 normal.json
$
I got similar results on four different machines and two Operating systems: three Linux and one Windows.