How can I get finer granularity in the results when profiling Go code?

I'm currently learning Go and I've written a brainfuck parser as a learning exercise. It runs a little slowly, and my guess is that the standard library print functions are the bottleneck. When I profile the program, however, I get a very small report from the pprof tool.

I've basically used all the techniques described here to create a cpu.profile: http://saml.rilspace.org/profiling-and-creating-call-graphs-for-go-programs-with-go-tool-pprof
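For context, here is a minimal sketch of that kind of setup, assuming the standard runtime/pprof package. It is not the actual bfg source: interpret and profileTo are hypothetical names, and interpret is only a stand-in for the real interpreter loop. The output file name cpu.profile matches the one used in the commands that follow.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"runtime/pprof"
)

// interpret is a hypothetical stand-in for the interpreter loop:
// a tight loop that updates a single byte-sized cell.
func interpret(steps int) int {
	cell := 0
	for i := 0; i < steps; i++ {
		cell = (cell + i) % 256
	}
	return cell
}

// profileTo (hypothetical helper) runs work while recording a CPU
// profile to the file at path.
func profileTo(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	// Deferred calls run in reverse order, so the profile is
	// stopped and flushed before the file is closed.
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()

	work()
	return nil
}

func main() {
	err := profileTo("cpu.profile", func() {
		fmt.Println(interpret(50000000))
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

Everything inside the function passed to profileTo is sampled; anything that runs before StartCPUProfile or after StopCPUProfile does not appear in the profile.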

I build the project and run it:

$ go build bfg
$ ./bfg mandelbrot.bf

This outputs the cpu.profile. I then run pprof against that profile:

$ go tool pprof --text bfg cpu.profile > report.txt

Here are the contents of report.txt:

42.56s of 42.56s total (  100%)
  flat  flat%   sum%        cum   cum%
42.56s   100%   100%     42.56s   100%  main.main
     0     0%   100%     42.56s   100%  runtime.goexit

Small, isn't it? This is pretty disappointing, as I was expecting to see standard library and runtime calls in there too, and in much more depth.

I also tried generating a call graph:

$ go tool pprof --pdf bfg cpu.profile > callgraph.pdf

The contents of callgraph.pdf:

[Image: the generated call graph]

What am I doing wrong? I want to see standard library and runtime calls too.

EDIT: I'm using Ubuntu 14.04 and Go 1.4.1.