I've been working on a VPN written in Go and I'm starting to try to optimize the data flow. From a cursory glance, the implementation seems sound: there are no memory leaks and CPU doesn't seem to be a constraint.
So I moved to pprof, and the problem I am seeing is that most of the execution time is spent in syscall.Syscall. I took a 6-second profile of a running iperf throughput test and this is what I see:
This test is run with both the client and the server inside Docker containers, with the client getting a --link to the server. Running iperf directly over the base bridge network yields around 40 Gbit/s of throughput; iperf through this VPN implementation, on top of the same network, gets about 500 Mbit/s.
A quick look at htop shows that about 3/4 of the CPU time is spent in the kernel (system time).
I've tried a couple of approaches to speed up the single-client case, but I can't seem to find a way to reduce the cost of writing packets in a VPN server. NB: iperf uses full MTU-sized packets during its test, which rules out some obvious optimizations.
listing Syscall:
I'm not sure why this shows CMPQ taking all the time; I'd expect that time to be attributed to SYSCALL.
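pprof is a process sampling profiler. While the OS is executing the system call, the sampled Program Counter (PC) points at the CMPQ that follows SYSCALL, because that is where the process resumes. The time spent inside the kernel is therefore charged to CMPQ.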
Speeding up syscall in Go
You can make the SYSCALL less often. You can improve the OS SYSCALL mechanism. You can improve the OS code that you asked the SYSCALL to execute. You can use better hardware. And so on.
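As a rough illustration of the first option, here is a minimal sketch that coalesces several packets into one write by buffering them with a length prefix. Everything in it is an assumption for illustration, not your VPN's code: countingWriter is a hypothetical stand-in for the real descriptor, the 2-byte length framing is made up, and the trick only applies when the tunnel link is a byte stream (e.g. TCP). A UDP socket or a TUN device needs one write per packet, so there you would have to look at batch calls such as sendmmsg instead.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"fmt"
	"io"
)

// countingWriter is a hypothetical stand-in for a socket. It counts Write
// calls; on a real descriptor each call would be roughly one write syscall.
type countingWriter struct {
	writes int
	bytes  int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	c.bytes += len(p)
	return len(p), nil
}

// writeFramed writes one packet preceded by a 2-byte big-endian length, so
// the receiver can split a coalesced stream back into packets.
// The framing format is an assumption made for this sketch.
func writeFramed(w io.Writer, pkt []byte) error {
	var hdr [2]byte
	binary.BigEndian.PutUint16(hdr[:], uint16(len(pkt)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(pkt)
	return err
}

// sendBatch frames every packet into a buffer of bufSize bytes and flushes
// once at the end, so dst sees far fewer Write calls than packets.
func sendBatch(dst io.Writer, pkts [][]byte, bufSize int) error {
	bw := bufio.NewWriterSize(dst, bufSize)
	for _, p := range pkts {
		if err := writeFramed(bw, p); err != nil {
			return err
		}
	}
	return bw.Flush()
}

func main() {
	// 32 MTU-sized packets, similar to what iperf generates.
	pkts := make([][]byte, 32)
	for i := range pkts {
		pkts[i] = make([]byte, 1500)
	}

	direct := &countingWriter{}
	for _, p := range pkts {
		if err := writeFramed(direct, p); err != nil {
			panic(err)
		}
	}

	batched := &countingWriter{}
	if err := sendBatch(batched, pkts, 64*1024); err != nil {
		panic(err)
	}

	fmt.Printf("direct:  %d writes, %d bytes\n", direct.writes, direct.bytes)
	fmt.Printf("batched: %d writes, %d bytes\n", batched.writes, batched.bytes)
}
```

The trade-off is latency: packets sit in the buffer until it is flushed, so a real implementation would probably flush when the input queue runs dry rather than after a fixed count.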