Speeding up syscall in Go

I've been working on a VPN written in Go and I'm starting to optimize the data flow. From a cursory glance, the implementation seems sound: there are no memory leaks and CPU doesn't seem to be a constraint.

So I moved to pprof, and the problem I am seeing is that most of the execution time is spent in syscall.Syscall. I took a 6-second profile during a running iperf throughput test and this is what I see:

[image: pprof profile output]

The test runs with both the client and the server inside Docker containers, with the client getting a --link to the server. Running iperf directly on the base bridge network yields around 40 Gbit/s of throughput; iperf through this VPN implementation, on top of the same bridge, gets about 500 Mbit/s.

A quick look at htop shows that about 3/4 of the time is spent in the system (kernel time).

I've tried a couple of approaches to speed up the single-client case, but I can't seem to find a way to reduce the cost of writing packets in a VPN server. NB: iperf uses full MTU-sized packets during its test, which rules out some obvious optimizations.

Listing of Syscall:

[image: pprof listing of syscall.Syscall]

I'm not sure why this shows CMPQ taking all the time; I'd think that should be attributed to SYSCALL.

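pprof is a sampling profiler. While the OS is executing the system call, the user-space Program Counter (PC) that the profiler samples is sitting at the instruction right after SYSCALL, which here is CMPQ. So the time spent inside the kernel gets attributed to CMPQ rather than to SYSCALL itself.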

Speeding up syscall in Go

You can make the SYSCALL less often. You can improve the OS SYSCALL mechanism. You can improve the OS code that you asked the SYSCALL to execute. You can use better hardware. And so on.