syscall

A controlled transition from user-mode code into the kernel to request a privileged service (I/O, memory, scheduling); the per-crossing cost is the floor below which many performance tricks aim to push.

also known as system-call · sys_enter

stack syscall · kernel

A syscall (system call) is the mechanism by which a user-space program requests a privileged service from the operating system kernel — opening a file, sending bytes over a socket, allocating memory, creating a thread. On x86_64 Linux it’s implemented via the syscall instruction, which transitions the CPU into kernel mode, saves user-mode state, and branches to a kernel entry point that dispatches on the syscall number in %rax.

The performance-relevant fact is that a syscall is not just an expensive function call. It flushes speculative state around the mode transition, invokes KPTI page-table switches on mitigations-on kernels, and touches kernel stacks and per-CPU state. Measured cost on modern x86 ranges from tens to hundreds of nanoseconds depending on kernel version and mitigations.

That per-crossing cost is the floor a lot of low-latency work is trying to push down or amortize. vDSO maps read-only kernel data (clocks, gettimeofday) into user-space so simple queries avoid the transition entirely. io_uring amortizes many I/O ops across a single io_uring_enter — or zero, with SQPOLL. Batching recv/send via recvmmsg/sendmmsg reduces the syscall count for network work.

For benchmarks, always record syscall count per operation (perf stat -e raw_syscalls:sys_enter) — it explains a surprising amount of the gap between two implementations of the same logic.

sources