io_uring
Linux async I/O interface (kernel 5.1+) built on two shared-memory ring buffers. It batches many operations per syscall, amortizing kernel crossings for low-latency file, network, and storage work.
io_uring is Linux’s unified asynchronous I/O interface, introduced in kernel 5.1 (May 2019) by Jens Axboe. It uses two lock-free shared-memory ring buffers — a submission queue (SQ) and a completion queue (CQ) — between userspace and the kernel. Userspace fills the SQ with operations (read, write, recv, send, accept, openat, fsync, etc.), submits via io_uring_enter, and later reaps completions from the CQ without blocking.
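The head/tail protocol behind such a ring can be sketched in a few lines of C. This is a toy single-producer, single-consumer model, not the kernel ABI: `toy_ring`, `toy_ring_push`, and `toy_ring_pop` are hypothetical names, and the real SQ and CQ layouts differ. What it does share with io_uring is the division of labor: on the SQ, userspace advances the tail and the kernel advances the head; on the CQ the roles are reversed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK (RING_ENTRIES - 1u)

/* Toy stand-in for one ring; the real SQ/CQ memory layouts differ. */
struct toy_ring {
    _Atomic uint32_t head;              /* advanced only by the consumer */
    _Atomic uint32_t tail;              /* advanced only by the producer */
    uint64_t slots[RING_ENTRIES];       /* payload, e.g. a user_data tag */
};

/* Producer side: returns false when the ring is full. */
static bool toy_ring_push(struct toy_ring *r, uint64_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_ENTRIES)
        return false;
    r->slots[tail & RING_MASK] = value;
    /* Release: the slot write must be visible before the new tail is. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty; it never blocks. */
static bool toy_ring_pop(struct toy_ring *r, uint64_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = r->slots[head & RING_MASK];
    /* Release: the slot read must finish before the slot is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, neither side needs a lock. The indices run freely and are masked only when addressing a slot, so `tail - head` gives the fill level even after the 32-bit counters wrap.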
The key design win is that syscall overhead is amortized across many operations, and some modes take the submission syscall off the hot path altogether. With SQPOLL, a kernel thread polls the SQ, so userspace makes no submission syscall while that thread is awake; once the thread idles out, userspace must wake it with io_uring_enter. Registered (fixed) buffers avoid pinning and mapping user memory on every operation, and registered files avoid the per-operation file descriptor lookup and reference counting. Multishot receives (6.0+) let a single SQE produce many completions, amortizing setup across incoming messages.
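The amortization argument is simple arithmetic, sketched below as a back-of-the-envelope model. The function names and the `syscall_ns` figure are hypothetical placeholders, not measurements. The model counts only submission calls and ignores completion handling and the work of each operation.

```c
/* Number of submission syscalls needed for `ops` operations when up to
 * `batch` SQEs go into the ring per call (batch must be >= 1). */
static unsigned long submit_calls(unsigned long ops, unsigned long batch)
{
    return (ops + batch - 1) / batch;   /* ceiling division */
}

/* Submission-syscall overhead attributed to each operation, in nanoseconds.
 * `syscall_ns` is an assumed cost per kernel crossing; measure it on the
 * target machine rather than trusting any fixed number. */
static double overhead_per_op_ns(unsigned long ops, unsigned long batch,
                                 double syscall_ns)
{
    if (ops == 0)
        return 0.0;
    return (double)submit_calls(ops, batch) * syscall_ns / (double)ops;
}
```

With an assumed 500 ns per crossing, 1000 operations submitted one at a time carry 500 ns of submission overhead each; submitted 32 at a time they need 32 calls, or 16 ns each. SQPOLL pushes this term toward zero for as long as the polling thread stays awake.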
For low-latency workloads, io_uring is the de facto Linux answer for “async I/O that isn’t DPDK”. It is generally faster than epoll in ping-pong and batched workloads, comparable or slower at very low queue depth, and tends to win on tail latency when SQPOLL and registered buffers are combined. It is broader than networking: file ops, fsync, statx, splice, and more all flow through the same interface.
Misconception: io_uring is always faster than epoll. It is not: the crossover depends on batch size, workload shape, and which features the running kernel supports.
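Feature availability can be checked at startup before committing to an io_uring code path. The sketch below is a coarse version gate built on uname(2); the helper names are hypothetical. A version number is only a proxy, since distribution kernels backport features and io_uring can be disabled by policy, so probing the ring itself (for example with the IORING_REGISTER_PROBE register opcode) is the more reliable check where it is available.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "major.minor" from a release string such as "6.1.0-13-amd64". */
static bool parse_release(const char *release, int *major, int *minor)
{
    return sscanf(release, "%d.%d", major, minor) == 2;
}

/* True if `release` is at least want_major.want_minor. */
static bool release_at_least(const char *release, int want_major, int want_minor)
{
    int major, minor;
    if (!parse_release(release, &major, &minor))
        return false;               /* unparseable: assume unsupported */
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}

/* Coarse check against the running kernel's reported release. */
static bool running_kernel_at_least(int want_major, int want_minor)
{
    struct utsname u;
    if (uname(&u) != 0)
        return false;
    return release_at_least(u.release, want_major, want_minor);
}
```

A program might call `running_kernel_at_least(5, 1)` before trying io_uring at all and `running_kernel_at_least(6, 0)` before relying on multishot receives, falling back to epoll when either check fails.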