madvise

Linux syscall that hints the kernel about a memory region's future access pattern (sequential, random, will-need, don't-need, hugepage, etc.); the main tuning dial for mmap-backed workloads.

also known as madvise(2) · MADV_HUGEPAGE

stack syscall · memory

madvise is a Linux syscall that lets a process give the kernel hints about its intended use of a virtual memory range. Most hints leave program semantics untouched: they tune the kernel’s readahead, huge-page, and reclaim behavior for that region. The exceptions are MADV_DONTNEED and MADV_FREE, which can discard the contents of the pages they cover. The start address must be page-aligned. It’s the primary tuning dial for any workload that uses mmap heavily (databases, in-memory indexes, memory-mapped files).
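
The basic call shape is easiest to see on a file mapping. The sketch below is illustrative only: `sum_file_sequential` is a hypothetical helper name, and it assumes a Linux/glibc build. It maps a file read-only, advises sequential access, and scans it once. The hint is treated as optional, so a failed madvise is reported but does not abort the scan.

```c
#define _DEFAULT_SOURCE /* exposes madvise and MADV_* under strict -std=c11 */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum the bytes of a file through a read-only mapping.
 * Returns 0 on success and stores the sum in *out, or -1 on error. */
int sum_file_sequential(const char *path, unsigned long long *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    /* mmap returns a page-aligned address, which madvise requires.
     * The hint is advisory here, so a failure is reported but not fatal. */
    if (madvise(p, len, MADV_SEQUENTIAL) != 0)
        perror("madvise(MADV_SEQUENTIAL)");

    unsigned long long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    munmap(p, len);
    *out = sum;
    return 0;
}
```

A caller would pass a path and read the result from `*out`. Swapping MADV_SEQUENTIAL for another non-destructive hint changes only performance, not the computed sum.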

Common advice values:

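  • MADV_SEQUENTIAL: expect sequential access; the kernel reads ahead aggressively and may free pages soon after they are accessed.
  • MADV_RANDOM: expect random access; readahead is of little use, so the kernel keeps it minimal.
  • MADV_WILLNEED: expect access soon; the kernel starts reading the range into memory ahead of time to warm the cache. It does not map the pages, so the first touch can still take a minor fault.
  • MADV_DONTNEED: I’m done with this range; the kernel drops the pages right away. On next access, private anonymous pages read as zeros and file-backed pages are repopulated from the underlying file.
  • MADV_HUGEPAGE: try to back this range with transparent huge pages; this is how a region opts in when the system THP mode is set to madvise.
  • MADV_NOHUGEPAGE: opt this range out of THP.
  • MADV_POPULATE_READ / MADV_POPULATE_WRITE (Linux 5.14+): pre-fault pages into the page tables without the side effects of mlock.
  • MADV_FREE (Linux 4.5+, private anonymous memory only): lazy free; the kernel may reclaim the pages under memory pressure. Unlike DONTNEED, reads return the old contents until reclaim happens, and a write to a page cancels the free for that page. After reclaim, the pages read as zeros.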
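
The zero-fill behavior of MADV_DONTNEED on private anonymous memory is the one item in this list that visibly changes what a program reads back, so it is worth a small demonstration. This is a minimal sketch assuming Linux; `byte_after_dontneed` is a hypothetical name used only for illustration.

```c
#define _DEFAULT_SOURCE /* exposes madvise and MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: returns the first byte of a private anonymous page
 * after it was filled with 0xAB and then dropped with MADV_DONTNEED.
 * Returns -1 if the mapping or the madvise call fails. */
int byte_after_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len); /* fault the page in and dirty it */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int value = p[0]; /* this access faults in a fresh zero-filled page */
    munmap(p, len);
    return value;
}
```

On Linux the function should return 0 rather than 0xAB. The mapping itself stays valid throughout; only its contents are discarded. With MADV_FREE in place of MADV_DONTNEED the result would depend on whether the kernel had reclaimed the page yet, which is why that variant is not shown as a deterministic check.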

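The common performance story: latency-sensitive workloads that touch large mmap regions usually want MADV_RANDOM plus MADV_WILLNEED on hot regions, so that the first access does not stall on disk I/O (a major fault), plus MADV_HUGEPAGE on stable large regions to reduce TLB pressure. Where even minor faults matter, MADV_POPULATE_READ or MADV_POPULATE_WRITE pre-faults the range. MADV_DONTNEED returns memory to the kernel without munmap/mmap churn; MADV_FREE is the lazier alternative for anonymous memory that is likely to be reused soon.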
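
The hot-region recipe can be sketched as a pair of helpers. Everything here is an illustrative assumption rather than an established API: `page_span`, `advise_hot_range`, and `advise_hugepage_best_effort` are hypothetical names, and the code assumes Linux with glibc. A hot sub-range rarely starts on a page boundary, so it is widened to whole pages before the hints are applied.

```c
#define _DEFAULT_SOURCE /* exposes madvise and MADV_* under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: widen [addr, addr + len) to whole pages, because
 * madvise rejects a start address that is not page-aligned. */
void page_span(uintptr_t addr, size_t len, size_t page,
               uintptr_t *start, size_t *span)
{
    assert(page != 0 && (page & (page - 1)) == 0); /* power of two */
    uintptr_t begin = addr & ~(uintptr_t)(page - 1);
    uintptr_t end = (addr + len + page - 1) & ~(uintptr_t)(page - 1);
    *start = begin;
    *span = (size_t)(end - begin);
}

/* Hypothetical helper: mark a hot sub-range of an existing mapping as
 * randomly accessed and ask the kernel to start bringing it into memory.
 * Returns 0 on success, -1 with errno set otherwise. */
int advise_hot_range(void *addr, size_t len)
{
    uintptr_t start;
    size_t span;
    page_span((uintptr_t)addr, len, (size_t)sysconf(_SC_PAGESIZE),
              &start, &span);

    if (madvise((void *)start, span, MADV_RANDOM) != 0)
        return -1;
    return madvise((void *)start, span, MADV_WILLNEED);
}

/* Hypothetical helper: request transparent huge pages for a page-aligned
 * region. Returns 1 if the kernel accepted the hint, 0 if it did not
 * (for example, when THP support is not built in). */
int advise_hugepage_best_effort(void *addr, size_t len)
{
    return madvise(addr, len, MADV_HUGEPAGE) == 0 ? 1 : 0;
}
```

A caller would apply `advise_hot_range` to the hot part of a file mapping right after mmap, and `advise_hugepage_best_effort` to a large, long-lived region. Treating the huge-page request as best-effort keeps the program working on kernels or configurations where THP is unavailable.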
