copy-on-write

Memory-management technique that shares a read-only mapping until a write triggers a private copy. It is the reason fork() is cheap, and the reason a mis-tuned workload can take a latency spike on its first store.

also known as cow · COW

stack kernel · memory

Copy-on-write (CoW) is a memory-management technique where two virtual mappings share the same physical pages read-only until one of them writes to a page, at which point the kernel allocates a fresh physical page, copies the original contents, and rebinds the writing mapping to the new page. The other mapping still sees the old data.

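Linux’s fork() is the canonical example: instead of eagerly duplicating the parent’s address space for the child, the kernel write-protects the private writable pages in both parent and child and defers each copy to the first write. Only the page tables are copied up front, which is why forking a 40 GB process takes on the order of milliseconds to hundreds of milliseconds rather than the many seconds a full copy would need; the cost still grows with the amount of mapped memory. The same mechanism backs mmap(MAP_PRIVATE), which maps a file so that the first store to a page redirects it to a private anonymous copy and the underlying file is never modified.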
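Both behaviours can be observed from user space. The following is a minimal sketch, assuming a Linux or other Unix system with Python 3; the function names fork_isolation and private_mapping_isolation are hypothetical, chosen for illustration. It shows only the visible semantics (the writer gets a private copy, the other side keeps the old data), not the page-level mechanics.

```python
import mmap
import os
import tempfile


def fork_isolation() -> tuple:
    """Return (child_view, parent_view) of a buffer the child modifies after fork."""
    buf = bytearray(b"original")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this store lands on the child's private copy of the page.
        os.close(read_fd)
        buf[:] = b"modified"
        os.write(write_fd, bytes(buf))
        os._exit(0)
    # Parent: read what the child saw, then look at our own buffer.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_view, bytes(buf)


def private_mapping_isolation() -> tuple:
    """Return (private_view, file_contents) after a store through MAP_PRIVATE."""
    with tempfile.TemporaryFile() as f:
        f.write(b"original" + b"\0" * (mmap.PAGESIZE - 8))
        f.flush()
        with mmap.mmap(f.fileno(), mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            m[:8] = b"modified"  # first store to the page triggers the copy
            private_view = bytes(m[:8])
        f.seek(0)
        file_contents = f.read(8)  # the file itself is untouched
    return private_view, file_contents


if __name__ == "__main__":
    print(fork_isolation())
    print(private_mapping_isolation())
```

In both cases the first element of the returned tuple is the writer's view and the second is what the other side (the parent process, or the file) still holds.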

The low-latency gotcha: the first write to a shared CoW page takes a page fault. The kernel has to allocate a new physical page (possibly reclaiming one), copy the old contents, update the PTE, invalidate the stale TLB entry, and resume. That costs anywhere from a few microseconds to tens of microseconds on a typical box, and more under memory pressure: a very visible tail-latency spike if it happens on a hot path. Mitigations include madvise(MADV_POPULATE_WRITE) (Linux 5.14+, pre-faults a mapping as writable so the copies happen up front), pre-touching pages after fork, or avoiding fork in latency-critical sections altogether.
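The first two mitigations can be combined into one helper. This is a sketch under stated assumptions, not a definitive implementation: prefault_writable is a hypothetical name, the numeric value 23 for MADV_POPULATE_WRITE is taken from the generic Linux UAPI headers and is used only when the Python build does not export the constant, and the fallback simply stores to one byte per page.

```python
import mmap
import sys

# Assumption: 23 is the generic Linux value of MADV_POPULATE_WRITE (5.14+).
# It is used only if this Python build does not export the constant.
MADV_POPULATE_WRITE = getattr(mmap, "MADV_POPULATE_WRITE", 23)


def prefault_writable(m: mmap.mmap) -> str:
    """Take all write faults on `m` now, before the latency-critical path.

    Returns "madvise" if the kernel accepted MADV_POPULATE_WRITE, otherwise
    "touch" after falling back to one store per page.
    """
    if sys.platform.startswith("linux"):
        try:
            m.madvise(MADV_POPULATE_WRITE)
            return "madvise"
        except (OSError, ValueError, AttributeError):
            pass  # older kernel or unsupported mapping: fall back below
    for offset in range(0, len(m), mmap.PAGESIZE):
        # Writing a byte back to itself keeps the contents but still
        # performs a store, which is what breaks the CoW sharing.
        m[offset] = m[offset]
    return "touch"


if __name__ == "__main__":
    region = mmap.mmap(-1, 16 * mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    print(prefault_writable(region))
    region.close()
```

A forked child would call such a helper on its hot mappings once, during start-up, so that the copies are paid for before any request is served.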

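Related concepts: lazy allocation (the other reason first-touch on a fresh mmap region faults), huge pages (a CoW fault on a 2 MB hugetlbfs page copies the whole 2 MB; for transparent huge pages, older kernels did the same, while newer ones split the mapping and copy a single 4 KB page), and madvise(MADV_MERGEABLE), which opts a region into KSM de-duplication; pages merged by KSM are themselves shared copy-on-write.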
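Lazy allocation is easy to observe by counting minor page faults around the first store to each page of a fresh anonymous mapping. The sketch below assumes a Unix system where getrusage reports ru_minflt; minor_faults_on_first_touch is a hypothetical name. The count is only approximate, since the interpreter may take unrelated faults in the same window and huge-page settings can make one fault cover many pages.

```python
import mmap
import resource


def minor_faults_on_first_touch(pages: int = 64) -> int:
    """Count minor faults taken while storing once to each page of a new mapping."""
    length = pages * mmap.PAGESIZE
    with mmap.mmap(-1, length,
                   flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS) as m:
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        for offset in range(0, length, mmap.PAGESIZE):
            m[offset] = 1  # first touch: the kernel backs the page only now
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    return after - before


if __name__ == "__main__":
    print("minor faults:", minor_faults_on_first_touch())
```

Running the loop a second time over the same mapping should add few or no faults, because the pages are already resident and writable.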

related

sources