A cache-line visualizer

An interactive that lets you feel cache-line behavior: array-of-structs vs struct-of-arrays, false sharing in action, 64-byte alignment, the padding trick — the visceral version of things every systems engineer has memorized.

date: 2026-04-11
author: Jonathan
read: 3 min
stack: cache · cpu · memory

methodology: reproducible

hardware: TBD — x86_64 reference (64-byte cache lines); Apple M-series addendum for 128-byte L2 lines
kernel: Linux 6.18 LTS for the benchmark sidecar; visualization itself is browser-only
compiler: clang 19, -O3 -march=native for any inline microbenchmark; visualizer is vanilla JS + SVG
dataset: synthetic access patterns over {16B, 64B, 4KB} stride arrays; not a corpus benchmark

TL;DR

Every systems engineer “knows” what a cache line is. Most have it as a memorized fact — 64 bytes on x86, false sharing is bad, alignas(64) helps. This post replaces the memorized fact with a visceral one. The Phase 1 version ships a static SVG walkthrough; the Phase 3 full interactive adds drag-and-drop layouts and live MESI animation.

Methodology

Field	Value
CPU reference	TBD — x86_64 baseline (64B lines); Apple M-series addendum (128B L2)
Kernel	Linux 6.18 LTS for any sidecar benchmark
Compiler	clang 19, `-O3 -march=native` for the optional inline microbenchmark
Dataset	synthetic stride-access arrays; not a corpus benchmark
Scope	Phase 1 = static SVG; Phase 3 = full interactive (SPEC §4.4 Tier 4)
Repro	`lowlat-ms/bench-widgets/cache-line-visualizer`
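
To make the Dataset row concrete, here is a minimal sketch of what the three strides mean in terms of lines touched. It assumes a line-aligned base address and the 64-byte x86_64 baseline; `lines_touched` and `kLineBytes` are illustrative names, not part of the visualizer's source.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLineBytes = 64;  // x86_64 baseline; an assumption, not a measured value

// Counts the distinct cache lines touched by `accesses` reads at byte offsets
// 0, stride, 2*stride, ... from a line-aligned base. Offsets never decrease,
// so a new line is touched exactly when the line index changes.
std::size_t lines_touched(std::size_t stride_bytes, std::size_t accesses,
                          std::size_t line_bytes = kLineBytes) {
    assert(line_bytes > 0);
    std::size_t count = 0;
    std::size_t prev_line = 0;
    for (std::size_t i = 0; i < accesses; ++i) {
        const std::size_t line = (i * stride_bytes) / line_bytes;
        if (i == 0 || line != prev_line) {
            ++count;
            prev_line = line;
        }
    }
    return count;
}
```

Under these assumptions, 64 accesses at a 16B stride touch 16 lines (four accesses per line), while the same 64 accesses at a 64B or 4KB stride touch 64 lines each, one access per line.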

The question

Can an interactive convert memorized cache-line facts into visceral intuition?
If yes, this becomes a durable long-tail traffic source: the canonical page on the topic.
This post is unusual for lowlat.ms: it’s a visualization-first artifact, not a benchmark post.

Introduction

Framing: the reader already knows what a cache is; we’re skipping the explainer.
What this post is not: not a MESI deep-dive, not a “cache for beginners” piece.
The interactive sits at the top of the page — reader plays before reading.
Credit where due: Ciechanowski-style explorable explanations, Lemire’s false-sharing posts, Agner Fog’s manuals.

Setup

Phase 1 ships a static SVG walkthrough with labeled diagrams — no JS required.
Phase 3 full version: vanilla JS + Canvas, optional Rust→WASM for live measurement.
Hosted at bench.lowlat.ms/cache-line-visualizer to establish the bench subdomain pattern.
prefers-reduced-motion respected; static fallback is always visible.

Baseline

Diagram 1: the 64-byte cache line as the unit of transfer, not the unit of read (a one-byte load still moves the whole 64-byte line).
Diagram 2: array-of-structs layout and which lines each field hits.
Diagram 3: struct-of-arrays layout and how the same workload touches fewer lines.
Diagram 4: two threads, two variables, one line — the minimum false-sharing example.
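
The arithmetic behind Diagrams 2 to 4 can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the visualizer's code: the `ParticleAoS` record, the `TwoCounters` struct, and the helper names are hypothetical, and every object is assumed to start on a 64-byte line boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineBytes = 64;  // x86_64 baseline from the post

// Hypothetical record: eight 4-byte fields, so two records share each line.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
    float mass;
    std::uint32_t id;
};
static_assert(sizeof(ParticleAoS) == 32, "the arithmetic below assumes 32-byte records");

// Distinct lines covered by the byte range [first_byte, last_byte],
// measured from a line-aligned base.
constexpr std::size_t lines_spanned(std::size_t first_byte, std::size_t last_byte) {
    assert(first_byte <= last_byte);
    return last_byte / kLineBytes - first_byte / kLineBytes + 1;
}

// AoS: reading only `x` (offset 0) from n records. The 32-byte stride is
// smaller than a line, so every line in the array holds an `x` and is touched.
constexpr std::size_t aos_lines_for_x(std::size_t n) {
    return n == 0 ? 0
                  : lines_spanned(0, (n - 1) * sizeof(ParticleAoS) + sizeof(float) - 1);
}

// SoA: the same n values of `x` stored contiguously in their own array.
constexpr std::size_t soa_lines_for_x(std::size_t n) {
    return n == 0 ? 0 : lines_spanned(0, n * sizeof(float) - 1);
}

static_assert(aos_lines_for_x(1024) == 512, "AoS walks the whole array");
static_assert(soa_lines_for_x(1024) == 64, "SoA touches 8x fewer lines here");

// Diagram 4 in miniature: two variables, each written by a different thread,
// that land on one line when the struct starts on a line boundary.
struct TwoCounters {
    std::uint64_t hits;    // written only by thread A in this scenario
    std::uint64_t misses;  // written only by thread B in this scenario
};

constexpr bool share_line(std::size_t offset_a, std::size_t offset_b) {
    return offset_a / kLineBytes == offset_b / kLineBytes;
}

static_assert(share_line(offsetof(TwoCounters, hits), offsetof(TwoCounters, misses)),
              "both counters sit in the same 64-byte line");
```

For this particular record, reading one field from 1024 elements touches 512 lines in the AoS layout and 64 in the SoA layout. The ratio depends on the record size, so a different struct will give different numbers.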

Optimizations

alignas(64) on hot structs so each instance starts on a fresh 64-byte line.
The padding trick: pad a hot counter to a full cache line to protect it from neighbors.
SoA + vectorization: how SIMD loads want contiguous same-type data.
Optional inline microbenchmark: measure stride-access throughput live in the reader’s browser (Phase 1 stretch goal).
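
A minimal sketch of the first two items, assuming 64-byte lines and an 8-byte `std::atomic<std::uint64_t>` (true on mainstream x86_64 toolchains, but an assumption here). The type names and `line_of` are illustrative only.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineBytes = 64;  // x86_64 baseline; Apple M-series would need 128

// Unpadded: adjacent counters are 8 bytes apart and can share a line.
struct PlainCounter {
    std::atomic<std::uint64_t> value{0};
};

// alignas(64) aligns the start of the object to a line boundary and also
// rounds sizeof up to 64, so the compiler adds the tail padding itself.
struct alignas(kLineBytes) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// The manual padding trick: the size becomes one full line, but alignment
// stays at 8. Adjacent ManuallyPadded values are always 64 bytes apart, yet
// an unrelated variable placed just before the object can still share a line.
struct ManuallyPadded {
    std::atomic<std::uint64_t> value{0};
    char pad[kLineBytes - sizeof(std::atomic<std::uint64_t>)];
};

static_assert(sizeof(PlainCounter) == 8, "assumes an 8-byte atomic counter");
static_assert(sizeof(PaddedCounter) == kLineBytes, "one counter per line");
static_assert(alignof(PaddedCounter) == kLineBytes, "starts on a line boundary");
static_assert(sizeof(ManuallyPadded) == kLineBytes, "padded to a full line");

// Index of the 64-byte line that an address falls in.
inline std::uintptr_t line_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kLineBytes;
}
```

With these types, two elements of a `PaddedCounter` array always report different `line_of` values, while two elements of a line-aligned `PlainCounter` array report the same one. That difference is what the padding trick buys.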

Results

Not a measurement post. The “results” are the diagrams themselves and the reader’s shift in intuition.
Phase 3 will add a live hotness counter that turns reader interaction into a visible cache-miss heatmap.
Success metric: does this become the page people link to when teaching cache effects?

Limitations

Static SVG (Phase 1) cannot animate MESI state transitions; that is deferred to Phase 3.
The visualization is x86_64-first; Apple M-series 128B L2 lines and POWER variations are footnotes.
Not a full MESI/MOESI simulator; it’s a teaching tool, not a microarchitectural reference.
Does not cover store forwarding, memory ordering, or cache-coherency protocols in depth.

Reproducibility

All diagrams are checked-in SVG source — reproducible at build time.
Phase 3 interactive will ship its source in lowlat-ms/bench-widgets/cache-line-visualizer.
Any inline microbenchmark has a justfile runner so the reader can cross-check on their own hardware.

References

Ulrich Drepper, What Every Programmer Should Know About Memory — https://people.freebsd.org/~lstewart/articles/cpumemory.pdf
Agner Fog, Optimization manuals — https://www.agner.org/optimize/
Intel 64 and IA-32 Architectures Optimization Reference Manual — https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
Daniel Lemire on false sharing — https://lemire.me/blog/2023/09/04/locality-and-cache-misses/
Bartosz Ciechanowski, Internal Combustion Engine (style reference for explorable explanations) — https://ciechanow.ski/internal-combustion-engine/
Mechanical Sympathy mailing list archives — https://groups.google.com/g/mechanical-sympathy