branch predictor
CPU hardware that speculates the outcome of upcoming branches so the pipeline can keep fetching instructions without waiting for the condition; mispredictions cost a full pipeline flush.
A branch predictor is a hardware unit that guesses the outcome of conditional branches (and the target of indirect branches) before they resolve, so the CPU’s front-end can keep fetching and decoding instructions down the predicted path. A correct prediction is invisible. On a misprediction, the CPU flushes all the speculatively issued work on the wrong path and restarts from the correct target, a penalty of ~15–20 cycles on modern x86, often more.
Modern predictors are sophisticated: TAGE-style predictors index several tagged tables with different branch-history lengths and, for each branch, use the prediction from the longest history that matches; indirect-branch predictors handle virtual calls and jump tables; and return-stack buffers handle function returns. Under favorable conditions, modern CPUs predict well above 95% of branches correctly.
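A real TAGE predictor is far too large to show here, so the toy model below swaps in a much simpler scheme: a single 2-bit saturating counter for one branch. It is only a sketch of why branch history matters. The names `TwoBitCounter`, `count_mispredictions`, `make_loop_pattern`, and `make_alternating_pattern` are invented for illustration and do not correspond to any hardware interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (not TAGE): one 2-bit saturating counter for a single branch.
// States 0-1 predict "not taken", states 2-3 predict "taken".
class TwoBitCounter {
public:
    bool predict() const { return state_ >= 2; }

    // Train on the actual outcome, saturating at 0 and 3.
    void update(bool taken) {
        if (taken) {
            if (state_ < 3) ++state_;
        } else {
            if (state_ > 0) --state_;
        }
        assert(state_ <= 3);  // invariant of a 2-bit counter
    }

private:
    std::uint8_t state_ = 1;  // start at "weakly not taken" (arbitrary choice)
};

// Replays a recorded sequence of branch outcomes and counts mispredictions.
std::size_t count_mispredictions(const std::vector<bool>& outcomes) {
    TwoBitCounter counter;
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        if (counter.predict() != taken) ++misses;
        counter.update(taken);
    }
    return misses;
}

// Loop back-edge: taken `iterations` times, then not taken once on exit.
std::vector<bool> make_loop_pattern(std::size_t iterations) {
    std::vector<bool> outcomes(iterations, true);
    outcomes.push_back(false);
    return outcomes;
}

// Strictly alternating taken / not taken, starting with taken.
std::vector<bool> make_alternating_pattern(std::size_t length) {
    std::vector<bool> outcomes;
    outcomes.reserve(length);
    for (std::size_t i = 0; i < length; ++i) outcomes.push_back(i % 2 == 0);
    return outcomes;
}
```

In this model, `make_loop_pattern(100)` costs 2 mispredictions (one while warming up, one at loop exit), while `make_alternating_pattern(100)` mispredicts all 100 times even though the pattern is perfectly regular. A predictor that looks at recent history would learn the alternation, which is the gap that history-based designs such as TAGE are meant to close.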
For low-latency code:
- Branchy hot paths (many data-dependent conditionals) put pressure on the predictor: their entries can be overwritten through aliasing with other branches or by whatever code runs across a context switch.
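- Branchless techniques (conditional moves, bitmask selects, SIMD blends) replace hard-to-predict branches with straight-line code. This is worth it when the branch is genuinely unpredictable (close to 50/50); a well-predicted branch is usually cheaper than the extra data dependency.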
- `__builtin_expect` / `[[likely]]` / `[[unlikely]]` hint the compiler about which path is hot, influencing basic block ordering so the hot path is fall-through.
- `perf stat` branch counters (`branch-misses`, `branches`) quantify the hit rate.
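The bitmask select mentioned in the list above can be sketched as follows. This is a minimal illustration, assuming the task is to sum the array elements at or above a threshold; the function names `sum_above_branchy` and `sum_above_branchless` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Branchy version: the comparison is a data-dependent conditional branch.
std::int64_t sum_above_branchy(const std::int32_t* data, std::size_t n,
                               std::int32_t threshold) {
    assert(data != nullptr || n == 0);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold) sum += data[i];
    }
    return sum;
}

// Bitmask select: the comparison result becomes a mask instead of a jump.
std::int64_t sum_above_branchless(const std::int32_t* data, std::size_t n,
                                  std::int32_t threshold) {
    assert(data != nullptr || n == 0);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::int64_t value = data[i];
        // true -> -1 (all bits set), false -> 0 (no bits set).
        const std::int64_t mask = -static_cast<std::int64_t>(data[i] >= threshold);
        sum += value & mask;  // adds value or 0, with no conditional jump in the source
    }
    return sum;
}
```

An optimizing compiler may already turn the branchy loop into conditional moves or SIMD code, or conversely reintroduce a branch, so the source form is only a suggestion. Inspecting the generated assembly and comparing both versions under `perf stat -e branches,branch-misses` on representative data is the safer way to decide.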
Misprediction storms are a known source of tail latency. Rare decisions on the hot path (error paths, degenerate inputs) cost disproportionately when they happen.
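One common way to treat such rare paths is to mark them as unlikely and move their code out of line, so the common path stays compact and fall-through. The sketch below assumes GCC or Clang (`__builtin_expect` and `__attribute__((noinline, cold))` are compiler extensions); the `Order` type, `reject_order`, `notional`, and the `UNLIKELY` macro are hypothetical names chosen for the example. In C++20 the same hint can be written as `if (cond) [[unlikely]]`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: GCC or Clang. Tells the compiler the condition is usually false.
#define UNLIKELY(x) (__builtin_expect(!!(x), 0))

// Hypothetical message type for illustration.
struct Order {
    std::int64_t quantity;
    std::int64_t price;
};

// Rare path kept out of line and marked cold so it does not
// take up space in the hot function's instruction stream.
__attribute__((noinline, cold))
std::int64_t reject_order(const Order& order) {
    assert(order.quantity <= 0 || order.price <= 0);  // only called for invalid orders
    return -1;  // sentinel for "rejected" in this sketch
}

// Hot path: the validity check is hinted as unlikely to fail, so the
// compiler can lay out the multiplication as the fall-through path.
inline std::int64_t notional(const Order& order) {
    if (UNLIKELY(order.quantity <= 0 || order.price <= 0)) {
        return reject_order(order);
    }
    return order.quantity * order.price;
}
```

The hint shapes code layout only. It does not train the hardware predictor, so an invalid order will most likely still pay the misprediction penalty when it finally arrives; the benefit is that the common case runs through tightly packed code.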