At what request rate, batch size, and connection count does io_uring beat epoll for a real network read workload on Linux 6.18 LTS? Does that crossover hold across kernel versions, NUMA topologies, and SQPOLL modes?
When you serve queries against an HNSW index that exceeds your buffer cache, where does the cache pressure actually fall? Which levels miss, how often, and which graph traversal patterns are pathological?