Cache Coherence

MESI, snooping, directories — how multicore CPUs keep caches consistent without losing their minds.

Tags: cache, coherence, mesi, multicore

The Problem

When two cores each cache the same memory line and one writes to it, what does the other core see? Without a coherence protocol the answer is “stale data” — and that means bugs.

MESI Protocol

Most x86 processors use a variant of MESI (Modified, Exclusive, Shared, Invalid); Intel extends it to MESIF and AMD to MOESI. The four base states are:

  • Modified — this cache has the only valid copy; it is dirty.
  • Exclusive — only copy, but clean.
  • Shared — multiple caches may hold this line; all clean.
  • Invalid — line is not usable.

State transitions are triggered by coherence messages. On small systems these are delivered by bus snooping; on larger core counts a directory-based scheme is used instead.
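The four states and their transitions can be sketched as a small state machine. This is a toy model of textbook MESI for a single line in a single cache, not any vendor's actual protocol; the names State, Event, next_state, and the others_have_copy flag are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Toy model of one cache's view of a single line under textbook MESI.
// All names here are illustrative, not taken from any vendor manual.
enum class State { Modified, Exclusive, Shared, Invalid };

enum class Event {
    LocalRead,    // this core loads from the line
    LocalWrite,   // this core stores to the line
    RemoteRead,   // another core's read of the line is observed
    RemoteWrite   // another core's write (read-for-ownership) is observed
};

// Returns the next state. `others_have_copy` only matters for a read miss
// from Invalid: it decides between Exclusive and Shared.
State next_state(State s, Event e, bool others_have_copy = false) {
    switch (e) {
    case Event::LocalRead:
        if (s == State::Invalid)
            return others_have_copy ? State::Shared : State::Exclusive;
        return s;  // hits in Modified, Exclusive, Shared keep their state
    case Event::LocalWrite:
        // From Shared or Invalid the other copies must be invalidated first;
        // from Exclusive the upgrade needs no bus traffic.
        return State::Modified;
    case Event::RemoteRead:
        // Modified writes back its dirty data; Modified and Exclusive
        // then both drop to Shared.
        if (s == State::Modified || s == State::Exclusive)
            return State::Shared;
        return s;
    case Event::RemoteWrite:
        return State::Invalid;  // any local copy is now stale
    }
    return s;  // not reached; keeps compilers quiet
}

// Walks one line through a read miss, a write, and two remote accesses.
void demo_write_then_remote_read() {
    State s = State::Invalid;
    s = next_state(s, Event::LocalRead);    // miss, nobody else has it
    assert(s == State::Exclusive);
    s = next_state(s, Event::LocalWrite);   // silent upgrade
    assert(s == State::Modified);
    s = next_state(s, Event::RemoteRead);   // write back, then share
    assert(s == State::Shared);
    s = next_state(s, Event::RemoteWrite);  // other core takes ownership
    assert(s == State::Invalid);
}
```

Real implementations add transient states and extra stable states (the F in MESIF, the O in MOESI), which this sketch deliberately leaves out.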

False Sharing

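When different cores update two variables that happen to sit on the same cache line, every write invalidates the other core's copy, so the line bounces between Modified and Invalid in each cache. The result is a massive performance hit even though the threads share no data logically.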

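// Each counter is aligned to its own 64-byte cache line,
// so writes from one core never invalidate the other's line.
struct counters {
    alignas(64) long core0_count;
    alignas(64) long core1_count;
};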

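Aligning each field to a cache-line boundary (alignas(64)) eliminates false sharing between the two counters, assuming 64-byte cache lines. The cost is memory: the struct grows from 16 bytes to 128. In C++17, std::hardware_destructive_interference_size is the portable way to ask for this value, where the standard library provides it.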
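One way to confirm the layout is to check which cache line each field lands on. The sketch below assumes 64-byte lines; same_cache_line, packed_counters, and padded_counters are hypothetical names introduced here for comparison, and only the padded struct mirrors the one above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineSize = 64;  // assumption: 64-byte cache lines

// Hypothetical helper: true if both addresses fall within the same line.
inline bool same_cache_line(const void* p, const void* q) {
    auto a = reinterpret_cast<std::uintptr_t>(p);
    auto b = reinterpret_cast<std::uintptr_t>(q);
    return a / kLineSize == b / kLineSize;
}

// Unpadded layout. The struct itself is line-aligned so that both fields
// are guaranteed to share one line, which is the false-sharing case.
struct alignas(kLineSize) packed_counters {
    long core0_count;
    long core1_count;
};

// Padded layout: each field starts on its own line.
struct padded_counters {
    alignas(kLineSize) long core0_count;
    alignas(kLineSize) long core1_count;
};

static_assert(sizeof(padded_counters) == 2 * kLineSize,
              "each counter should occupy a full line");
static_assert(alignof(padded_counters) == kLineSize,
              "the struct should start on a line boundary");

// Runtime check of both layouts on stack-allocated instances.
void check_layouts() {
    packed_counters packed{};
    padded_counters padded{};
    assert(same_cache_line(&packed.core0_count, &packed.core1_count));
    assert(!same_cache_line(&padded.core0_count, &padded.core1_count));
}
```

A layout check like this only shows that the fields are on separate lines; whether that removes a bottleneck still has to be measured on the target hardware.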

Directory-Based Coherence

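Snooping doesn't scale much beyond ~8–16 cores because every transaction is broadcast to every cache, so coherence traffic grows with the core count. Directory protocols instead track which caches hold each line in a central (or distributed) directory and send invalidations only to those sharers. AMD's Infinity Fabric and Intel's mesh interconnect both rely on directory-style structures, such as snoop filters, to avoid full broadcasts.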
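The difference in message count can be illustrated with a toy directory entry that keeps one sharer bit per core. This is a minimal sketch under stated assumptions: the 64-core count and the names DirectoryEntry, record_read, record_write, and snoop_messages are hypothetical, and real directories also track ownership state and handle races that are ignored here.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kCores = 64;  // hypothetical core count

// Toy directory entry for one cache line: one bit per core holding a copy.
struct DirectoryEntry {
    std::bitset<kCores> sharers;

    // A read by `core` adds it to the sharer set.
    void record_read(std::size_t core) {
        assert(core < kCores);
        sharers.set(core);
    }

    // A write by `writer` invalidates every other sharer and leaves the
    // writer as the sole holder. Returns the number of invalidations sent.
    std::size_t record_write(std::size_t writer) {
        assert(writer < kCores);
        std::bitset<kCores> targets = sharers;
        targets.reset(writer);  // the writer does not invalidate itself
        const std::size_t sent = targets.count();
        sharers.reset();
        sharers.set(writer);
        return sent;
    }
};

// Under broadcast snooping, the same write is seen by every other core.
constexpr std::size_t snoop_messages() { return kCores - 1; }
```

With two sharers on a 64-core part, a write in this model sends one targeted invalidation, where a broadcast would reach all 63 other cores.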

Takeaway: False sharing is a silent killer. On Linux, run perf c2c on multi-threaded hot paths to find cache lines that are contended between cores.