LRU / LFU Cache · System Internals

The big picture#

TL;DRthe 30-second version

A full fixed-size cache must evict an existing entry to admit a new one. The eviction policy picks the victim — a bet about which cached entry is least likely to be needed again soon.
LRU (Least Recently Used) bets on recency: evict whatever has gone untouched longest. It is a hash map (key → node) plus a doubly-linked list ordered by recency, giving O(1) lookup, reorder, and eviction.
LFU (Least Frequently Used) bets on frequency: evict whatever has been used fewest times. The O(1) form keeps frequency buckets (each an LRU list) plus a minFreq pointer, so the coldest victim is found without scanning.
Neither is universally best. LRU is wrecked by a one-time scan that floods it with cold data; classic LFU clings to yesterday's hot keys forever unless you add frequency aging.
Real systems mostly run cheaper approximations and smarter hybrids: CLOCK (approximate LRU), segmented LRU (Memcached), ARC (ZFS), and W-TinyLFU (Caffeine, the highest-hit-rate general policy in wide use). Redis exposes allkeys-lru / allkeys-lfu / random / volatile-* as a config knob.

Everything below expands on these points. Read the core sections top to bottom for the full mental model; the collapsible "Go deeper" box holds the advanced eviction policies (ARC, LIRS, W-TinyLFU) you can skip on a first pass and return to later.

Capacity = 4, the cache is FULL, and a new key X arrives — someone must be evicted. Recent accesses, oldest → newest: C A B D A. The hash map and value storage are identical for both policies; the only thing that differs is the question each one asks.

Policy	Question it asks	Victim on this stream
LRU	touched longest ago?	C — it has gone untouched the longest
LFU	used fewest times?	counts A:2, B:1, C:1, D:1 → evict the coldest (B/C/D tie, broken by recency)

Start here: the victim-selection problem#

A cache trades memory for speed: keep hot data close by so you don't re-fetch it from a slower source (disk, a database, a network call). Memory is finite, so once the cache is full, every insert needs a victim. The eviction policy is the rule that picks the victim — and a bad rule means cache misses on data you just evicted, which is strictly worse than a smarter rule using the same memory.

There is a known optimal answer, and it is useless in practice. Bélády's algorithm (MIN, or OPT) evicts the entry whose next use is furthest in the future, and provably yields the fewest misses of any policy. The catch: it requires knowing the future. Every real policy is a heuristic that predicts the future from the past — and the two simplest predictions are 'what I used recently I'll use again' (recency → LRU) and 'what I use often I'll keep using' (frequency → LFU).

The reason these heuristics work at all is locality of reference: real access streams are not uniform random. They cluster in time (a key touched now is likely touched again soon — temporal locality) and they are skewed (a small fraction of keys take a large fraction of the accesses — a Zipfian or power-law distribution). A cache only beats random eviction because the workload has structure; the policy is a guess about which kind of structure dominates.

Both policies must be O(1) per requestA cache sits on the hot path — every GET and PUT goes through it, often millions per second. An eviction policy that is O(log n) or worse partly defeats the purpose: the bookkeeping eats the latency the cache was meant to save. That constraint, more than anything, is why LRU and LFU are built from specific pointer-based structures rather than 'just sort by recency or frequency on each access,' which would be O(n log n) per request.

LRU: evict whoever you haven't touched in the longest time#

LRU (Least Recently Used) bets that data accessed recently is likely to be accessed again soon — true for most real workloads, because of temporal locality. The implementation is a doubly-linked list plus a hash map from key to its node: the map gives O(1) lookup, the linked list gives O(1) reordering. The list is kept in recency order — most-recently-used at the head, least-recently-used at the tail — and the tail is always the next victim.

DHEAD · MRU

⇄

BTAIL · LRU = next victim

Doubly-linked list in recency order (a hash map points into it)

GET key: look it up in the map. Miss → done. Hit → unlink its node and relink it at the head (most-recently-used end), then return the value.
PUT key=value, key exists: update the node's value, then move it to the head exactly like a GET hit.
PUT key=value, key is new and the list is below capacity: create a node, link it at the head, add it to the map.
PUT key=value, key is new and the list is at capacity: unlink the tail node (the least-recently-used entry), remove it from the map, then insert the new node at the head and the map.

Why a linked list and not an arrayMoving an accessed entry to the 'most recent' end needs to be O(1). In an array that means an O(n) shift of every element after it; in a doubly-linked list it's an unlink (repoint two neighbors' pointers) plus a relink at the head — both O(1) regardless of list size. The hash map exists purely so 'find the node for this key' is also O(1) instead of an O(n) list scan. Neither structure alone is enough; LRU is the canonical example of combining two data structures to get the best operation from each.

PredictCapacity is 3 and the cache holds [A, B, C] with C the most-recently-used. You GET A, then PUT D. Under LRU, which key gets evicted?

Hint: LRU evicts whichever key has gone untouched the longest.

B. The GET A just made A recently-used, so the least-recently-used entry is now B — it's the victim that gets pushed out to make room for D.

LFU: evict whoever's been touched the fewest times#

LFU (Least Frequently Used) bets on a different signal: total access count, not recency. A key accessed 100 times an hour ago is, by this policy, more valuable than one accessed once a second ago — the opposite call LRU would make. The naive way to find the least-frequent key is to scan all entries for the minimum count, which is O(n). The O(1) construction (Pai et al.'s design) avoids the scan with three structures: a key→value map, a key→frequency map, and a map from frequency→an ordered list of keys at that frequency, plus a single minFreq pointer that always names the lowest occupied bucket.

front = newest at that frequency, back = oldest · minFreq names the lowest occupied bucket

freq 1ED← minFreq · evict from the back: D

freq 2CA

freq 4B

Frequency buckets, each itself an LRU list, with a minFreq pointer

GET key (hit) or PUT on an existing key: look up its current frequency f, remove it from bucket[f], push it to the front of bucket[f+1], and record its new frequency as f+1.
If bucket[f] is now empty and f was minFreq, bump minFreq to f+1 — the old floor frequency no longer has any keys.
PUT a new key when below capacity: insert it at the front of bucket[1], and reset minFreq to 1 — a fresh key is always the new floor.
PUT a new key at capacity: evict the back (least-recently-used, as a tie-break) entry of bucket[minFreq], remove it from all maps, then insert the new key as above.

The tie-break inside a frequency bucket is still LRUFrequency alone doesn't fully order keys — multiple keys can share the same count. Each frequency bucket is itself ordered by recency (front = most recently touched at that frequency), so ties break toward evicting the one that's gone longest without a touch even within the same frequency. LFU is really 'sort by frequency, then by recency' — not frequency alone. Note also that minFreq only ever needs to increase by one (on access) or reset to one (on insert), which is exactly why it can be maintained in O(1) without ever searching for the new minimum.

Cost: time, and the memory tax of pointers#

Both policies hit the design goal: every operation is O(1) on average. The 'on average' is the hash map's amortized cost (constant expected, with rare rehashes); the list and bucket manipulations are worst-case O(1) because they only ever touch a fixed number of pointers. That is the whole point — the per-request cost does not grow as the cache fills, so the cache stays fast at ten entries or ten million.

	LRU	LFU (O(1) form)
GET (hit)	O(1): map lookup + move node to head	O(1): lookup + move node to next freq bucket
PUT (new)	O(1): insert at head, maybe drop tail	O(1): insert in bucket[1], maybe evict from bucket[minFreq]
Eviction	O(1): the tail is always the victim	O(1): back of bucket[minFreq] is always the victim
Per-entry metadata	2 pointers (prev/next) + map entry	2 pointers + frequency counter + bucket membership + map entry

The cost that's easy to forget is memory. A doubly-linked list node carries two pointers — 16 bytes on a 64-bit machine — on top of the key, the value, and the hash-map entry that points at the node. For small values (think a cache of 8-byte integers keyed by 8-byte integers) the bookkeeping can dwarf the data it's tracking. LFU is heavier still: every entry also needs a frequency counter and membership in a per-frequency list, and the frequency→bucket map is extra structure on top. This memory tax is precisely why production systems often choose approximate policies (next section): a few sampled keys or a single reference bit per entry buys most of LRU's hit rate at a fraction of the per-entry overhead.

Cache-friendliness, literallyPointer-chasing through a linked list is hostile to the CPU's own cache: each node can live anywhere in memory, so a hot-path reorder may stall on a cache miss. High-performance caches (and the OS page cache) often prefer array-backed CLOCK or sampling precisely because sequential or random array access plays better with hardware prefetching than chasing 'next' pointers across the heap.

Beyond the textbook two: cheaper and smarter policies#

LRU and LFU are the policies you implement in an interview, but they are rarely what runs in production unmodified. Real systems either approximate LRU to cut its overhead, or hybridize recency and frequency to beat both. The lightweight options:

FIFO (first-in-first-out) — evict the oldest inserted entry, ignoring use entirely. A plain queue, no reordering on access, almost zero metadata. Cheap but blind: it throws out hot data just because it has been resident a while, and it can even exhibit Bélády's anomaly (more cache → more misses).
Random — evict a uniformly random entry. No ordering metadata at all. Surprisingly competitive on skewed workloads and trivially concurrent (no list to lock), which is why it shows up as allkeys-random in Redis and inside some CPU caches.
CLOCK / Second-Chance — an approximate LRU. Entries sit in a circular buffer, each with a one-bit reference flag set on access. A 'hand' sweeps the circle; if it lands on an entry with the bit set, it clears the bit and moves on (a second chance), otherwise it evicts. One bit per entry instead of two pointers, no per-access list surgery — the classic OS page-replacement policy.
Segmented LRU (SLRU) — split the cache into a probationary segment and a protected segment. New entries enter probation; a second hit promotes them to protected. A one-time scan only ever pollutes probation, so the protected working set survives. Memcached's modern LRU is a segmented HOT/WARM/COLD variant of this idea.

Go deeperThe scan-resistant heavyweights: 2Q, ARC, LIRS, W-TinyLFU

2Q (Johnson & Shasha, VLDB 1994) keeps a small FIFO queue A1 for first-time accesses and an LRU queue Am for entries seen more than once. A bulk scan churns only A1; entries that prove their worth with a second access graduate to Am and are protected. It is essentially the simplest scan-resistant design and a direct ancestor of segmented LRU.

ARC (Adaptive Replacement Cache; Megiddo & Modha, USENIX FAST 2003) splits the cache into two real LRU lists — T1 for entries seen once (recency) and T2 for entries seen at least twice (frequency) — and keeps two 'ghost' lists, B1 and B2, holding only the keys of recently evicted entries from T1 and T2. A hit in a ghost list is feedback: a miss that would have been a hit if that half of the cache were bigger. ARC nudges a target size p toward whichever half is getting the ghost hits, so it self-tunes between recency and frequency with no manual knob and stays scan-resistant. It ships in ZFS/OpenZFS; IBM's patent on it is the reason PostgreSQL switched away from a planned ARC to its own clock-sweep.

LIRS (Jiang & Zhang, SIGMETRICS 2002) ranks entries by reuse distance (the number of distinct other entries touched between two accesses to the same entry) rather than plain recency. Entries with short reuse distance are 'hot' and protected; a scan has effectively infinite reuse distance and is evicted quickly. It is strongly scan- and loop-resistant and is used (in approximate form) by systems like MySQL's InnoDB midpoint-insertion buffer pool and ClickHouse.

W-TinyLFU (Einziger, Friedman & Manes, 2017) is the current state of the art for general-purpose caches. It estimates each key's frequency with a Count-Min-Sketch-style counter array — a handful of bytes for the whole cache rather than a counter per entry — and ages all counters by halving them periodically, so old popularity fades (the fix classic LFU lacks). On admission it compares the incoming key's estimated frequency against the eviction candidate's and only admits the newcomer if it is genuinely more frequent. A small LRU 'window' in front absorbs bursts of brand-new keys. This is the policy inside Caffeine, the default cache in Spring and many JVM systems, and it consistently beats LRU, LFU, and ARC on real traces.

The recency bet vs the frequency bet#

LRU and LFU are two answers to the same question — 'which past signal best predicts the future?' — and each is right exactly when its assumption holds. LRU assumes recency dominates: the thing you touched most recently is the thing you'll touch next. That holds for working-set workloads with strong temporal locality (a user's current session, the pages of the file you're scrolling). LFU assumes popularity dominates: a small set of items is durably hot and the rest is noise. That holds for skewed, long-lived distributions (the top trending videos, a handful of celebrity profiles, hot product pages).

The sharpest practical axis between them is scan resistance. A scan — reading a large set of items exactly once — is the adversary that separates the policies. LRU has zero scan resistance: every scanned item looks maximally recent, so the scan marches your real working set out the tail. LFU resists scans naturally, because scanned items only ever reach frequency 1 and are evicted before they can displace genuinely hot keys. This single property is why almost every 'serious' policy (2Q, ARC, LIRS, segmented LRU, W-TinyLFU) blends in a frequency or second-chance signal: pure recency is too easily fooled.

Frequency's blind spot: no sense of timePure LFU has the opposite weakness — it has no notion of 'recently.' A key that earned a huge count last week is, to textbook LFU, permanently more valuable than a key climbing fast today, because raw counts never decrease. Without aging (counter decay, the periodic halving in W-TinyLFU, or Redis's decay period) LFU ossifies around stale winners. The modern consensus — recency and frequency are both necessary, neither sufficient — is exactly what adaptive policies encode.

How each one fails in production#

Knowing the textbook policy is half the job; knowing how it breaks is what an interviewer (and a 3 a.m. page) actually cares about.

LRU cache pollution from a bulk scan — a backup job, an analytics query, or a crawler reads every row once. Each read looks 'most recent,' so LRU promotes all of it and evicts the genuinely hot working set. The moment the scan ends, your hit rate craters because the data people actually want has been flushed. This is the single most common LRU outage and the reason scan resistance matters.
LFU stale-hot keys (no aging) — a product that went viral last month keeps a sky-high count and refuses to leave, squatting on capacity long after traffic moved on. New genuinely-hot keys can't accumulate enough count to displace it. The fix is frequency decay; the bug is shipping textbook LFU without it.
LFU new-key starvation — a brand-new key enters at frequency 1, the lowest possible, so it's first in line for eviction. In a busy cache it can be evicted before it ever gets a second hit, meaning a rising-but-not-yet-popular item never gets a fair chance. (W-TinyLFU's admission window exists specifically to give newcomers a probation period.)
Cold start — right after a restart or deploy the cache is empty, so every request misses and stampedes the backing store. Neither policy helps here; this is about cache warming and request coalescing, not eviction, but it's the failure people most often blame on 'the cache.'

Concurrency is its own failure modeThe clean O(1) LRU has an ugly secret under load: every GET mutates the shared linked list, so a naively-locked LRU serializes all reads behind one lock — the policy becomes the bottleneck. Real systems sidestep this by batching access records in per-thread buffers and replaying them lazily (Caffeine), by using lock-free CLOCK, or by sharding the cache. If an interviewer asks 'how does this scale to many cores,' the honest answer is 'textbook LRU doesn't — you approximate.'

Policies side by side#

Policy	Recency	Frequency	Scan-resistant	Per-entry overhead
FIFO	No (insertion order only)	No	No	Minimal (queue link)
Random	No	No	Partially (by luck)	None
LRU	Yes (primary)	No	No	2 pointers + map
LFU	Tie-break only	Yes (primary)	Yes	Pointers + counter + buckets
CLOCK	Approximate	No	No	1 reference bit
ARC	Yes (adaptive)	Yes (adaptive)	Yes	2 lists + 2 ghost lists
W-TinyLFU	Window only	Yes (aged sketch)	Yes	Sketch (shared, tiny)

Read the table as a progression: the top rows are cheap and dumb, the bottom rows are heavier and smarter. ARC and W-TinyLFU win because they carry both signals at once and adapt the balance; the price is more moving parts. For an interview, being able to place LRU and LFU on this spectrum — and name what the smarter policies add (adaptivity, aging, scan resistance) — is worth more than a flawless re-derivation of either.

Where these policies actually run#

Eviction policy is one of those primitives you've been relying on for years without naming. A tour of where the choices land:

Redis — exposes the policy as the maxmemory-policy config: noeviction, allkeys-lru, allkeys-lfu, allkeys-random, and the volatile-* variants (lru / lfu / random / ttl) that only evict keys with a TTL set. Crucially, Redis's LRU and LFU are both approximate — it samples a handful of keys (default 5) and evicts the best of the sample rather than maintaining a true global order, trading a little accuracy for a lot of memory. Its LFU uses an 8-bit Morris counter (probabilistic increment) plus a configurable decay period so counts age.
Memcached — historically a straightforward LRU; since 1.5 it runs a segmented LRU with HOT, WARM, and COLD sub-LRUs and a background crawler, specifically to get scan resistance and reclaim expired items without walking the whole cache.
Caffeine (Java) — the reference W-TinyLFU implementation, and the default caching library across much of the JVM ecosystem (Spring, Cassandra's key cache, and more). It is the practical proof that an adaptive frequency sketch beats LRU/LFU/ARC on real traces while staying highly concurrent.
The OS page cache — Linux keeps two LRU lists (active and inactive), an approximation of segmented LRU / 2Q, to decide which file-backed pages to reclaim under memory pressure. Older and simpler kernels used CLOCK. The whole 'why is my RAM full of cache' phenomenon is this policy at work.
CPU caches — set-associative L1/L2/L3 caches can't afford true LRU in hardware (too many bits, too much logic per line), so they use pseudo-LRU (a tiny tree of bits approximating LRU) or even random replacement within a set. The same recency intuition, squeezed into a few gates per cache line.

The pattern, once you see itBuffer pools (PostgreSQL's clock-sweep, MySQL InnoDB's midpoint-insertion LRU), CDN edge caches, browser caches, DNS resolver caches, JIT code caches — all of them are a fixed-size map plus an eviction policy. Recognize 'fixed capacity + a victim rule' and you've recognized the same structure in a dozen systems that never use the word 'cache' in their docs.

Common questions & gotchas#

What does O(1) mean here, and why insist on it?

O(1) ('constant time') means the work per operation is the same whether the cache holds ten items or ten million. Every request goes through the cache, so anything slower — like re-sorting entries, which is O(n log n) — would make the cache itself the bottleneck it was meant to remove. The whole art of the LRU and LFU data structures is hitting O(1) for lookup, reorder, and eviction simultaneously.

Why a doubly-linked list instead of a normal array?

Moving a just-touched item to the 'most recent' end has to be cheap. In an array that's an O(n) shift of everything after it; in a doubly-linked list you only repoint a couple of neighbor pointers — O(1), no matter how big the list is. It must be doubly linked (not singly) so that, given a node, you can splice it out in O(1) without walking the list to find its predecessor.

Isn't LFU always better than LRU, since it counts real usage?

No. Textbook LFU clings to keys that were popular long ago even after they go cold (raw counts never decrease without aging), and it starves brand-new keys that enter at frequency 1. LRU adapts instantly to recent behavior but a one-time bulk scan can wipe its whole working set. Neither wins everywhere — which is why production systems use adaptive hybrids like ARC and W-TinyLFU.

How does LFU break a tie between two keys with the same count?

By recency. Each frequency bucket is itself an LRU list, so among keys with the same frequency the one untouched longest is evicted first. LFU is really 'order by frequency, then by recency' — pure frequency doesn't fully order the keys.

QuizA nightly backup job reads every row in the table exactly once. Your service uses an LRU cache in front of that table. What happens to the cache's hit rate during and right after the backup, and which policy idea fixes it?

Hit rate is unaffected; LRU handles scans well
Hit rate collapses — the scan evicts the hot working set; scan-resistant policies (LFU/2Q/ARC) fix it
Hit rate improves because the cache is now full of fresh data
Only LFU caches are affected by scans, not LRU

Show answer

Hit rate collapses — the scan evicts the hot working set; scan-resistant policies (LFU/2Q/ARC) fix it — Every row the backup reads looks 'most recently used,' so LRU promotes the entire scanned table and evicts the genuinely hot keys out the tail — hit rate craters during the scan and stays low until the working set is re-warmed. This is classic LRU cache pollution. Frequency- or admission-aware policies (LFU, 2Q, segmented LRU, ARC, W-TinyLFU) resist it because a once-read row never accumulates enough frequency to displace truly hot data.

In an interview#

Lead with the O(1) constraint — it's what forces the specific data structures. For LRU: hash map + doubly-linked list, full stop. For LFU: hash map for value, hash map for frequency, and a frequency→keys structure (each bucket an LRU list) with a minFreq pointer so you never have to scan for the minimum. If you can explain why minFreq only ever increments or resets to one, you've shown you understand why the whole thing is O(1).

Be ready to justify the choice with the workload, not in the abstract: LRU for general-purpose caching where recent access predicts future access (most HTTP/DB caches, session data); LFU when some keys are durably hotter than others regardless of when they were last touched (a small set of celebrity profiles, trending content). Then name the failure modes unprompted — LRU's scan pollution, LFU's stale-hot keys without aging — because naming the weakness is what separates 'memorized the data structure' from 'understands caching.'

If the conversation goes deeper, drop one real-world anchor: Redis approximates both with sampling; Memcached uses segmented LRU; ZFS uses ARC; Caffeine uses W-TinyLFU — adaptive policies that carry recency and frequency at once. You don't need to implement them, just to know they exist and what they fix. Then open the simulator: PUT past capacity and watch the eviction; for LFU, GET one key repeatedly so it's no longer the coldest, then PUT a new key and confirm a different key — not the one you favored — gets evicted.

References & further reading#

References

Megiddo & Modha — ARC: A Self-Tuning, Low Overhead Replacement Cache (USENIX FAST 2003) — the adaptive recency/frequency cache used in ZFS
Einziger, Friedman & Manes — TinyLFU: A Highly Efficient Cache Admission Policy — the frequency-sketch admission filter behind W-TinyLFU / Caffeine
Jiang & Zhang — LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy (SIGMETRICS 2002) — reuse-distance ranking; strongly scan- and loop-resistant
Johnson & Shasha — 2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm (VLDB 1994) — the simplest scan-resistant design; ancestor of segmented LRU
Redis — Key eviction (maxmemory-policy, approximate LRU/LFU, Morris counter) — how a real system samples instead of maintaining true order
Caffeine Wiki — Efficiency (W-TinyLFU vs LRU/LFU/ARC on real traces) — hit-rate comparisons that motivate adaptive policies

Ready to try it?

The simulator is a real, deterministic implementation — pick a scenario and step through it, scrubbing the timeline forward and backward through every change.

Try these in the simulator

LRU eviction →Fill the cache, then watch the least-recently-used entry get evicted on the next insert.LFU on the same trace →The exact same operations, but LFU evicts the least-frequently-used key — a different victim than LRU.Hit, miss, update →The three things a cache can do to a key: miss it, store it, then hit or overwrite it.

Open the LRU / LFU Cache simulator →

Up nextCaching Strategies

← Back to the learning path