System Internals

No timeline here — just the three numbers you estimate with on every design. Move the controls and watch what they imply.

Latency you can feel

Computers operate on timescales we can't picture. Scale every operation so that 1 nanosecond = 1 second of human time, and the gulf between cache, RAM, disk, and network becomes obvious.

Operation	Real latency	If 1 ns = 1 second
1 CPU cycle	0.30 ns	3.0e+2 ms
L1 cache reference	1.0 ns	1 s
L2 cache reference	4.0 ns	4 s
Mutex lock / unlock	17 ns	17 s
Main memory (RAM) reference	1.0e+2 ns	1.67 min
Read 1 MB sequentially from RAM	3.0 µs	50 min
SSD random read (4 KB)	16 µs	4.44 hours
Round trip within a datacenter	5.0e+2 µs	5.79 days
Read 1 MB sequentially from SSD	1.0 ms	11.6 days
Disk (HDD) seek	10 ms	116 days
Round trip California → Netherlands	1.5e+2 ms	4.76 years

Compare an operation

One “ California → Netherlands” costs about 1,500,000× a single RAM reference. 150 ms — the speed of light is the hard limit. This is why fast systems do everything they can to stay in cache and RAM and to avoid disk seeks and cross-continent round-trips.

Little's Law — L = λ · W

The one formula that ties concurrency, throughput, and latency together — with no assumptions about the workload. The number of requests in flight (L) equals the arrival rate (λ) times how long each stays in the system (W).

Solve for

λ — arrival rate (req/s)W — time in system (ms)

100 requests in flight at once

Read it as: a service taking 50 ms per request under 2,000 req/s must handle that many requests at once — so it needs that many threads, connections, or async slots. Cut and the you must provision falls in proportion. This is how you size thread pools and connection limits before you ever load-test.

Nines → downtime

Availability targets are quoted in “nines”. Each extra nine sounds tiny but cuts the allowed downtime by ~10× — and costs far more than 10× to achieve.

Availability target: 99.9% (3 nines)

1.44 minper day

43.2 minper month

8.76 hoursper year

99.9% allows about 8.76 hours of downtime a year. Going from three to four means your total yearly outage budget — deploys, crashes, incidents combined — shrinks to under an hour, which usually forces redundancy, automated failover, and an on-call team. That's why each nine is a real cost decision, captured in an you promise and a stricter you operate to.