The big picture
TL;DR: the 30-second version
- You type a name (example.com); computers route to numbers (an IP address like 93.184.216.34). DNS is the lookup that turns the name into the number before any request is sent.
- A recursive resolver does the work for you. It walks a hierarchy top-down — root → TLD (.com) → authoritative nameserver — where each level only knows the next, until the authoritative server returns the IP (the A record).
- Caching with a TTL makes it fast: the first lookup walks the whole tree; the next one for the same name is a single hop straight from cache.
- The cost of caching is propagation delay — after you change a record, old answers linger until their TTL expires. Bad names return NXDOMAIN; spoofed answers are the security worry that DNSSEC addresses.
Everything below expands on these points. Read the core sections top to bottom for the full picture; the collapsible "Go deeper" boxes hold the advanced bits (query modes, record types, encrypted DNS, anycast) you can skip on a first pass and return to later.
Start here: names are for humans, numbers are for routers
You remember example.com, wikipedia.org, your bank's website. Humans are good at names and terrible at numbers. The network is the opposite: routers and packets only understand numeric IP addresses, like 93.184.216.34 (IPv4) or 2606:2800:220:1:248:1893:25c8:1946 (IPv6). There is no way to send a packet to the word "example.com" — the address has to be a number.
So before the very first byte of a request can travel anywhere, something must translate the name you typed into an address the network can route to. That translation is DNS — the Domain Name System. The classic metaphor is a phone book: you know a person's name, you look it up, and you get the number you actually dial. DNS is that phone book for the entire internet, and it has to work for billions of names that change constantly.
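To make "the address has to be a number" concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `address_as_number` is an illustration only; it shows that both example addresses above are integers underneath, while a name like example.com is not an address at all.

```python
import ipaddress
from typing import Tuple


def address_as_number(text: str) -> Tuple[int, int]:
    """Return (IP version, integer value) for a textual IP address.

    Raises ValueError if the text is not an IP address (for example, a name).
    """
    addr = ipaddress.ip_address(text)
    return addr.version, int(addr)


if __name__ == "__main__":
    for text in ("93.184.216.34", "2606:2800:220:1:248:1893:25c8:1946"):
        version, value = address_as_number(text)
        print(f"IPv{version} {text} -> {value}")
    try:
        address_as_number("example.com")
    except ValueError:
        print("example.com is a name, not an address: it needs a DNS lookup")
```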
The resolver walks a hierarchy
Your machine doesn't know the IP, so it asks a recursive resolver (usually run by your ISP or a service like Cloudflare's 1.1.1.1 or Google's 8.8.8.8) to do the work. The resolver walks a hierarchy from the top down: it asks a root nameserver who runs '.com', the root refers it to the .com TLD nameserver, the TLD refers it to the domain's authoritative nameserver, and only that last server — the one that actually owns the records — returns the IP address.
- Root nameserver: knows who runs each top-level domain (.com, .org, …) — refers, doesn't answer.
- TLD nameserver: knows who is authoritative for each domain under it — refers, doesn't answer.
- Authoritative nameserver: owns the actual records for the domain — answers with the IP (the A record).
- Recursive resolver: does this whole walk on your behalf and hands you the final answer.
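The walk described above can be sketched with a toy, in-memory hierarchy. Everything here is hypothetical (the server labels `root`, `tld-com`, `ns-example` and the dictionary layout are made up for illustration); a real resolver speaks the DNS wire protocol over the network. The point is only that each level either refers to the next server or, at the bottom, answers.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory "nameservers". Each maps a name suffix to either a
# referral (the label of the next server to ask) or a final answer (an IP).
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"example.com": ("referral", "ns-example")},
    "ns-example": {"example.com": ("answer", "93.184.216.34")},
}


def ask(server: str, name: str) -> Tuple[str, str]:
    """One query: return a referral for the longest matching suffix,
    an answer only on an exact name match, otherwise 'nxdomain'."""
    zone = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        entry = zone.get(suffix)
        if entry is not None and (entry[0] == "referral" or suffix == name):
            return entry
    return ("nxdomain", "")


def resolve(name: str) -> Tuple[str, List[str]]:
    """Walk root -> TLD -> authoritative, recording which servers were asked."""
    server = "root"
    asked: List[str] = []
    while True:
        asked.append(server)
        kind, value = ask(server, name)
        if kind == "referral":
            server = value          # follow the referral one level down
        elif kind == "answer":
            return value, asked     # only the authoritative server answers
        else:
            raise LookupError(f"NXDOMAIN: {name}")


if __name__ == "__main__":
    ip, asked = resolve("example.com")
    print(ip, "after asking", asked)
```

In this sketch a cold lookup of example.com asks three servers in order, which matches the root, TLD, authoritative sequence in the list above.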
Predict: On a completely cold cache, how many query round-trips does the resolver make before it can answer you?
Hint: Count one ask per level of the hierarchy.
Three: root (who runs .com?), TLD (who is authoritative for example.com?), and the authoritative server (what's the A record?). Your own request to the resolver is a fourth hop on top. After that, the answer is cached, so the next lookup of the same name is a single hop — the resolver replies straight from cache.
Caching and TTL make it fast
Walking the hierarchy every time would be slow — a cold lookup is three serial round-trips to servers that may be far away. So the resolver caches each answer for a duration set by the record's TTL (time-to-live), a number of seconds chosen by the domain owner. The next lookup of the same name — from you or anyone else using that resolver — is a single hop straight back from cache. This is the same cache-vs-store trade-off you'll see everywhere later, applied to name lookups.
- Cold lookup (nothing cached): about 3 round-trips down the hierarchy (two referrals, then the answer), typically tens to hundreds of milliseconds.
- Warm lookup (cached and unexpired): 1 hop to the resolver, which answers from memory, so the cost is roughly one network round-trip to the resolver (often under a millisecond when it sits on your local network).
- Caching happens at many layers: the browser, the operating system, the resolver, and sometimes the home router each keep their own short-lived copy.
- TTL bounds staleness: a record's TTL is the longest time a cached copy may be served before it must be looked up again. Low TTL = fresher but more lookups; high TTL = fewer lookups but slower to change.
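A minimal sketch of the TTL rule in the last bullet, assuming a single in-process cache. The class name `TtlCache` and its injectable `clock` parameter are illustrative choices, not part of any real resolver; the clock is injectable only so expiry can be demonstrated without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: an entry may be served only until its TTL expires."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # name -> (ip, absolute expiry time on the injected clock)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, ip: str, ttl_seconds: float) -> None:
        self._entries[name] = (ip, self._clock() + ttl_seconds)

    def get(self, name: str) -> Optional[str]:
        """Return the cached IP, or None on a miss or an expired entry."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: the caller must look it up again
            return None
        return ip


if __name__ == "__main__":
    now = [0.0]
    cache = TtlCache(clock=lambda: now[0])
    cache.put("example.com", "93.184.216.34", ttl_seconds=300)
    print(cache.get("example.com"))  # warm: served from cache
    now[0] = 300.0
    print(cache.get("example.com"))  # TTL elapsed: treated as a miss
```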
Query modes, record types, and modern DNS
The core idea (name in, address out) has a few important variations. The two that matter most are the kind of query you make and the kind of record you ask for. A DNS answer isn't always an IP: the same system stores several kinds of records, and you ask for the type you need.
- A — maps a name to an IPv4 address. AAAA — maps a name to an IPv6 address.
- CNAME — an alias: 'www.example.com is really example.com, go look that up instead'.
- MX — mail exchange: which server receives email for this domain.
- NS: which nameservers are authoritative for this domain (the records that delegation is built from).
- TXT — arbitrary text, used for domain verification and email anti-spoofing (SPF, DKIM).
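The record types above can be pictured as entries keyed by (name, type). The following is a toy sketch with made-up zone data (the MX, NS, and TXT values are placeholders, not real records for example.com); it also shows the usual CNAME behaviour of restarting the lookup at the alias target.

```python
from typing import Dict, List, Tuple

# Hypothetical zone data keyed by (name, record type). Values are placeholders.
ZONE: Dict[Tuple[str, str], List[str]] = {
    ("example.com", "A"): ["93.184.216.34"],
    ("example.com", "AAAA"): ["2606:2800:220:1:248:1893:25c8:1946"],
    ("www.example.com", "CNAME"): ["example.com"],
    ("example.com", "MX"): ["10 mail.example.com"],
    ("example.com", "NS"): ["ns1.example.com", "ns2.example.com"],
    ("example.com", "TXT"): ["v=spf1 -all"],
}


def lookup(name: str, rtype: str, max_aliases: int = 8) -> List[str]:
    """Return records of the requested type, following CNAME aliases.

    An empty list means the name has no record of that type in this toy zone.
    """
    for _ in range(max_aliases + 1):
        records = ZONE.get((name, rtype))
        if records is not None:
            return records
        alias = ZONE.get((name, "CNAME"))
        if alias is None:
            return []
        name = alias[0]  # "go look that up instead"
    raise RuntimeError("CNAME chain too long")


if __name__ == "__main__":
    print(lookup("www.example.com", "A"))   # resolved via the alias
    print(lookup("example.com", "MX"))
```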
Go deeper: recursive vs iterative, DoH/DoT, anycast, GeoDNS
Recursive vs iterative queries: when you ask a recursive resolver, you make a recursive query — 'give me the final answer, do whatever it takes'. The resolver then makes iterative queries to each nameserver — 'tell me the answer or tell me who to ask next'. Root and TLD servers only answer iteratively (they refer); they never chase the whole chain for you. Splitting the work this way is what keeps the upper levels cheap: they hand out referrals instead of doing lookups.
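On the wire, the difference between the two query modes is a single header bit, "recursion desired" (RD), defined in RFC 1035. The sketch below builds a minimal A-record query by hand; the function names are illustrative, label lengths are not validated, and no network traffic is sent.

```python
import struct

RD_FLAG = 0x0100          # "recursion desired" bit in the 16-bit flags field
TYPE_A, CLASS_IN = 1, 1   # QTYPE A, QCLASS IN


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, query_id: int, recursion_desired: bool) -> bytes:
    """Build a one-question DNS query message for an A record."""
    flags = RD_FLAG if recursion_desired else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", TYPE_A, CLASS_IN)
    return header + question


if __name__ == "__main__":
    recursive = build_query("example.com", 0x1234, recursion_desired=True)
    iterative = build_query("example.com", 0x1234, recursion_desired=False)
    print(recursive.hex())
    print(iterative.hex())
```

A stub client would typically set RD when asking its resolver, while the resolver's own queries to root, TLD, and authoritative servers are the iterative kind.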
Encrypted DNS (DoH and DoT): classic DNS travels in plaintext over UDP port 53, so anyone on the path (your ISP, a coffee-shop network) can see and even tamper with every name you look up. DNS over TLS (DoT, RFC 7858) wraps queries in a TLS tunnel on its own port, 853; DNS over HTTPS (DoH, RFC 8484) sends them as HTTPS requests on port 443 so they blend in with normal web traffic. Both hide the names you resolve from on-path observers and stop on-path tampering. The trade-off is that the resolver itself still sees every query, so resolution now depends on the privacy and availability of whichever resolver you trust.
Anycast — one address, many servers: the root and big public resolvers announce the same IP address from hundreds of locations worldwide using BGP anycast. Your packet to 1.1.1.1 or to a root server is routed to the nearest instance automatically. That is how '13 root servers' actually run on well over a thousand physical machines, and how a public resolver stays fast everywhere and absorbs attacks.
GeoDNS / latency-based routing: because the authoritative server chooses what answer to return, it can return a different IP depending on where the query came from — sending a user in Europe to a European data center and a user in Asia to an Asian one. CDNs lean on this heavily: DNS becomes a load-balancing and traffic-steering tool, not just a lookup. The catch is that the resolver's location, not the user's, is what the authoritative server sees — so a far-away resolver can route you to a far-away server.
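A toy sketch of the GeoDNS decision, assuming the authoritative server has some table mapping source networks to regions. The table, the region names, and all addresses here are hypothetical (they use the documentation ranges from RFC 5737); real providers use much richer geolocation and latency data.

```python
import ipaddress

# Hypothetical mapping from the querying resolver's network to a region.
REGION_BY_NETWORK = [
    (ipaddress.ip_network("192.0.2.0/24"), "eu"),
    (ipaddress.ip_network("198.51.100.0/24"), "asia"),
]

# Hypothetical data-center addresses returned per region.
ANSWER_BY_REGION = {
    "eu": "203.0.113.10",
    "asia": "203.0.113.20",
    "default": "203.0.113.30",
}


def geo_answer(resolver_ip: str) -> str:
    """Pick the A record to return based on where the query came from.

    Note the input is the resolver's address, not the end user's.
    """
    source = ipaddress.ip_address(resolver_ip)
    for network, region in REGION_BY_NETWORK:
        if source in network:
            return ANSWER_BY_REGION[region]
    return ANSWER_BY_REGION["default"]


if __name__ == "__main__":
    for ip in ("192.0.2.53", "198.51.100.7", "8.8.8.8"):
        print(ip, "->", geo_answer(ip))
```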
The trade-offs
DNS makes three big bets, and each one has a cost you feel in practice.
- Caching buys speed but costs freshness. A warm lookup is nearly free, but after you change a record, every cached copy keeps serving the old answer until its TTL expires — this is propagation delay. Lower the TTL before a planned change and old answers drain out faster; raise it afterward to cut lookup load.
- A delegated hierarchy buys scale but concentrates trust. Delegation lets billions of names be managed independently without any central bottleneck. But each domain depends on its TLD and authoritative servers being reachable, and on the resolver it asks being honest, so a problem high in the tree, or a compromised resolver, has a wide blast radius.
- UDP buys speed but limits size and security. DNS uses connectionless UDP for its tiny query/answer so there's no handshake — but UDP is easy to spoof and answers must be small, which pushed the design toward DNSSEC (signing) and encrypted transports (DoH/DoT) layered on top.
How DNS fails
- NXDOMAIN — the name doesn't exist. A typo, an expired domain, or a record that was never created: the authoritative server says 'no such name' and the lookup fails before any connection.
- Stale cache after a change — you updated a record but users still hit the old IP. They're being served a cached answer whose TTL hasn't expired yet; nothing is broken, you just have to wait it out (or you forgot to lower the TTL beforehand).
- Cache poisoning / spoofing — because classic DNS is unauthenticated UDP, an attacker who can forge a reply faster than the real one can plant a wrong IP in a resolver's cache, silently sending users to a malicious server. DNSSEC (RFC 4033) defends against this by cryptographically signing records so forged answers are rejected.
- Resolver or authoritative outage — if your resolver is down you can't look up anything; if a domain's authoritative servers are unreachable, that domain effectively disappears even though its web servers are fine.
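Some of these failures are visible in the response itself: the low four bits of the header flags carry a response code (RCODE), and NXDOMAIN is code 3 (RFC 1035). The sketch below reads that code from a raw message; the helper name `rcode_of` is illustrative. An outage, by contrast, usually shows up as a timeout with no response to parse at all.

```python
import struct

# A few well-known response codes from RFC 1035.
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def rcode_of(message: bytes) -> str:
    """Return the response code name from a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _query_id, flags = struct.unpack("!HH", message[:4])
    code = flags & 0x000F  # RCODE is the low 4 bits of the flags field
    return RCODE_NAMES.get(code, f"RCODE{code}")


if __name__ == "__main__":
    # Hand-built 12-byte headers: flags 0x8183 has QR, RD, RA set and RCODE 3.
    nxdomain = struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0)
    ok = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
    print(rcode_of(nxdomain), rcode_of(ok))
```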
How the pieces compare
| | Recursive resolver | Authoritative server |
|---|---|---|
| Role | Asks on your behalf | Holds the real records |
| Answers with | Whatever it finds/caches | Its own zone's data only |
| Caches? | Yes — that's its job | No — it is the source |
| Examples | 1.1.1.1, 8.8.8.8, ISP | Route 53, a domain's NS |

| | Hosts file | DNS |
|---|---|---|
| Where | One file on your machine | Distributed worldwide |
| Scope | Only your computer | The whole internet |
| Updates | Edit by hand | Owner edits, TTL propagates |
| Scale | A handful of names | Billions of names |

| Transport | Port | Encrypted? | Note |
|---|---|---|---|
| UDP | 53 | No | Default; fast, small answers |
| TCP | 53 | No | Fallback for big answers / zone transfer |
| DoT | 853 | Yes | DNS in a TLS tunnel (RFC 7858) |
| DoH | 443 | Yes | DNS as HTTPS, blends in (RFC 8484) |
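The same DNS message is carried differently by each transport in the table. As a rough sketch: over TCP (and therefore DoT) the message is prefixed with a two-byte length, and a DoH GET request carries it base64url-encoded without padding in a `dns` query parameter (RFC 8484). The helper names and the resolver URL below are hypothetical, and nothing is sent over the network.

```python
import base64
import struct


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length (TCP and DoT)."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for a 16-bit length prefix")
    return struct.pack("!H", len(message)) + message


def doh_get_url(resolver_url: str, message: bytes) -> str:
    """Build a DoH GET URL: base64url of the message, padding stripped."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


if __name__ == "__main__":
    message = b"\x00\x01"  # placeholder bytes standing in for a real query
    print(frame_for_tcp(message).hex())
    # "https://dns.example/dns-query" is a made-up resolver endpoint.
    print(doh_get_url("https://dns.example/dns-query", message))
```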
DNS in the wild
- The root is run as 13 named server systems (labeled a through m), operated by twelve organizations. Each 'server' is really many machines worldwide sharing one address via anycast (hundreds, for the largest letters), so the root is far more resilient than '13 boxes' suggests.
- Public recursive resolvers: Cloudflare's 1.1.1.1 and Google's 8.8.8.8 are anycast resolvers anyone can use, often faster and more private than an ISP's default.
- Managed authoritative DNS: AWS Route 53, Cloudflare, and NS1 host the authoritative records for huge numbers of domains, adding health checks and latency-based routing.
- CDNs use DNS as a steering layer: by returning a different IP per region (GeoDNS), providers like Akamai, Cloudflare, and Fastly send each user to a nearby edge — DNS doing load balancing, not just lookup.
- The 2016 Dyn outage showed the flip side: knock out the authoritative DNS and the sites behind it become unreachable even while perfectly healthy.
Common questions & gotchas
Why does a DNS change take time to 'propagate'?
Nothing is actively pushed out. Resolvers all over the world cached the old answer, and each keeps serving it until that cached copy's TTL expires. 'Propagation' is really just caches timing out one by one. Lower the record's TTL ahead of a planned change, at least one full old-TTL period beforehand (a day, if the old TTL was 24 hours), so the old answers drain quickly.
Recursive resolver vs authoritative server — what's the difference?
A recursive resolver asks on your behalf, walking the hierarchy and caching what it learns; it doesn't own any records. An authoritative server is the source of truth for a specific domain's records and answers only for that domain. The resolver is the librarian; the authoritative server is the shelf the book actually sits on.
What is a TTL?
Time-to-live: a number of seconds, set by the domain owner on each record, that tells resolvers how long they may cache the answer before looking it up again. It's the dial that trades freshness (low TTL, fast changes) against load (high TTL, fewer lookups).
What is NXDOMAIN?
The 'non-existent domain' response: authoritatively, no such name exists. You get it for typos, expired domains, or names that were never created. The lookup fails immediately — there's no IP to connect to.
Quiz: You change your site's IP and lower the TTL to 60s, but some users still hit the old server for hours. Most likely cause?
- DNS is broken and needs a restart
- The TTL change itself was cached at the old (long) value, so old copies linger until that expires
- NXDOMAIN is being returned
- The authoritative server never got the update
Answer: The TTL change itself was cached at the old (long) value, so old copies linger until that expires. Lowering the TTL only affects answers handed out after the change. Copies cached before it still carry the old, longer TTL and won't refresh until that expires, so the lower value only helps the next change, not this one. This is why you lower the TTL well ahead of a planned cutover, not at the moment of it.
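The timing in this answer can be worked through with a little arithmetic. The sketch below assumes every resolver honours TTLs exactly, and the 86400-second old TTL is a made-up value for illustration. A copy cached just before the TTL was lowered lives for the full old TTL; a copy cached just before the record changed lives for the new TTL.

```python
def stale_until(old_ttl: int, new_ttl: int,
                ttl_lowered_at: int, record_changed_at: int) -> int:
    """Latest time (in seconds) at which a cache may still serve the old IP."""
    if record_changed_at < ttl_lowered_at:
        raise ValueError("this sketch assumes the TTL is lowered first")
    # Copies cached before the TTL was lowered keep the old, long TTL.
    from_old_ttl = ttl_lowered_at + old_ttl
    # Copies cached after lowering, but before the change, keep the new TTL.
    from_new_ttl = record_changed_at + new_ttl
    return max(from_old_ttl, from_new_ttl)


def worst_case_staleness(old_ttl: int, new_ttl: int,
                         ttl_lowered_at: int, record_changed_at: int) -> int:
    """Seconds after the change during which the old IP may still be served."""
    return stale_until(old_ttl, new_ttl, ttl_lowered_at,
                       record_changed_at) - record_changed_at


if __name__ == "__main__":
    # Quiz scenario: TTL lowered and IP changed at the same moment.
    print(worst_case_staleness(86400, 60, 0, 0))
    # Planned cutover: TTL lowered more than one old-TTL period in advance.
    print(worst_case_staleness(86400, 60, 0, 90000))
```

Under these assumptions the quiz scenario leaves old answers in circulation for up to a full day, while lowering the TTL more than one old-TTL period in advance shrinks that window to the new 60 seconds.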
In an interview
Lead with the one-liner: DNS turns a human-readable name into a routable IP address before any request is sent, and it does it through a delegated hierarchy plus caching. Then trace the walk: the recursive resolver asks root → TLD → authoritative, where each level only knows the next, and the authoritative server returns the A record.
Show you understand the trade-off: caching with TTL makes warm lookups a single hop, but the price is propagation delay when records change. Name the dial (TTL), the failure modes (NXDOMAIN, stale cache, spoofing → DNSSEC), and the real systems (1.1.1.1, 8.8.8.8, Route 53, the anycast root). If the question is about fast failover or geo-routing, mention low TTLs and GeoDNS; if it's about availability, cite Dyn 2016 and multi-provider authoritative DNS.
Then open the simulator: watch a cold lookup walk root → TLD → authoritative, see the answer get cached, then run the same lookup again and watch it return in a single hop.
References & further reading
- RFC 1034 — Domain Names: Concepts and Facilities — the foundational DNS design (hierarchy, delegation, caching)
- RFC 1035 — Domain Names: Implementation and Specification — the wire format, record types, and resolver behavior
- Cloudflare Learning — What is DNS? — clear beginner-friendly walkthrough of the lookup
- RFC 8484 — DNS Queries over HTTPS (DoH) — the privacy-preserving HTTPS transport for DNS
- RFC 7858 — DNS over TLS (DoT) — the dedicated TLS transport for DNS
- RFC 4033 — DNS Security Introduction (DNSSEC) — signing records to defend against spoofing/poisoning
- IANA — Root Servers — the 13 root server identities and operators
- Cloudflare — announcing the 1.1.1.1 resolver — a real public anycast recursive resolver, explained
- AWS — Amazon Route 53 Developer Guide — managed authoritative DNS with health checks & geo routing
- Wikipedia — 2016 Dyn cyberattack — the DNS outage that took major sites offline