HTTP & APIs · System Internals

The big picture

TL;DR: the 30-second version

HTTP is a request/response protocol: the client sends a request (method + path + headers + optional body), the server sends a response (status line + headers + optional body). It rides on top of a TCP (or QUIC) connection.
The method says what action you want — GET reads, POST creates, PUT replaces, PATCH partially updates, DELETE removes. GET/HEAD are safe (read-only); GET, HEAD, PUT, DELETE are idempotent (repeating them is harmless); POST is neither.
The status code is the verdict, and the first digit tells the story: 1xx info, 2xx success, 3xx redirect, 4xx you (client) messed up, 5xx the server broke. 200, 201, 204, 301, 304, 400, 401, 403, 404, 429, 500, 503 are the ones to know.
HTTP is stateless — every request stands alone, carrying its own context (cookies, an Authorization token). Headers like Content-Type, Cache-Control, and ETag/If-None-Match make it cacheable and efficient. REST, RPC/gRPC, and GraphQL are three styles of API built on this same protocol.

Everything below expands on these points. Read the core sections top to bottom for the full mental model; the collapsible "Go deeper" boxes hold the advanced bits (conditional requests, idempotency keys, REST vs gRPC vs GraphQL, HTTP/1.1 vs /2 vs /3) you can skip on a first pass and return to later.

[Figure: One HTTP exchange: client asks, server answers. The client (browser / app) sends a request to the server, and the server answers with a response.]

Part	Request	Response
First line	GET /users/42 HTTP/1.1 — method · path · version	HTTP/1.1 200 OK — version · 3-digit status · reason
Headers	Host, Accept, Authorization: Bearer …	Content-Type, Cache-Control, ETag
Body	(none for a GET)	{"id":42,"name":"Ada"}

Start here: why a shared language was needed

TCP hands you a reliable pipe of bytes, but bytes alone are meaningless. If your browser opens a connection to a web server and just starts sending data, the server has no idea what those bytes mean. Is this a request for a page? An upload? A login? Where's the address of the thing you want? What format do you expect back? Two programs that have never met need to agree, in advance, on a structure for their messages — a protocol.

HTTP (HyperText Transfer Protocol) is that agreement for the web. It defines exactly how a request is laid out (a request line, then headers, then an optional body) and how a response is laid out (a status line, then headers, then an optional body). Because every client and every server follows the same rules, any browser can talk to any web server it has never seen before. That universality — one simple, text-based, human-readable format that everyone implements — is the reason the web could grow into a single interoperable system instead of millions of incompatible islands.

Why text and not a clever binary format? HTTP/1.x is deliberately plain text you can read with your own eyes: 'GET /home HTTP/1.1' is just a line you could type. That readability made it trivial to debug, implement, and extend; anyone could telnet to a server and speak it by hand. The web traded a little efficiency for massive interoperability, and it won. (Later versions, HTTP/2 and /3, switch to a compact binary framing for speed, but the same concepts of methods, paths, headers, and status codes carry straight over.)

Anatomy of a request and a response

A request and a response are each just a small, structured message with three parts. Once you can name those parts, every HTTP interaction stops being magic. If you've ever opened your browser's developer tools and watched the Network tab, you've been staring at exactly these pieces.

Request line: the method, the path, and the protocol version — e.g. 'GET /users/42 HTTP/1.1'. The method is the verb (what to do), the path is the noun (which resource), the version is which dialect of HTTP.
Request headers: 'Name: value' lines carrying metadata — Host (which site), Accept (what formats I'll take back), Content-Type (what format my body is in), Authorization (my credentials), User-Agent (what I am).
Request body (optional): the actual data being sent, for methods like POST/PUT/PATCH — a JSON document, form fields, or an uploaded file. GET requests normally have no body.
Status line: the response opens with the version, a 3-digit status code, and a short reason phrase — 'HTTP/1.1 200 OK' or 'HTTP/1.1 404 Not Found'. The client reads this first to know if it worked.
Response headers + body: headers describe the response (Content-Type, Content-Length, Cache-Control, Set-Cookie), and the body is the content itself — HTML, JSON, an image, or nothing at all (e.g. a 204 No Content).

The whole protocol in one sentence: the client sends 'method + path + headers + optional body'; the server replies 'status code + headers + optional body'. Everything else (REST, GraphQL, caching, auth, cookies) is a convention layered on top of that one shape.

Go deeper: what's literally on the wire

A raw HTTP/1.1 request is plain ASCII text. The first line is the request line; then come header lines, one per line; then a single blank line (CRLF) signals the end of the headers; then the optional body. The response mirrors it exactly: status line, headers, blank line, body. The blank line is load-bearing — it's how the receiver knows where the headers stop and the body begins. Content-Length (or chunked transfer-encoding) then tells it how many body bytes to read.
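As a rough illustration of the layout described above, the following Python sketch serializes a request and splits a raw response at the blank line. The helper names `build_request` and `parse_response` are hypothetical (they are not from any library), and the parser assumes a well-formed response that uses Content-Length rather than chunked transfer-encoding.

```python
def build_request(method, path, headers, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    if body:
        # Tell the receiver how many body bytes follow the blank line.
        headers = {**headers, "Content-Length": str(len(body))}
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines).encode("ascii")
    # The empty line (CRLF CRLF) marks the end of the headers.
    return head + b"\r\n\r\n" + body


def parse_response(raw):
    """Split a raw response into (status code, headers, body)."""
    head, _, rest = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    parts = status_line.split(" ", 2)  # version, status code, reason phrase
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = headers.get("content-length")
    # Assumption: without Content-Length, treat everything left as the body.
    body = rest[: int(length)] if length is not None else rest
    return int(parts[1]), headers, body
```

Calling `build_request("GET", "/users/42", {"Host": "example.com"})` yields the request line, one header line, and the terminating blank line, with no body.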

Remember the layering from the earlier topics: TCP gives a reliable byte stream, TLS (for HTTPS) encrypts that stream, and HTTP gives the stream request/response structure. A single HTTPS request therefore costs a TCP handshake, then a TLS handshake, then the HTTP exchange — which is exactly why reusing one connection for many requests (keep-alive, and HTTP/2 multiplexing) matters so much.

Methods: the verbs, and the safe/idempotent rules

The method is the verb of the request — it declares your intent. There are a handful in common use, and two properties decide how you're allowed to treat them: whether a method is safe (read-only, no side effects) and whether it's idempotent (sending it N times has the same effect as sending it once). These aren't pedantic labels — they're what makes safe retries, caching, and crash recovery possible.

Method	Intent	Safe?	Idempotent?	Typical body
GET	Read a resource	Yes	Yes	None
HEAD	Read headers only (no body)	Yes	Yes	None
POST	Create / submit / 'do something'	No	No	Yes
PUT	Replace a resource wholesale	No	Yes	Yes
PATCH	Partially update a resource	No	No	Yes
DELETE	Remove a resource	No	Yes	None

Safe means read-only: GET and HEAD must not change server state, so a crawler or a prefetcher can fire them freely. If a GET mutates data, you've built a trap — caches and bots will trigger it.
Idempotent means repeat-safe: GET, HEAD, PUT, and DELETE land in the same final state no matter how many times you send them. PUT /users/42 {name:'Ada'} sets the record to that value whether you send it once or five times; DELETE /users/42 leaves it deleted either way.
POST is the odd one out — neither safe nor idempotent. POST /orders twice usually creates two orders. That's why a lost response to a POST is dangerous to blindly retry, and why idempotency keys exist (see the deep dive below).
PATCH is also non-idempotent in general (e.g. 'increment balance by 10' applied twice differs from once), though many PATCH bodies happen to be idempotent in practice.
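The difference between idempotent and non-idempotent methods can be sketched with a toy in-memory store. `UserStore` and its methods are hypothetical names invented for this illustration; a real server would sit behind an HTTP framework and a database.

```python
import itertools


class UserStore:
    """Toy in-memory collection standing in for a /users resource."""

    def __init__(self):
        self.users = {}
        self._next_id = itertools.count(1)

    def put(self, user_id, data):
        # PUT replaces the resource wholesale: repeating it changes nothing more.
        self.users[user_id] = dict(data)
        return user_id

    def post(self, data):
        # POST lets the server pick a new ID: repeating it creates another record.
        user_id = next(self._next_id)
        self.users[user_id] = dict(data)
        return user_id

    def delete(self, user_id):
        # DELETE leaves the resource absent whether or not it existed before.
        self.users.pop(user_id, None)
```

Sending `put(42, {"name": "Ada"})` five times leaves exactly one record for user 42, while calling `post({"name": "Ada"})` twice leaves two separate records.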

Go deeper: idempotency keys and safe retries

The failure model from client↔server bites hardest here: a client sends 'POST /payments', the server charges the card, but the response is lost on the way back. The client sees only silence and can't tell 'never happened' from 'happened, reply lost'. Retrying a non-idempotent POST risks charging twice.

The fix is an idempotency key: the client generates a unique ID and sends it in a header (e.g. 'Idempotency-Key: 7f3a…') with the POST. The server records the key the first time it processes the request and, if it sees the same key again, returns the original result instead of doing the work twice. This is how payment APIs like Stripe make POST safe to retry. The general principle: make an operation idempotent (by design or by a dedup key) and retries become trustworthy — the same lesson you met in client↔server, now made concrete.
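A minimal sketch of the server-side deduplication described above, assuming a single process and an in-memory dictionary. `PaymentService` and `post_payment` are hypothetical names; a production service would also need durable storage, key expiry, and protection against two concurrent requests with the same key.

```python
import uuid


class PaymentService:
    """Toy handler for POST /payments that deduplicates by idempotency key."""

    def __init__(self):
        self.charges = []      # side effects actually performed
        self._results = {}     # idempotency key -> result of the first attempt

    def post_payment(self, idempotency_key, amount_cents):
        if idempotency_key in self._results:
            # Same key seen before: replay the original result, do no new work.
            return self._results[idempotency_key]
        self.charges.append(amount_cents)  # the "charge the card" side effect
        result = {"status": 201, "charge_id": len(self.charges)}
        self._results[idempotency_key] = result
        return result


def new_idempotency_key():
    """Client side: one fresh random key per logical operation, reused on retries."""
    return str(uuid.uuid4())
```

The client generates the key once, sends it with the first attempt, and sends the same key again on every retry of that payment, so a lost response can be retried without a second charge.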

Status codes: the 3-digit verdict

Every response opens with a status code, and the first digit tells you the whole category before you read another byte. Learn the five classes and a dozen common members and you can read almost any HTTP error at a glance.

Class	Meaning	Common members
1xx	Informational — interim, keep going	100 Continue, 101 Switching Protocols
2xx	Success — it worked	200 OK, 201 Created, 204 No Content
3xx	Redirect — look elsewhere	301 Moved Permanently, 302 Found, 304 Not Modified
4xx	Client error — you messed up	400, 401, 403, 404, 409, 429
5xx	Server error — it broke	500, 502, 503, 504
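Because the class is carried by the first digit, a client can branch on it with integer division. The sketch below is illustrative only; `status_class` is a hypothetical helper name.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Return the class of an HTTP status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For example, `status_class(404)` falls in the client-error class and `status_class(503)` in the server-error class, which is often all a caller needs to decide whether the fault is its own.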

200 OK: success with a body. 201 Created: a POST/PUT made a new resource (often with a Location header pointing to it). 204 No Content: success, deliberately no body (common for DELETE).
301 Moved Permanently: the resource now lives at a new URL for good (caches and browsers remember it). 304 Not Modified: your cached copy is still valid, so the response carries no body and you reuse what you have (the conditional-request payoff, below).
400 Bad Request: malformed or invalid input. 401 Unauthorized: you're not authenticated (missing or invalid credentials). 403 Forbidden: authenticated but not allowed. 404 Not Found: no such resource. 429 Too Many Requests: you're rate-limited, slow down (often with a Retry-After header).
500 Internal Server Error: an unhandled failure on the server. 503 Service Unavailable: temporarily overloaded or down (also often with Retry-After). 502 Bad Gateway / 504 Gateway Timeout: a gateway or proxy got a bad response (502) or no timely response (504) from an upstream server.

401 vs 403, the classic mix-up: 401 Unauthorized actually means unauthenticated ('I don't know who you are; send credentials'). 403 Forbidden means authenticated but not authorized ('I know who you are, and you still can't do this'). The name of 401 is historically misleading; remember each code by what it says to the client: 401 = who are you?, 403 = you're not allowed.

Headers, caching, and conditional requests

There's no Big-O for 'send an HTTP request' — the cost that matters is round-trips and bytes over the wire, exactly as in client↔server. Headers are the levers that let HTTP cut both. The single biggest one is caching: if the client (or a CDN in between) can reuse a previous response, you skip the request entirely, or shrink it to a tiny 'still fresh?' check.

Content-Type — what format the body is (application/json, text/html, image/png). The matching Accept request header says what the client is willing to receive.
Cache-Control — the caching policy: 'max-age=60' (fresh for 60s), 'no-store' (never cache), 'private' (only the browser, not shared CDNs). This decides whether a response can be reused at all.
ETag — a short fingerprint (version tag) the server attaches to a response body. If the body changes, the ETag changes.
Authorization — the caller's credentials, typically 'Bearer <token>' for an API token or a session. Sent on every request because HTTP is stateless (see below).
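To make the Cache-Control directives above concrete, here is a small sketch of how a cache might decide whether a stored response can be reused without contacting the server. `parse_cache_control` and `is_fresh` are hypothetical helper names, and only the 'max-age' and 'no-store' directives mentioned above are handled; real caches follow many more rules.

```python
def parse_cache_control(value):
    """Parse 'max-age=60, private' into {'max-age': '60', 'private': None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_fresh(cache_control, age_seconds):
    """True if a stored response of the given age may be reused as-is."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives:
        return False  # the response should never have been stored at all
    max_age = directives.get("max-age")
    # Assumption: a response without a usable max-age is treated as not fresh.
    return max_age is not None and max_age.isdigit() and age_seconds < int(max_age)
```

With 'max-age=60', a 30-second-old copy is reusable and a 61-second-old copy is not; the stale copy is where a conditional request comes in.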

Conditional requests turn a cache into a near-free freshness check. After the client has a cached copy tagged ETag "a1b2c3", its next request includes 'If-None-Match: "a1b2c3"'. If the resource hasn't changed, the server replies '304 Not Modified' with no body — the client reuses what it already has. Only if the resource changed does the server send a full 200 with the new body. A 304 is a tiny header-only round-trip instead of re-downloading the whole payload.
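The server side of this exchange can be sketched in a few lines. This is a simplified illustration: `make_etag` and `handle_get` are hypothetical names, the ETag is assumed to be a truncated SHA-256 of the body (servers are free to choose any fingerprint), and only a single exact-match If-None-Match value is handled.

```python
import hashlib


def make_etag(body):
    """Derive a quoted fingerprint from the body; it changes when the body changes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(body, request_headers):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy matches: answer with headers only.
        return 304, {"ETag": etag}, b""
    # First visit, or the resource changed: send the full body and current ETag.
    return 200, {"ETag": etag}, body
```

A first call with no If-None-Match header returns 200 with the body; repeating the call with the returned ETag returns 304 and an empty body until the content changes.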

[Figure: Conditional GET: the ETag → 304 dance. On the first visit the client (which has a cache) sends GET /img, and the server (which owns the body) replies 200 OK with ETag "v7" and a 2 MB body; the client caches the body and its ETag. Later, to check freshness, the client sends GET /img with If-None-Match: "v7"; the ETag still matches, so the server replies 304 Not Modified with no body.]

Predict: a 2 MB image is cached with ETag "v7". The client revalidates with 'If-None-Match: "v7"' and the image hasn't changed. Roughly how many bytes of body come back, and what's the status?

Hint: What does 'Not Modified' mean for the body?

Zero bytes of body, and the status is 304 Not Modified. The whole point of the ETag + If-None-Match dance is that an unchanged resource costs one small round-trip of headers (a few hundred bytes) instead of re-sending the 2 MB. If the image had changed, you'd instead get a 200 OK with the full new 2 MB body and a new ETag. This is how browsers and CDNs avoid re-downloading unchanged assets on every visit.

Go deeper: statelessness and where the state actually goes

HTTP is stateless: the protocol itself remembers nothing between requests. Each request must carry everything the server needs to handle it — there's no implicit 'we were in the middle of something'. That sounds limiting, but it's a feature: because any request is self-contained, any server replica can handle any request, which is what makes load balancing and horizontal scaling easy (a later topic).

So how do logins and shopping carts work if nothing is remembered? The state is carried explicitly. A cookie (set by the server via Set-Cookie, returned by the client on every later request) or a token in the Authorization header travels with each request and identifies the session. The actual session data lives in a shared store (a database or cache) the stateless servers all read from — or is signed into the token itself (a JWT). The server stays stateless; the client carries the thread of continuity.
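The "signed into the token itself" option can be sketched with an HMAC. This is not the JWT format, only the same idea in miniature: the server signs the claims with a secret, hands the token to the client, and later verifies the signature instead of looking anything up. `SECRET`, `sign_token`, and `verify_token` are hypothetical names, and the sketch omits expiry and key rotation.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical; a real secret is never hard-coded


def sign_token(claims, secret=SECRET):
    """Encode the claims and append an HMAC-SHA256 signature over them."""
    raw = json.dumps(claims, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    signature = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_token(token, secret=SECRET):
    """Return the claims if the signature checks out, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered with, or signed with a different secret
    return json.loads(base64.urlsafe_b64decode(payload))
```

Any replica that knows the secret can verify the token on its own, which is what keeps the servers stateless while the client carries the session.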

API styles built on HTTP: REST, RPC/gRPC, GraphQL

Raw HTTP gives you methods, paths, and status codes. An API style is a convention for how to use them — how requests are named and shaped, and what the payloads look like. Three dominate, and they sit at different points on the same protocol.

Go deeper: how each style thinks

REST treats everything as a resource with a URL and uses HTTP methods as the verbs and status codes as the outcomes: GET /users/42 reads, POST /users creates, PUT /users/42 replaces, DELETE /users/42 removes, and a 201 or 404 reports what happened. It leans fully into HTTP, which makes it cache-friendly (GETs cache naturally), uniform, and the default for public web APIs. The downside: a rich screen may need several round-trips (one per resource), and an endpoint often returns more fields than you need (over-fetching) or too few (under-fetching).

RPC (Remote Procedure Call) flips the model from nouns to verbs: instead of 'fetch this resource' you 'call this function on the server' — getUser(42) — as if it were local. gRPC is the popular modern incarnation: it serializes messages with Protocol Buffers (a compact binary schema-defined format) and runs over HTTP/2, which gives it multiplexing and first-class streaming (client-, server-, and bidirectional streams). It's fast and strongly typed from a shared .proto schema, making it a favorite for internal service-to-service calls — at the cost of being binary (not human-readable) and harder to cache or call straight from a browser.

GraphQL takes a third angle: a single endpoint (usually POST /graphql) where the client sends a query describing exactly the fields it wants, possibly spanning what would be several REST endpoints, and the server returns precisely that shape. It kills over- and under-fetching and is great for rich, evolving UIs — at the cost of more complex server execution, weaker HTTP caching (most queries are POSTs to one URL), and the risk of expensive client-crafted queries.
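The three mindsets can be contrasted with a toy lookup of the same user. Everything here is a hypothetical stand-in written as plain Python functions: no real HTTP, gRPC, or GraphQL library is involved, and `USERS`, `rest_get`, `rpc_get_user`, and `graphql_user` are invented names.

```python
USERS = {42: {"id": 42, "name": "Ada", "email": "ada@example.com", "bio": "Mathematician"}}


def rest_get(path):
    """REST style: the URL names a resource; the status code reports the outcome."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "users" and parts[1].isdigit():
        user = USERS.get(int(parts[1]))
        if user is not None:
            return 200, user  # the whole resource, wanted or not
    return 404, None


def rpc_get_user(user_id):
    """RPC style: call a named function on the server as if it were local."""
    return USERS[user_id]


def graphql_user(user_id, fields):
    """GraphQL style: the caller lists the fields; the reply has exactly that shape."""
    user = USERS[user_id]
    return {"data": {"user": {field: user[field] for field in fields}}}
```

Here `rest_get("/users/42")` returns all four fields, while `graphql_user(42, ["name"])` returns only the name, which is the over-fetching difference described above.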

They're not mutually exclusive: big systems mix them, with a public REST or GraphQL API at the edge for clients and gRPC between internal microservices where speed and typed contracts matter most. The choice is per-boundary, not per-company.

What HTTP buys, and what it costs

HTTP won because of what it gives you almost for free; its costs are the flip side of those same choices.

Strength (universality & simplicity): one text-based request/response format every client and server implements, so anything can talk to anything.
Strength (statelessness): self-contained requests make caching, load balancing, and horizontal scaling natural.
Strength (built-in semantics): methods, status codes, and caching headers give you a shared vocabulary for reads/writes, errors, and freshness without inventing your own.
Cost (verbosity): text headers repeated on every request add bytes; HTTP/1.1 especially is chatty. HTTP/2's header compression and binary framing exist largely to claw this back.
Cost (round-trips & head-of-line blocking): each resource can be its own round-trip, and HTTP/1.1 serializes requests on a connection so one slow response stalls the rest. This drove HTTP/2 multiplexing and then HTTP/3.
Cost (caching has to be earned): statelessness enables caching, but getting Cache-Control/ETag right is subtle, and stale or wrongly-cached responses are a classic source of bugs.

Statelessness is the load-bearing trade: by refusing to remember anything between requests, HTTP makes each request portable to any server, the foundation the entire scaling/replication/load-balancing curriculum is built on. The price is that clients must re-send context (cookies, tokens) every single time, and truly stateful interactions (sessions, carts) need an explicit store behind the stateless front.

HTTP/1.1 vs HTTP/2 vs HTTP/3

The semantics — methods, status codes, headers — have stayed remarkably stable across versions. What changed is how the messages are framed and carried, each version attacking the round-trip and head-of-line-blocking costs of the last. This is where HTTP meets the TCP/UDP topic head-on.

HTTP/1.1 — text-based, one request at a time per connection (with keep-alive to reuse the connection). Parallelism means opening several TCP connections. A slow response blocks everything queued behind it on that connection (head-of-line blocking).
HTTP/2 — binary framing and multiplexing: many concurrent requests (streams) share one TCP connection, plus header compression (HPACK). It fixes HTTP-level head-of-line blocking, but because it still rides one TCP stream, a single lost packet stalls all streams (TCP-level head-of-line blocking).
HTTP/3 — runs over QUIC, which is built on UDP (not TCP). QUIC gives each stream its own delivery, so one lost packet only stalls its own stream, and it folds the transport + TLS handshake together to connect in fewer round-trips. This is the payoff of the TCP-vs-UDP trade-off: rebuild reliability on top of UDP to escape TCP's single-stream stalling.

API styles at a glance (REST vs RPC/gRPC vs GraphQL):

Aspect	REST	RPC / gRPC	GraphQL
Transport	HTTP/1.1 or /2, many endpoints	HTTP/2 (required)	HTTP, one endpoint (usually POST /graphql)
Payload	Usually JSON (text)	Protobuf (compact binary)	JSON, shaped by the query
Schema	Optional (OpenAPI, by convention)	Required (.proto), strongly typed	Required (GraphQL schema/SDL)
Streaming	Limited (SSE, long-poll)	First-class (client/server/bidi)	Subscriptions (often over WebSocket)
Caching	Excellent — native HTTP GET caching	Weak — not HTTP-cache friendly	Weak — POSTs to one URL, needs app-level
Best use	Public web APIs, simple CRUD	Fast internal service-to-service	Rich UIs avoiding over-/under-fetching

HTTP in the wild

Loading a web page: your browser fires dozens of HTTP requests — the HTML, then CSS, JS, images, fonts — each a GET, many answered from cache with 304s or straight from a CDN. The Network tab shows every method, status, and header.
Mobile apps → REST/GraphQL APIs: the app is a client making HTTP requests to backend APIs to load a feed (GET), post a message (POST), or sync data — with an Authorization token on each request because the server is stateless.
CDNs and caching: Cloudflare, Akamai, and Fastly sit between client and origin, serving cacheable responses (driven by Cache-Control/ETag) without ever touching the origin server — round-trips removed at planetary scale.
Internal microservices: large backends use gRPC over HTTP/2 between services for speed and typed contracts, while exposing REST or GraphQL at the public edge.
Rate limiting & auth in headers: APIs return 429 Too Many Requests with Retry-After when you exceed your quota, 401 when your token is missing/expired, and 403 when you lack permission — all standardized so any client can react.

You already use the layered stack: a single tap nests every earlier topic. DNS resolves the API hostname to an IP, TCP (or QUIC) opens a connection, TLS encrypts it, and HTTP carries the request/response. HTTP is the top layer you actually read and write; the others quietly carry it.

Common questions & gotchas

What's the real difference between PUT and POST?

Intent and idempotency. PUT replaces a resource at a known URL and is idempotent — PUT /users/42 with the same body any number of times leaves user 42 in the same state. POST means 'create or do something' and is not idempotent — POST /users twice typically creates two users. Use PUT when the client decides the resource's identity/URL; use POST when the server does (e.g. assigns a new ID).

Is GET really guaranteed to not change anything?

By the spec, yes — GET is 'safe', meaning read-only with no side effects. The protocol can't enforce it, but you must honor it: browsers prefetch, crawlers crawl, and proxies cache GETs freely. If a GET mutates state, those automated readers will trigger the mutation unexpectedly. Put any state change behind POST/PUT/PATCH/DELETE.

Why send the Authorization header on every request? Doesn't the server remember me?

No — HTTP is stateless, so the server remembers nothing between requests. Each request must carry its own proof of identity (a token or cookie). That's precisely what lets any server replica handle any request, which is what makes horizontal scaling easy. The session data itself lives in a shared store or is signed into the token.

When is it safe to retry a failed request?

Safe when the method is idempotent (GET, PUT, DELETE) — repeating it can't double-apply. Risky for POST, because a lost response might mean the action already succeeded; retrying could duplicate it. The fix is an idempotency key the server uses to deduplicate, which is how payment APIs make POST retry-safe.
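That rule of thumb can be written down as a tiny client-side policy. The sketch is illustrative: `safe_to_retry` and `retry_delay_seconds` are hypothetical helper names, and only the delay-in-seconds form of Retry-After is handled (the header may also carry an HTTP date, which falls back to the default here).

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE"}


def safe_to_retry(method, has_idempotency_key=False):
    """A retry cannot double-apply if the method is idempotent or the server dedups."""
    return method.upper() in IDEMPOTENT_METHODS or has_idempotency_key


def retry_delay_seconds(response_headers, default=1.0):
    """Honor a numeric Retry-After (as sent with 429 or 503), else use a default."""
    value = response_headers.get("Retry-After", "")
    return float(value) if value.isdigit() else default
```

A plain POST is reported as unsafe to retry, while the same POST carrying an idempotency key is reported as safe.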

Quiz: a client POSTs a new order, gets no response (timeout), and retries. Two orders are created. What's the correct fix?

Switch the endpoint to GET so it can be retried safely
Increase the client timeout so it waits longer
Attach an idempotency key so the server recognizes and ignores the duplicate POST
Never retry any request that failed

Answer:

Attach an idempotency key so the server recognizes and ignores the duplicate POST — The first POST actually succeeded — only its response was lost — so the retry created a second order. GET is wrong because creating an order is a state change and must not be a 'safe' method. A longer timeout doesn't help once the work is already done. The right fix is an idempotency key: the client sends a unique key with the POST, and the server records it so a retry with the same key returns the original result instead of creating a duplicate.

In an interview

Lead with the shape: HTTP is a stateless request/response protocol over a TCP/QUIC connection — the client sends method + path + headers + optional body, the server replies status code + headers + optional body. Name the method semantics (GET/HEAD safe; GET/PUT/DELETE idempotent; POST neither) and the status classes (2xx ok, 3xx redirect, 4xx client error, 5xx server error), because that vocabulary signals fluency instantly.

Then show depth on the things that have consequences: statelessness (why it makes scaling and load balancing easy, and where session state actually goes), idempotency + idempotency keys (how to retry safely after a lost response), and caching via Cache-Control + ETag/If-None-Match → 304 (how to cut round-trips and bytes). If asked to design an API, contrast REST (resources + verbs + status codes, cacheable, public-facing), gRPC (Protobuf over HTTP/2, streaming, fast internal calls), and GraphQL (one endpoint, client-specified queries, no over-/under-fetching).

Close by connecting down the stack: HTTP/1.1 → HTTP/2 (multiplexing over one TCP connection) → HTTP/3 (QUIC over UDP to escape TCP head-of-line blocking) ties straight back to the TCP-vs-UDP trade-off, and the whole exchange sits on top of DNS resolution and the client↔server round-trip you already know. Then open the simulator and watch a request's method, headers, and status code flow end to end.


Ready to try it?

The simulator is a real, deterministic implementation — pick a scenario and step through it, scrubbing the timeline forward and backward through every change.

Try these in the simulator

A tour of status codes: one of each, namely a 200, a 304 revalidation, a 201 create, a 404, and a 503 that an idempotent retry recovers.
Conditional GET (ETag → 304): the first GET is cached with an ETag; the repeat revalidates and comes back 304 Not Modified, with no body resent.
Idempotent retry past a 503: a GET hits a 503, and because GET is idempotent it is safely retried and succeeds with 200.
Create and not-found: a POST creates a resource (201), which is neither safe nor idempotent, and a GET to an unknown path returns 404.
