HTTP, the protocol your servers speak.
A first-principles walkthrough of the application-layer protocol every backend touches — from statelessness and method semantics to caching, conditional requests, proxies, and TLS. Written to explain not just what each piece does but why it exists and how it works underneath. Implementations in both Go and Python, grounded in MDN and RFC 9110.
Core Principles of HTTP
HTTP (HyperText Transfer Protocol) is an application-layer protocol — Layer 7 of the OSI
model — for transmitting hypermedia and, just as importantly for us, for programmatic access to APIs and
machine-to-machine communication. "Application layer" means it's the topmost layer, the one that defines the
meaning of a message (this is a GET for /users/5) rather than how bytes physically move
across wires. As backend engineers we live almost entirely here; the layers below (TCP segments, IP packets,
Ethernet frames) are abstracted away as "a reliable pipe." Two ideas sit at HTTP's heart.
1 · Statelessness
The server keeps no session memory between two requests. Every request is fully self-contained: it must carry everything the server needs to process it — auth tokens, cookies, the target resource, the method, any preferences. The moment the server finishes responding, it discards all knowledge of that request. A second request from the same client a millisecond later is treated as a brand-new, completely unrelated event. The protocol has amnesia by design.
Why statelessness is a deliberate feature, not a limitation
It would seem simpler for a server to "remember" you. Statelessness is chosen precisely because forgetting buys three properties that matter enormously at scale:
- Simplicity — the server doesn't allocate, track, expire, or garbage-collect per-user session state in memory. Each request is handled in isolation, so the handler logic is a pure function of its inputs. Fewer moving parts means fewer bugs.
- Horizontal scalability — because no server "owns" a conversation, any request can be routed to any server in a fleet. A load balancer can spray ten requests from one user across ten different machines and each handles its request correctly, because each request carries its own context. Without statelessness you'd need "sticky sessions" pinning each user to one box, which wastes capacity and breaks when that box dies.
- Crash resilience — if a node dies mid-flight, there is no in-memory session to reconstruct. The client simply retries, the load balancer sends it elsewhere, and nothing is lost. Stateful servers, by contrast, lose every active session when they crash.
The tradeoff is that continuity must be rebuilt on every request. Since the protocol itself forgets, we re-establish identity each time using cookies, server-side sessions, or tokens — re-sending a small credential with every request so the server can re-look-up who you are. Those state-management techniques are detailed in §8; the key point here is that they sit on top of a stateless protocol rather than changing its nature.
2 · The Client–Server Model
Communication is always initiated by the client — a browser, a mobile app, or another backend service. The client decides what it wants, assembles a complete request (URL, method, headers, body), and sends it. The server is fundamentally passive: it hosts resources, listens, and only ever speaks in response to a request. It cannot, in classic HTTP, spontaneously push data to a client that hasn't asked. This asymmetry is why "the server can't notify the browser of a new message" was historically solved with polling, and why WebSockets and Server-Sent Events (§19, §17) exist to break out of the strict request/response shape when you genuinely need server push.
The transport beneath: why TCP
HTTP doesn't move bytes itself; it delegates that to a transport protocol. It has exactly one hard requirement of that transport: reliability. Messages must arrive intact and in order, or the transport must signal an error — HTTP cannot tolerate silently dropped or reordered bytes, because a half-delivered header or body is meaningless. The internet's two dominant transports are TCP and UDP:
- TCP is connection-based and reliable: it performs a 3-way handshake (SYN → SYN-ACK → ACK) to establish a connection, numbers every byte, acknowledges receipt, retransmits losses, and delivers data in order. This is what HTTP/1 and HTTP/2 ride on.
- UDP is connectionless and "fire-and-forget" — fast but with no delivery guarantees. Raw HTTP can't use it directly, but HTTP/3's QUIC layer rebuilds reliability on top of UDP (§2).
The TCP handshake and any TLS negotiation happen below Layer 7 — they're transport/network-engineering concerns. We treat them as: "a reliable, possibly encrypted pipe was established, and now request and response messages flow over it." That abstraction is enough to reason about and debug the overwhelming majority of backend behavior.
Three properties that classify every method
Three precise terms from the spec recur throughout this manual and govern how methods behave, how caches treat them, and whether clients may retry them. They're previewed here and dissected in §6:
- Safe — the request is read-only; issuing it does not change server state (e.g. GET, HEAD). Crawlers and link prefetchers rely on this.
- Idempotent — issuing the request once or N times leaves the server in the same final state (e.g. GET, PUT, DELETE). This is the property that makes a request safe to retry after a timeout.
- Cacheable — the response may be stored and reused to satisfy a later equivalent request (e.g. GET, HEAD), saving a round trip.
The Evolution of HTTP
Every HTTP version kept the same request/response semantics but reworked how the underlying connection is used, each chasing lower latency. Understanding the progression explains why certain performance problems exist and how modern HTTP solves them.
| Version | Year | Key change & connection behavior |
|---|---|---|
| HTTP/0.9 | 1991 | A one-line protocol: GET /page and the response was raw HTML. No headers, no status codes, no other methods, no metadata of any kind. |
| HTTP/1.0 | 1996 | Added headers, status codes, POST, and content types: the structure we still use. But it opened a fresh TCP connection for every request and closed it after the response, paying the full handshake cost each time. |
| HTTP/1.1 | 1997 | Persistent connections (keep-alive) by default, pipelining, chunked transfer encoding, the mandatory Host header, and richer caching. The dominant version for two decades. |
| HTTP/2 | 2015 | Multiplexing, binary framing, header compression (HPACK), and server push. |
| HTTP/3 | 2022 | Abandons TCP for QUIC over UDP. |
The problems each version solved
1.0's wastefulness. Establishing a TCP connection costs a round trip (the handshake) before a single byte of HTTP flows. Doing that for every image, script, and stylesheet on a page — dozens of connections — added enormous latency. 1.1's persistent connections fixed this by reusing one TCP connection for many request/response pairs (§16), amortizing the handshake.
The Host header. 1.1 made Host mandatory, which enabled virtual
hosting: one IP address and one server can host many domains, routing each request by its
Host header. Essentially the entire shared-hosting and reverse-proxy world depends on this.
Head-of-line blocking. Even with persistent connections, HTTP/1.1 processes requests on a connection serially — response 2 can't start until response 1 finishes. A single slow response stalls everything queued behind it. HTTP/2's multiplexing solves this at the application layer: many independent streams share one connection, and frames from different streams interleave, so a slow response no longer blocks others. HTTP/2 also switched from verbose text to a compact binary framing layer and added HPACK header compression (repeated headers like cookies and user-agents are sent once and referenced), plus server push (proactively sending resources the client will need — now largely deprecated in favor of preload hints).
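The effect of head-of-line blocking can be illustrated with a toy model. This is a sketch under strong assumptions: the connection carries one unit of data per tick, and "multiplexing" is plain round-robin between streams, which is not how a real HTTP/2 scheduler prioritizes. The function names are made up for this example.

```python
def serial_completion(sizes: list[int]) -> list[int]:
    """HTTP/1.1 style: each response must finish before the next one starts."""
    finished, clock = [], 0
    for size in sizes:
        clock += size
        finished.append(clock)
    return finished


def multiplexed_completion(sizes: list[int]) -> list[int]:
    """HTTP/2 style (toy): streams take turns sending one unit per tick."""
    remaining = list(sizes)
    finished = [0] * len(sizes)
    clock = 0
    while any(left > 0 for left in remaining):
        for stream, left in enumerate(remaining):
            if left > 0:
                clock += 1
                remaining[stream] -= 1
                if remaining[stream] == 0:
                    finished[stream] = clock
    return finished


# One slow response (6 units) queued ahead of two small ones (1 unit each).
print(serial_completion([6, 1, 1]))       # [6, 7, 8]
print(multiplexed_completion([6, 1, 1]))  # [8, 2, 3]
```

The total time is the same in both cases; what changes is that the two small responses no longer wait behind the large one.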
TCP's own head-of-line blocking. HTTP/2 removed it at the HTTP layer, but a deeper problem remained: because all streams share one TCP connection, a single lost TCP packet stalls every stream until it's retransmitted, since TCP guarantees in-order delivery of the whole byte stream. HTTP/3's QUIC fixes this by running over UDP and implementing per-stream reliability itself: a lost packet only blocks the stream it belonged to. QUIC also folds the TLS handshake into the connection setup, so a secure connection establishes in fewer round trips (often zero on resumption).
Across every version the mental model is identical: a client and server establish a connection, then messages flow back and forth. The versions only optimize the plumbing — connection reuse, parallelism, compression, and which transport carries it all.
Anatomy of Messages & MIME Types
All HTTP communication is two structured text messages — a request and a response — that share one skeleton: a start line, a block of headers, a single blank line, then an optional body. That blank line is not decoration: it's the unambiguous signal that headers have ended and the body (if any) begins. Parsers rely on it absolutely.
The request message, line by line
POST /api/v1/notes HTTP/1.1 # request line: method · target · version
Host: api.example.com # mandatory in HTTP/1.1 (virtual hosting)
User-Agent: curl/8.4.0
Authorization: Bearer eyJhbGc... # headers: case-insensitive key: value
Content-Type: application/json # what the body IS
Content-Length: 33 # body size in bytes — tells server where it ends
# blank line: headers end, body begins
{"title":"buy milk","done":false} # request body
The request line has three parts: the method (the intent — §6), the request
target (usually a path plus optional query string, like /notes?done=false), and the
protocol version. The Host header is separate from the target because one server may
host many domains. Content-Length matters more than it looks: on a persistent connection
where many messages stream back-to-back, it tells the receiver exactly how many body bytes to read before the
next message begins — get it wrong and you desynchronize the whole connection.
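The framing rules above (blank line ends the headers, Content-Length bounds the body) can be sketched as a small parser. This is a simplified illustration, not a complete HTTP/1.1 implementation: it ignores chunked transfer encoding, repeated headers, and size limits, and `parse_request` is a hypothetical helper name.

```python
def parse_request(raw: bytes) -> tuple[dict, bytes]:
    """Split one HTTP/1.1 request off the front of `raw`.

    Returns (request, rest), where `rest` holds any bytes that belong to
    the next message on the same persistent connection.
    """
    head, sep, after = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    if not sep:
        raise ValueError("incomplete message: no blank line yet")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    length = int(headers.get("content-length", "0"))
    if len(after) < length:
        raise ValueError("incomplete message: body still arriving")
    body, rest = after[:length], after[length:]
    request = {"method": method, "target": target, "version": version,
               "headers": headers, "body": body}
    return request, rest
```

If Content-Length were too small, the leftover body bytes would be parsed as the start of the next request, which is the desynchronization described above.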
The response message
HTTP/1.1 201 Created # status line: version · code · reason phrase
Content-Type: application/json
Content-Length: 41
Date: Sat, 30 May 2026 12:00:00 GMT
# blank line
{"id":42,"title":"buy milk","done":false}
The status line carries the version, the numeric status code the client branches on (§10), and a human-readable reason phrase ("Created", "Not Found"). The reason phrase is purely informational — clients must act on the number, never the text, since the text varies and is dropped entirely in HTTP/2.
MIME types — the Content-Type vocabulary
Since HTTP/1.0, the same protocol carries any kind of content — JSON, HTML, images, video, binary blobs. The
receiver needs to know which, and that's the job of the MIME type (media type) in the
Content-Type header. Its grammar is type/subtype with optional parameters:
application/json; charset=utf-8 or multipart/form-data; boundary=----X7. The
charset parameter tells the parser the text encoding; omitting it on text can cause mojibake.
Getting Content-Type wrong is a common bug: send JSON labeled text/plain and many
clients won't parse it; a server may reject a body whose declared type it doesn't accept with 415 Unsupported Media Type.
| MIME type | Used for |
|---|---|
| application/json | The default for modern APIs. |
| application/x-www-form-urlencoded | Classic HTML form posts, encoded as key=val&k2=v2. |
| multipart/form-data | File uploads and mixed binary fields (§17). |
| text/html, text/plain | Web pages and plain text. |
| application/octet-stream | Arbitrary "just bytes"; usually triggers a file download. |
| text/event-stream | Server-Sent Events streaming (§17). |
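As a rough illustration of the type/subtype-plus-parameters grammar, the sketch below splits a Content-Type value and uses it to decide how to decode a body. It is deliberately naive (it would mis-split a quoted parameter containing a semicolon), and `parse_media_type` and `decode_body` are hypothetical helpers, not framework APIs.

```python
import json


def parse_media_type(value: str) -> tuple[str, dict[str, str]]:
    """Return (media type, parameters) from a Content-Type header value."""
    media_type, *params = value.split(";")
    parsed = {}
    for param in params:
        if not param.strip():
            continue
        name, _, val = param.strip().partition("=")
        parsed[name.lower()] = val.strip('"')
    return media_type.strip().lower(), parsed


def decode_body(content_type: str, body: bytes):
    """Decode a JSON body, refusing any other declared type."""
    media_type, params = parse_media_type(content_type)
    if media_type != "application/json":
        # A real server would answer 415 Unsupported Media Type here.
        raise ValueError("unsupported media type: " + media_type)
    return json.loads(body.decode(params.get("charset", "utf-8")))


print(parse_media_type("multipart/form-data; boundary=----X7"))
# ('multipart/form-data', {'boundary': '----X7'})
```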
On the wire, the status line goes first, the headers follow it, and the body comes last, so the receiver reads the headers before the body and knows how to interpret what follows. In Go, w.WriteHeader() commits the status line together with the headers set so far, so once you've called it or started streaming the body, any further header you set is silently dropped: the bytes are already gone. The invariant in every framework is set headers → write status → write body. Forgetting it is the cause of the classic "my header isn't being sent" bug.
package main

import (
	"encoding/json"
	"io"
	"log"
	"net/http"
)

type Note struct {
	ID    int    `json:"id"`
	Title string `json:"title"`
	Done  bool   `json:"done"`
}

func createNote(w http.ResponseWriter, r *http.Request) {
	defer r.Body.Close()
	body, err := io.ReadAll(r.Body) // 1. read request body
	if err != nil {
		http.Error(w, "could not read body", http.StatusBadRequest)
		return
	}
	var in Note
	if err := json.Unmarshal(body, &in); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	in.ID = 42
	w.Header().Set("Content-Type", "application/json") // 2. headers FIRST
	w.WriteHeader(http.StatusCreated)                  // 3. status
	json.NewEncoder(w).Encode(in)                      // 4. body LAST
}

func main() {
	http.HandleFunc("POST /api/v1/notes", createNote) // Go 1.22 method routing
	log.Fatal(http.ListenAndServe(":8080", nil))      // report a failed bind instead of exiting silently
}
from fastapi import FastAPI, Response, status
from pydantic import BaseModel

app = FastAPI()


class Note(BaseModel):
    title: str
    done: bool = False


@app.post("/api/v1/notes", status_code=status.HTTP_201_CREATED)
def create_note(note: Note, response: Response):
    # FastAPI parses + validates the JSON body; a malformed body -> 422 automatically.
    # Its default JSON response already sets Content-Type: application/json,
    # so only extra headers need to be set here, before the body is returned.
    response.headers["Location"] = "/api/v1/notes/42"
    return {"id": 42, **note.model_dump()}

# run: uvicorn main:app --port 8080
Why Headers Exist
Headers are key–value pairs of metadata attached to a request or response, sitting between the start line and the body. The natural question is: why a separate section at all? Why not put this information in the URL, or inside the body?
Separating metadata from payload buys two foundational properties:
- Extensibility: new capabilities are added simply by defining new headers, with zero change to the protocol's structure. Security policies, client hints, compression negotiation, custom X- headers for your own app, tracing IDs: all of these were bolted onto HTTP over the years without breaking older clients, because anything that doesn't recognize a header just ignores it. The protocol grows without versioning churn.
- Remote control: headers let the client send instructions and preferences that steer the server's behavior, and let the server send directives back. The client says "I'd prefer JSON" (Accept), "here's who I am" (Authorization), and "I'm talking to this domain" (Host); the server replies "cache this for an hour" (Cache-Control). The same request body can produce different responses depending purely on its headers: they are the dials and switches of the exchange.
Headers are also case-insensitive in their names and ordering-independent, which is why frameworks normalize them. And because intermediaries read them, headers are where caching, content negotiation, authentication, and security policy are all expressed — the chapters that follow are, in large part, a tour of specific headers and the machinery behind them.
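A quick way to see case-insensitive header names in practice is Python's standard library, where `email.message.Message` (the base class of `http.client.HTTPMessage`) looks names up without regard to case. This is only an illustration of the lookup behavior, not a recommendation for how to store headers in a server.

```python
from email.message import Message

headers = Message()
headers["Content-Type"] = "application/json"
headers["Set-Cookie"] = "a=1"
headers["Set-Cookie"] = "b=2"   # assignment appends; it does not overwrite

print(headers["content-type"])        # application/json
print(headers["CONTENT-TYPE"])        # application/json
print(headers.get_all("set-cookie"))  # ['a=1', 'b=2']
```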
Types of HTTP Headers
The hundreds of defined headers fall into a few functional buckets. Knowing the bucket tells you who sets a header and what it governs.
| Category | Purpose | Examples |
|---|---|---|
| Request | Sent by the client; describe the client, its capabilities, and what it wants | User-Agent, Authorization, Accept, Host, Cookie |
| General | Apply to both requests and responses; metadata about the message itself | Date, Connection, Cache-Control |
| Representation | Describe the body so the other side can interpret and process it | Content-Type, Content-Length, Content-Encoding, ETag |
| Security | Sent by the server to control client (browser) behavior and block attack classes | Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, Set-Cookie |
Security headers — the attack each one stops
Security headers are worth understanding individually, because each is a direct countermeasure to a specific, named attack. They work by instructing the browser to refuse dangerous behavior:
- Strict-Transport-Security (HSTS): tells the browser "for the next N seconds, only ever contact me over HTTPS, even if a link says http://". This defeats protocol-downgrade attacks, where an attacker on the network strips the connection back to plaintext HTTP to read or alter it.
- Content-Security-Policy (CSP): an allow-list of where scripts, styles, images, and other resources may load from (e.g. default-src 'self'). Its main job is to neutralize cross-site scripting (XSS): even if an attacker injects a <script> tag, the browser refuses to execute it unless its source is on the allow-list.
- X-Frame-Options: DENY: forbids your page from being embedded in an <iframe> on another site. This blocks clickjacking, where an attacker overlays your real (invisible) page on top of bait, tricking users into clicking buttons they can't see.
- X-Content-Type-Options: nosniff: stops the browser from guessing ("sniffing") a resource's type when the declared Content-Type looks wrong. Sniffing can be tricked into executing an uploaded image as a script (the classic MIME-sniffing attack), and this header shuts it off.
- Set-Cookie with HttpOnly and Secure: HttpOnly hides the cookie from JavaScript (so a successful XSS can't steal the session), and Secure ensures it's only ever sent over HTTPS (so it can't leak over plaintext).
Because these should apply to every response, you set them once in middleware rather than per-handler — a single chokepoint that wraps the whole application:
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
		h.Set("Content-Security-Policy", "default-src 'self'")
		h.Set("X-Frame-Options", "DENY")
		h.Set("X-Content-Type-Options", "nosniff")
		next.ServeHTTP(w, r) // headers MUST be set before this call (see §3)
	})
}

// http.ListenAndServe(":8080", securityHeaders(mux))
from starlette.middleware.base import BaseHTTPMiddleware


class SecurityHeaders(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        resp = await call_next(request)
        resp.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
        resp.headers["Content-Security-Policy"] = "default-src 'self'"
        resp.headers["X-Frame-Options"] = "DENY"
        resp.headers["X-Content-Type-Options"] = "nosniff"
        return resp


app.add_middleware(SecurityHeaders)
HTTP Methods — The Deep Dive
Methods (a.k.a. verbs) express the semantic intent of a request: they tell the server what kind of action you want, independent of the URL. RFC 9110 defines eight (GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE), and RFC 5789 adds PATCH, for nine in common use. Each carries three independent properties, and getting them right is what separates a correct, retry-safe, cache-friendly API from a buggy one.
The three properties, precisely
- Safe — the request is read-only; issuing it must not change observable server state. The point is that automated agents — search crawlers, link prefetchers, antivirus URL scanners — feel free to fire safe requests at will. If a GET deleted something, a crawler would wreck your data. (Logging the request or incrementing a view counter is tolerated; the resource the client is acting on must not change.)
- Idempotent — the effect of N identical requests equals the effect of one. This is the property that makes a request safe to retry after a network failure: if your client times out unsure whether the request landed, it can re-send without fear of doubling the effect. Crucially, idempotency is about server state, not the response body — two DELETEs may return 200 then 404, but the resulting state ("the resource is gone") is identical, so DELETE is idempotent.
- Cacheable — the response may be stored by a cache (browser, CDN, proxy) and reused to answer a later equivalent request without contacting the origin server, saving a round trip and load.
Every safe method is automatically idempotent — if a request changes nothing, doing it twice still changes nothing. The converse is false: PUT and DELETE are idempotent but not safe, because they do change state — just deterministically, to the same end result no matter how many times you repeat them.
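The relationship between the two properties can be encoded as a small lookup table, which is roughly what a retrying HTTP client needs. This is a sketch: the safe/idempotent values follow the method definitions in RFC 9110 (and RFC 5789 for PATCH), while `MethodProps`, `METHODS`, and `may_auto_retry` are names invented for the example.

```python
from typing import NamedTuple


class MethodProps(NamedTuple):
    safe: bool
    idempotent: bool


METHODS = {
    "GET": MethodProps(safe=True, idempotent=True),
    "HEAD": MethodProps(safe=True, idempotent=True),
    "OPTIONS": MethodProps(safe=True, idempotent=True),
    "TRACE": MethodProps(safe=True, idempotent=True),
    "PUT": MethodProps(safe=False, idempotent=True),
    "DELETE": MethodProps(safe=False, idempotent=True),
    "POST": MethodProps(safe=False, idempotent=False),
    "PATCH": MethodProps(safe=False, idempotent=False),
    "CONNECT": MethodProps(safe=False, idempotent=False),
}


def may_auto_retry(method: str) -> bool:
    """A client may blindly re-send only idempotent requests after a timeout."""
    return METHODS[method.upper()].idempotent


# Every safe method is idempotent; the converse does not hold.
assert all(p.idempotent for p in METHODS.values() if p.safe)
print(may_auto_retry("put"), may_auto_retry("post"))  # True False
```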
The authoritative property table (RFC 9110)
| Method | Safe | Idempotent | Cacheable | Request body | Response body |
|---|---|---|---|---|---|
| GET | Yes | Yes | Yes | No | Yes |
| HEAD | Yes | Yes | Yes | No | No |
| OPTIONS | Yes | Yes | No | No | Yes |
| TRACE | Yes | Yes | No | No | Yes |
| PUT | No | Yes | No | Yes | Optional |
| DELETE | No | Yes | No | Optional | Optional |
| POST | No | No | Conditional* | Yes | Yes |
| PATCH | No | No | Conditional* | Yes | Optional |
| CONNECT | No | No | No | No | Yes |
*POST and PATCH are cacheable only when the response
explicitly includes freshness info and a matching Content-Location — rare in practice, so treat
them as effectively non-cacheable.
GET
"Give me this resource." The workhorse of the read path.
GET should only retrieve data and carries no semantic meaning in a body — most servers,
proxies, and caches ignore or reject a GET body, so any input must go in the URL. Filtering, sorting, and
pagination therefore live in the query string
(/notes?done=false&sort=created&limit=20), never in a body. Because GET is safe and
cacheable, its responses can be stored at every layer — the browser cache, a CDN, a corporate proxy —
which is the engine of web performance. It is also exactly why a state change must never hide
behind a GET: a prefetcher, crawler, or cache warm-up will fire it and trigger the change without a user
ever clicking.
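Since GET input lives in the query string, both sides need to encode and decode it consistently. A minimal sketch using Python's `urllib.parse`; the helper names `build_list_url` and `read_filters` are made up for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_list_url(path: str, **filters) -> str:
    """Client side: append filters to `path` as a percent-encoded query string."""
    return f"{path}?{urlencode(filters)}" if filters else path


def read_filters(url: str) -> dict[str, list[str]]:
    """Server side: recover the filters. Values are lists because keys may repeat."""
    return parse_qs(urlsplit(url).query)


url = build_list_url("/notes", done="false", sort="created", limit=20)
print(url)                # /notes?done=false&sort=created&limit=20
print(read_filters(url))  # {'done': ['false'], 'sort': ['created'], 'limit': ['20']}
```

Everything arrives as a string, so the server is responsible for converting and validating values such as `limit`.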
Backend behavior
- Found → 200 OK with the resource in the body.
- Missing → 404 Not Found.
- Client's cached copy is still valid → 304 Not Modified, empty body (§12).
HEAD
"Give me the metadata you'd send for a GET, but skip the payload."
The server returns the exact same status and headers it would for a GET — including
Content-Length, Content-Type, ETag, and Last-Modified
— but omits the body entirely. This is useful when you want the metadata without paying for the download:
checking whether a resource exists (200 vs 404), reading a file's size before deciding to
fetch it, validating a cached copy's freshness, or running a cheap health check against a URL. Most
frameworks implement HEAD automatically by routing it through the GET handler and discarding the generated
body, which is why you rarely write a separate HEAD handler.
POST
"Here's some data — do something with it that changes state."
POST is the general-purpose state-changing verb: creating records, submitting forms, kicking off background jobs, processing payments, anything that doesn't fit the more specific verbs. Its defining (and dangerous) trait is that it is not idempotent — two identical POSTs create two resources. This is the mechanism behind duplicate orders when a user double-clicks "buy," and behind double-charges when a flaky network makes a client retry a payment it isn't sure went through. The fix is an idempotency key (§7), which is essentially how you bolt idempotency onto a method that lacks it.
Backend behavior
- Created a resource → 201 Created, ideally with a Location header pointing to the new URL so the client knows where it now lives.
- Accepted for asynchronous processing (not finished yet) → 202 Accepted.
- Validation failed → 400 (the request was malformed) or 422 (well-formed but semantically invalid).
- Business-rule clash, e.g. a duplicate username → 409 Conflict.
PUT
"Make the resource at this URL look exactly like this body."
PUT performs a complete replacement: the client supplies the full new representation of
the resource and addresses a specific, known URL (usually one the client itself chose, like
/users/alice/avatar). This is what makes it idempotent — sending the same full body ten times
leaves the resource in the identical final state every time, because each request overwrites rather than
accumulates. PUT can also create the resource if it doesn't yet exist (then return 201
Created) or replace an existing one (then 200 OK or 204 No Content).
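The difference between overwriting and accumulating is easy to see with an in-memory stand-in for a server. This is a sketch only: `NoteStore` and its `post`/`put` methods are hypothetical and model just the state change and status code, not real HTTP handling.

```python
import itertools


class NoteStore:
    """Toy resource store keyed by URL."""

    def __init__(self):
        self.notes: dict[str, dict] = {}
        self._ids = itertools.count(1)

    def post(self, body: dict) -> tuple[int, str]:
        """Server picks the identity, so every call creates a new resource."""
        url = f"/notes/{next(self._ids)}"
        self.notes[url] = dict(body)
        return 201, url

    def put(self, url: str, body: dict) -> tuple[int, str]:
        """Client names the URL, so repeating the call just overwrites it."""
        created = url not in self.notes
        self.notes[url] = dict(body)
        return (201 if created else 200), url


store = NoteStore()
store.post({"title": "buy milk"})
store.post({"title": "buy milk"})          # a retry: now there are two notes
store.put("/notes/99", {"title": "call"})
store.put("/notes/99", {"title": "call"})  # a retry: still exactly one /notes/99
print(len(store.notes))  # 3
```

The two PUT responses differ (201, then 200), but the stored state after one call or two is the same, which is the property that makes the retry harmless.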
Use POST when the server decides the new resource's identity:
POST /notes and the server replies Location: /notes/42. Use
PUT when the client already knows the full target URL:
PUT /users/alice/settings. The practical consequence: PUT-to-create is idempotent
(retry-safe), POST-to-create is not (retry may duplicate). Choosing the right one is choosing whether
retries are safe.
PATCH
"Change just these fields; leave everything else alone."
Where PUT replaces the whole resource, PATCH sends only a set of changes — e.g.
{"done": true} to flip a single field without re-sending the entire object. This is more
efficient and avoids accidentally wiping fields you didn't mean to touch. PATCH is not guaranteed
idempotent by the spec (hence the "No" in the Idempotent column of the table): a patch like "append this item to the list" or "increment the
counter by 1" produces a different result each time it runs. A patch that sets fields to fixed
values usually is idempotent in practice, but you can't rely on it universally. Servers advertise which
patch document formats they accept — such as JSON Merge Patch or the more surgical JSON Patch — via the
Accept-Patch response header.
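As an illustration of one patch format named above, here is a compact sketch of the JSON Merge Patch algorithm as described in RFC 7396: object members are merged recursively, a null value deletes a member, and anything else replaces the target. The function name `merge_patch` is chosen for this example, and the sketch returns a new object rather than mutating its input.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396 semantics) and return the result."""
    if not isinstance(patch, dict):
        return patch                      # non-objects replace the target outright
    result = dict(target) if isinstance(target, dict) else {}
    for name, value in patch.items():
        if value is None:
            result.pop(name, None)        # null means "remove this member"
        else:
            result[name] = merge_patch(result.get(name), value)
    return result


note = {"id": 42, "title": "buy milk", "done": False, "tags": {"shop": "corner"}}
patched = merge_patch(note, {"done": True, "tags": {"shop": None}})
print(patched)  # {'id': 42, 'title': 'buy milk', 'done': True, 'tags': {}}
```

A merge patch that only sets or removes fields gives the same result when applied twice, which is why this style of PATCH tends to be idempotent in practice even though the method as a whole does not promise it.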
PATCH for partial updates, PUT for full replacement. Many developers reach for PUT when they actually mean PATCH (and then have to send every field to avoid nulling them out). Default to PATCH unless you genuinely intend to overwrite the entire resource.
DELETE
"Delete this resource."
DELETE is idempotent for a subtle but important reason: deleting once removes the resource, and deleting again changes nothing because it's already gone. The responses may differ across attempts (the first returns 200/204, a repeat returns 404), but the state of the server — "this resource does not exist" — is identical after one call or ten. That state-equivalence is exactly what idempotency means, which is why a client can safely retry a DELETE that timed out. Return 204 No Content when there's nothing to send back, or 200 with a body describing what was removed.
OPTIONS
"What can I do with this resource?"
OPTIONS asks the server which methods and headers a resource supports without acting on it; the server
answers with an Allow header (e.g. Allow: GET, POST, OPTIONS). Its starring
real-world role is the CORS preflight (§9): before a browser sends a "non-simple"
cross-origin request, it automatically fires an OPTIONS to ask permission, and only proceeds if the
server's response approves the method, headers, and origin. You almost never call OPTIONS by hand, but
you'll see it constantly in the browser's Network tab paired with your real requests.
CONNECT
"Open a raw two-way tunnel to this host through you, proxy."
CONNECT is used almost exclusively by forward proxies handling HTTPS. When a browser
configured to use a proxy wants to reach an HTTPS site, it sends CONNECT example.com:443 to
the proxy, which opens a TCP tunnel to the destination and then blindly relays the encrypted bytes in both
directions. Because the traffic inside is TLS-encrypted end-to-end, the proxy can't read or modify it — it
just shovels bytes. You will essentially never implement CONNECT in an application server; it lives in
proxy and infrastructure software (§18).
TRACE
"Echo my request back so I can see what intermediaries changed."
TRACE asks the server to echo the request it received, so the client can see exactly how proxies and gateways along the path modified the message — a debugging tool for the request chain. In practice it is disabled almost everywhere, because it enabled a class of attack called Cross-Site Tracing (XST) that could expose otherwise-hidden headers (like cookies) to malicious scripts. Most servers and frameworks reject TRACE outright. Know it exists for completeness; you won't use it.
Wiring methods to handlers
mux := http.NewServeMux()
// Go 1.22+ binds method + path. An unmatched method auto-returns 405 Method Not Allowed.
mux.HandleFunc("GET /notes", listNotes)          // safe, cacheable read
mux.HandleFunc("GET /notes/{id}", getNote)
mux.HandleFunc("HEAD /notes/{id}", getNote)      // optional: a GET pattern already matches HEAD; net/http drops the body
mux.HandleFunc("POST /notes", createNote)        // create (server assigns id)
mux.HandleFunc("PUT /notes/{id}", putNote)       // full replace (idempotent)
mux.HandleFunc("PATCH /notes/{id}", patchNote)   // partial update
mux.HandleFunc("DELETE /notes/{id}", deleteNote) // remove (idempotent)

// db and ErrNotFound stand for the application's own storage layer (not shown).
func getNote(w http.ResponseWriter, r *http.Request) {
	id := r.PathValue("id") // built-in path params
	note, err := db.Find(id)
	if errors.Is(err, ErrNotFound) {
		http.NotFound(w, r) // 404
		return
	}
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError) // 500
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(note) // 200
}
from fastapi import FastAPI, HTTPException

app = FastAPI()  # auto-returns 405 for unregistered methods, 422 on bad bodies


@app.get("/notes")  # safe, cacheable read (list)
def list_notes(done: bool | None = None, limit: int = 20): ...


# NOTE: unlike Go's mux, FastAPI may answer HEAD with 405 on a GET-only route
# unless HEAD is registered explicitly (e.g. with @app.head); verify on your version.
@app.get("/notes/{id}")
def get_note(id: int):
    note = db.find(id)  # db stands for the application's own storage layer (not shown)
    if note is None:
        raise HTTPException(404, "not found")  # 404
    return note  # 200


@app.post("/notes", status_code=201)  # create (server assigns id)
def create_note(note: Note): ...


@app.put("/notes/{id}")  # full replace (idempotent)
def put_note(id: int, note: Note): ...


@app.patch("/notes/{id}")  # partial update
def patch_note(id: int, patch: dict): ...


@app.delete("/notes/{id}", status_code=204)  # remove (idempotent)
def delete_note(id: int): ...
Idempotency in Practice
Idempotency is the property that makes a request safe to retry — and retries are not optional in a distributed system, they're inevitable. Networks drop packets, a client times out without knowing whether the request reached the server, a user impatiently double-clicks, a mobile app re-sends when the radio reconnects. GET, PUT, and DELETE are idempotent by design, so retrying them is harmless. POST is not, which is the root cause of duplicate-record and double-charge bugs.
The deeper issue is the "unknown outcome" problem: when a client sends a POST and the connection drops before the response arrives, the client genuinely cannot tell whether the server processed it. Retrying risks a duplicate; not retrying risks losing the operation. Idempotency keys resolve this dilemma cleanly.
The client generates a unique key (typically a UUID) for the logical operation and sends it as an
Idempotency-Key header. The server stores the key alongside the result of the first successful
execution. If the same key ever arrives again — because the client retried — the server recognizes it and
returns the stored result without re-running the work. The operation thus executes at
most once regardless of how many times it's submitted. (MDN now documents
Idempotency-Key as a header undergoing standardization, and payment APIs like Stripe popularized the
pattern.)
Example: the client sends POST /payments with Idempotency-Key: 9f8c-…; the server stores key → result and returns 201.
Implementation considerations
- Store the key durably — the in-memory maps below are for illustration; production uses Redis or a database table so the key survives restarts and is shared across all server instances (remember, any node may receive the retry — §1).
- Give keys a TTL — expire stored keys after, say, 24 hours so the store doesn't grow without bound.
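- Persist before responding: record the result before sending the response, so a crash right after the reply can't leave a completed operation with no stored key. A window between "did the work" and "saved the key" still remains; closing it means recording both in one atomic step where the work allows it, such as a single database transaction.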
var seen sync.Map // key -> []byte result. Use Redis with a TTL in production.

func createPayment(w http.ResponseWriter, r *http.Request) {
	key := r.Header.Get("Idempotency-Key")
	if key == "" {
		http.Error(w, "missing Idempotency-Key", http.StatusBadRequest)
		return
	}
	if cached, ok := seen.Load(key); ok { // replay: return the stored result
		w.WriteHeader(http.StatusOK)
		w.Write(cached.([]byte))
		return
	}
	// NOTE: Load followed by Store is not atomic. Two concurrent requests with the
	// same key can both reach charge(); a production store needs an atomic claim.
	result := charge(r)     // the real, non-idempotent work
	seen.Store(key, result) // remember it BEFORE responding
	w.WriteHeader(http.StatusCreated)
	w.Write(result)
}
from fastapi import Header, HTTPException

seen: dict[str, dict] = {}  # use Redis with a TTL in production

@app.post("/payments", status_code=201)
def create_payment(payload: dict, idempotency_key: str = Header(...)):
    if idempotency_key in seen:  # replay: return the stored result
        return seen[idempotency_key]
    result = charge(payload)  # the real, non-idempotent work
    seen[idempotency_key] = result  # remember it BEFORE responding
    return result
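The TTL and shared-store considerations above can be sketched with a small in-memory stand-in. This is a minimal illustration, not a production design: `IdempotencyStore`, `run_once`, and the injectable `clock` are hypothetical names, and a single lock held around the work serializes every request, which a real Redis- or database-backed store would avoid by claiming the key atomically instead.

```python
import threading
import time
from typing import Callable


class IdempotencyStore:
    """In-memory stand-in for Redis or a database table (illustration only)."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: dict = {}  # key -> (expires_at, result)

    def run_once(self, key: str, work: Callable[[], dict]):
        """Return (result, replayed). `work` runs only if the key is new or expired."""
        with self._lock:  # keeps check-and-store atomic in this sketch
            now = self._clock()
            entry = self._entries.get(key)
            if entry is not None and entry[0] > now:
                return entry[1], True  # replay: stored result, no new work
            result = work()  # the real, non-idempotent work
            self._entries[key] = (now + self._ttl, result)
            return result, False
```

A handler would call `store.run_once(idempotency_key, lambda: charge(payload))` and could use the `replayed` flag to choose between 201 and 200.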
Idempotency keys protect against duplicate execution. A different but related hazard — two clients
overwriting each other's edits (the "lost update") — needs a complementary tool: optimistic concurrency with
If-Match and ETags, covered in §13.
Cookies, Sessions & Authentication
HTTP is stateless (§1), yet every real application needs to know "this request is from the same logged-in user as the last one." The protocol provides no memory, so we rebuild identity on every request by having the client re-send a small credential. There are two dominant architectures for this, but both start from the same primitive: the cookie.
How cookies work mechanically
The entire cookie mechanism is two headers. The server attaches a Set-Cookie header to a
response; the browser stores the value and then automatically includes it in a
Cookie header on every subsequent request to that domain — without any code on the page asking it
to. That automaticity is both the convenience (sessions "just work") and the danger (it's what makes CSRF
possible — see SameSite below).
# response from the server — sets the cookie once
Set-Cookie: session=abc123; HttpOnly; Secure; SameSite=Strict; Max-Age=3600; Path=/
# every later request from the browser — sent automatically
Cookie: session=abc123
The attributes, and why each matters for security
- HttpOnly — the cookie is invisible to JavaScript (document.cookie can't read it). This means even a successful XSS injection can't exfiltrate the session token.
- Secure — the cookie is only ever transmitted over HTTPS, so it can't leak across a plaintext connection an attacker is sniffing.
- SameSite — controls whether the cookie rides along on cross-site requests, the core defense against CSRF (cross-site request forgery). Strict never sends it cross-site; Lax (the modern default) sends it on top-level navigations but not on cross-site sub-requests; None always sends it but then requires Secure.
- Max-Age / Expires — lifetime; without them the cookie is a "session cookie" that dies when the browser closes.
- Domain / Path — scope which hosts and URL paths receive it.
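To see how those attributes end up on the wire, here is a small sketch using Python's standard `http.cookies` module (the `samesite` attribute needs Python 3.8 or later). The cookie name and value are the ones from the example above; nothing here is specific to any web framework.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie header.
cookie = SimpleCookie()
cookie["session"] = "abc123"
morsel = cookie["session"]
morsel["httponly"] = True      # hidden from document.cookie
morsel["secure"] = True        # HTTPS only
morsel["samesite"] = "Strict"  # never sent on cross-site requests
morsel["max-age"] = 3600       # one hour
morsel["path"] = "/"

header_value = morsel.OutputString()
print("Set-Cookie:", header_value)

# Request side: the browser sends back only name=value pairs, no attributes.
incoming = SimpleCookie()
incoming.load("session=abc123")
session_id = incoming["session"].value
```

The attributes travel only in the response; the request's Cookie header carries just the name and value, which is why the server cannot tell from a request how the cookie was scoped.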
Two authentication architectures
The choice between them is essentially "where does the session state live?" — and that choice ripples into scalability and revocation.
| | Session cookies (stateful) | Token / JWT (stateless) |
|---|---|---|
| Where state lives | Server stores the full session; the cookie holds only an opaque ID | The client holds a signed token containing the claims; the server stores nothing |
| Sent via | Cookie header (automatic) | Authorization: Bearer <token> (set explicitly by the client) |
| Verifying a request | Look the session ID up in the store on every request | Verify the cryptographic signature locally — no lookup |
| Horizontal scaling | Needs a shared session store (e.g. Redis) all nodes can reach | Trivial — any node can verify a signature with no shared state |
| Revocation | Easy — just delete the session server-side | Hard — a token is valid until it expires; you need a denylist or short lifetimes + refresh tokens |
A JWT is three base64url parts joined by dots: header.payload.signature. The
header names the algorithm, the payload holds the claims (user ID, expiry), and the signature is computed over
the first two with a secret or private key. Anyone can read a JWT (it's not encrypted, just encoded),
but only the holder of the key can forge a valid signature — which is what lets a server trust it
without a database lookup. The common refinement is a short-lived access token plus a long-lived refresh
token, mitigating the revocation weakness.
On the protocol side, the WWW-Authenticate response header (paired with a 401) tells the client which authentication scheme to use, and
Authorization is the request header that carries the credentials back.
// Set a secure session cookie after login (stateful)
func login(w http.ResponseWriter, r *http.Request) {
	sid := newSessionID()
	sessions.Store(sid, userID) // server-side session store (use Redis)
	http.SetCookie(w, &http.Cookie{
		Name:     "session",
		Value:    sid,
		HttpOnly: true,                    // JS can't read it
		Secure:   true,                    // HTTPS only
		SameSite: http.SameSiteStrictMode, // CSRF defense
		MaxAge:   3600,
		Path:     "/",
	})
}

// Bearer-token auth middleware (stateless)
func requireToken(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		auth := r.Header.Get("Authorization")
		if !strings.HasPrefix(auth, "Bearer ") || !validJWT(auth[7:]) {
			w.Header().Set("WWW-Authenticate", `Bearer`)
			http.Error(w, "unauthorized", http.StatusUnauthorized) // 401
			return
		}
		next.ServeHTTP(w, r)
	})
}
from fastapi import Response, HTTPException, Depends
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

# Set a secure session cookie after login (stateful)
@app.post("/login")
def login(response: Response):
    sid = new_session_id()
    sessions[sid] = user_id  # server-side session store (use Redis)
    response.set_cookie(
        key="session", value=sid,
        httponly=True,      # JS can't read it
        secure=True,        # HTTPS only
        samesite="strict",  # CSRF defense
        max_age=3600, path="/",
    )

# Bearer-token auth dependency (stateless)
bearer = HTTPBearer()

def require_token(cred: HTTPAuthorizationCredentials = Depends(bearer)):
    if not valid_jwt(cred.credentials):
        raise HTTPException(401, "unauthorized",
                            headers={"WWW-Authenticate": "Bearer"})

@app.get("/me")
def me(_=Depends(require_token)):
    return {"ok": True}
CORS & the Preflight Workflow
To understand CORS you first have to understand the rule it relaxes. Browsers enforce the Same-Origin
Policy: JavaScript on a page from one origin may not read responses from a different
origin. An "origin" is the triple of scheme + host + port — https://example.com
and https://example.com:8443 and http://example.com are all different origins. This
policy exists because browsers automatically attach your cookies to requests (§8); without it, a malicious
site you visit could silently fire authenticated requests at your bank's API and read the responses.
CORS (Cross-Origin Resource Sharing) is the controlled mechanism by which a server opts
in to letting specific other origins read its responses.
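The origin comparison itself is mechanical, and a short sketch makes the "triple" concrete. The helper names `origin` and `same_origin` are made up for this illustration; the only assumption beyond the text is that a missing port means the scheme's default (80 for http, 443 for https).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Reduce a URL to its (scheme, host, port) origin triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    # Paths, queries and fragments play no part in the comparison.
    return origin(a) == origin(b)
```

With this, `https://example.com/a` and `https://example.com:443/b` compare equal, while changing the scheme, the port, or even adding a subdomain produces a different origin.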
CORS is enforced entirely by the browser, not the server. The server only
advertises who it permits, via response headers; the browser is what inspects those headers and
decides whether to hand the response back to the page's JavaScript or block it. This is why CORS errors
appear in the browser console but the server log shows a perfectly normal 200 — the request
succeeded, the browser just refused to share the result. It's also why curl, mobile
apps, and server-to-server calls ignore CORS completely: there's no browser to enforce it.
Flow 1 — the simple request
For requests the spec deems "simple" — GET, POST, or HEAD using only CORS-safelisted headers and a form-style content type (application/x-www-form-urlencoded, multipart/form-data, or text/plain) — the browser doesn't bother asking permission in advance. It just sends the request (with an
Origin header) and checks the response afterward.
- The browser adds Origin: https://example.com and sends the request.
- The server includes Access-Control-Allow-Origin echoing that origin (or *) in the response.
Flow 2 — the preflighted request
For anything riskier, the browser sends a preliminary "may I?" request first. It triggers a preflight if any of these conditions holds:
- The method isn't GET / POST / HEAD (so PUT, PATCH, DELETE always preflight).
- The request carries a non-simple header — most commonly Authorization or any custom X- header.
- The Content-Type is outside the simple set — and application/json is outside it.
Since a typical authenticated JSON API request hits all three triggers (JSON body, Authorization
header, often a non-GET method), nearly every real API call is preflighted. That's why your
Network tab shows a paired OPTIONS request immediately before each actual call — it's not a
bug, it's the browser asking permission.
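The three triggers can be written down as a predicate. This is a simplified sketch of the browser's decision, not the full algorithm: `needs_preflight` is a hypothetical name, `headers` stands for the headers set by page script (not the ones the browser adds itself), and the real rules also limit header value lengths and treat a few more headers specially.

```python
SAFE_METHODS = {"GET", "HEAD", "POST"}
SAFE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SAFE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method: str, headers: dict) -> bool:
    """Approximate whether a cross-origin request triggers an OPTIONS preflight."""
    if method.upper() not in SAFE_METHODS:
        return True  # trigger 1: PUT, PATCH, DELETE, ...
    for name, value in headers.items():
        lname = name.lower()
        if lname not in SAFE_HEADERS:
            return True  # trigger 2: Authorization, custom headers, ...
        if lname == "content-type":
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SAFE_CONTENT_TYPES:
                return True  # trigger 3: application/json and friends
    return False
```

A plain form POST passes without a preflight, while `needs_preflight("POST", {"Content-Type": "application/json"})` is true, matching the paired OPTIONS requests seen in the Network tab.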
- The browser sends OPTIONS /resource with Access-Control-Request-Method and -Request-Headers, asking "may I send this method with these headers from my origin?" — no body.
- The server answers with Access-Control-Allow-Origin / -Methods / -Headers, plus -Max-Age telling the browser how long it may cache this approval (e.g. 24h) so it can skip the preflight on future identical calls.
One extra wrinkle for credentialed requests (those carrying cookies): the server must also send Access-Control-Allow-Credentials: true, and in that mode it may not use the * wildcard for the origin — it has to name the exact origin. This prevents a server from accidentally exposing authenticated data to any site.
func cors(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Access-Control-Allow-Origin", "https://example.com") // exact, not * (credentials)
		h.Set("Access-Control-Allow-Methods", "GET, POST, PUT, PATCH, DELETE")
		h.Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
		h.Set("Access-Control-Allow-Credentials", "true")
		h.Set("Access-Control-Max-Age", "86400") // cache the approval for 24h
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent) // 204: answer the preflight and stop
			return
		}
		next.ServeHTTP(w, r)
	})
}
from fastapi.middleware.cors import CORSMiddleware

# Handles the OPTIONS preflight and all the headers for you.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example.com"],  # exact origin when credentials are on
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=["Content-Type", "Authorization"],
    allow_credentials=True,
    max_age=86400,  # cache the approval for 24h
)
Status Codes
A three-digit status code is a standardized, machine-readable verdict on a request. Its value is that the client can decide what to do — retry, redirect, re-authenticate, show an error — by reading one number, without parsing the body or guessing from its shape. Before status codes existed, clients had to inspect response content to infer success or failure, which was fragile and inconsistent. The leading digit defines the class, which alone tells you whose fault a problem is.
| Class | Meaning | Who's responsible |
|---|---|---|
| 1xx | Informational — received, continue | — |
| 2xx | Success | — |
| 3xx | Redirection — more action needed (§11) | — |
| 4xx | Client error — the request was wrong | The caller |
| 5xx | Server error — the server failed to fulfill a valid request | The server |
The codes you'll actually use, and exactly when
| Code | Name | When to send it |
|---|---|---|
| 100 | Continue | Client may send the body; used with Expect: 100-continue so a large upload isn't sent until the server signals it's willing. |
| 101 | Switching Protocols | Upgrade accepted, e.g. to WebSocket (§19). |
| 200 | OK | Generic success; resource returned. |
| 201 | Created | POST/PUT created a resource; include a Location header pointing to it. |
| 202 | Accepted | Accepted for asynchronous processing; the work isn't done yet. |
| 204 | No Content | Success with nothing to return (DELETE, preflight OPTIONS). |
| 206 | Partial Content | A range request was satisfied (§15). |
| 301 | Moved Permanently | Resource has a new URL forever (§11). |
| 302 | Found | Temporary redirect (§11). |
| 304 | Not Modified | Cache validation — the client's copy is still good (§12). |
| 307 | Temporary Redirect | Like 302 but method & body are preserved (§11). |
| 308 | Permanent Redirect | Like 301 but method & body are preserved (§11). |
| 400 | Bad Request | Malformed syntax the server can't parse (broken JSON, bad encoding). |
| 401 | Unauthorized | Missing or invalid credentials → the client must authenticate. |
| 403 | Forbidden | Authenticated, but not permitted to do this. |
| 404 | Not Found | No such resource, or the URL is wrong. |
| 405 | Method Not Allowed | Valid resource, wrong method; reply with an Allow header listing valid ones. |
| 409 | Conflict | Request conflicts with current state (duplicate name, version mismatch). |
| 412 | Precondition Failed | An If-Match/If-Unmodified-Since precondition didn't hold (§13). |
| 413 | Content Too Large | Request body exceeds the server's limit. |
| 415 | Unsupported Media Type | Server doesn't accept the body's Content-Type. |
| 422 | Unprocessable Content | Syntax is fine but the data fails business validation. |
| 429 | Too Many Requests | Rate limit exceeded; include a Retry-After header. |
| 500 | Internal Server Error | An unhandled exception crashed the handler — the generic "we broke". |
| 501 | Not Implemented | The method/feature isn't supported yet. |
| 502 | Bad Gateway | A proxy received an invalid response from the upstream server. |
| 503 | Service Unavailable | Temporarily down/overloaded/maintenance; include Retry-After. |
| 504 | Gateway Timeout | A proxy waited for the upstream and gave up. |
401 Unauthorized actually means unauthenticated — "I don't know who you are; send credentials." 403 Forbidden means authenticated but unauthorized — "I know exactly who you are, and you're still not allowed." A logged-in user hitting an admin-only endpoint gets 403, not 401. Naming the right one lets the client react correctly: 401 → show the login screen; 403 → show "access denied."
400 is for requests the server can't even parse — broken JSON, malformed encoding.
422 is for requests that parse perfectly but are semantically invalid — a well-formed JSON
object whose email field contains a phone number. FastAPI returns 422 for its automatic
validation failures; many APIs use 400 for both, but the distinction helps clients tell "fix your syntax"
from "fix your data."
5xx means the server failed to honor a valid request — so a 500 in your logs is your bug, not the client's. But 502 / 503 / 504 are special: you rarely emit them from application code. They're generated by the proxy or load balancer in front of you (nginx, Envoy, a cloud LB) when it can't reach your app, finds it down, or waits too long. Seeing them points at infrastructure and upstream health, not your handler logic — a vital triage shortcut.
func getUser(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("Authorization") == "" {
		http.Error(w, "login required", http.StatusUnauthorized) // 401: who are you?
		return
	}
	user, err := db.Find(r.PathValue("id"))
	if errors.Is(err, ErrNotFound) {
		http.Error(w, "no such user", http.StatusNotFound) // 404
		return
	}
	if err != nil { // any other lookup failure is the server's problem
		http.Error(w, "internal error", http.StatusInternalServerError) // 500
		return
	}
	if !user.VisibleTo(r) {
		http.Error(w, "forbidden", http.StatusForbidden) // 403: not allowed
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(user) // 200
}
from fastapi import HTTPException, Header

@app.get("/users/{id}")
def get_user(id: int, authorization: str = Header(default="")):
    if not authorization:
        raise HTTPException(401, "login required")  # 401: who are you?
    user = db.find(id)
    if user is None:
        raise HTTPException(404, "no such user")  # 404
    if not user.visible_to(authorization):
        raise HTTPException(403, "forbidden")  # 403: not allowed
    return user  # 200
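Seen from the client, the point of status codes is that one number selects the reaction. The sketch below is one plausible mapping based on the table above; `next_action` and its string labels are invented for illustration, and the retry branch assumes the request is idempotent or carries an Idempotency-Key.

```python
def next_action(status: int) -> str:
    """Map a status code to a coarse client reaction (hypothetical labels)."""
    if status in (429, 502, 503, 504):
        return "retry-with-backoff"   # transient: honor Retry-After if present
    if status == 401:
        return "authenticate"         # unknown caller: show the login screen
    if status == 403:
        return "show-access-denied"   # known caller, not permitted
    if status == 304:
        return "use-cached-copy"
    if 100 <= status < 200:
        return "keep-waiting"
    if 200 <= status < 300:
        return "done"
    if 300 <= status < 400:
        return "follow-location"
    if 400 <= status < 500:
        return "fix-request"          # retrying unchanged would fail again
    if 500 <= status < 600:
        return "report-server-error"
    return "unknown"
```

Specific codes are checked before the class ranges, so a 503 is retried while a generic 500 is surfaced as a server bug.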
Redirections
A redirect lets one resource live at more than one URL: the server responds with a 3xx status
and a Location header naming where to go, and the client (a browser following automatically, or
your HTTP library) re-issues the request to that new URL. Redirects power URL migrations, canonical-URL
enforcement, HTTP→HTTPS upgrades, and the Post/Redirect/Get pattern. The detail that trips up backends is
whether the original method and body survive the redirect — because historically they often
didn't.
| Code | Permanent? | Method preserved? | Use case |
|---|---|---|---|
| 301 Moved Permanently | Yes | Not guaranteed (clients may turn POST→GET) | Resource moved forever — update bookmarks, transfer SEO ranking. |
| 302 Found | No | Not guaranteed (clients may turn POST→GET) | Temporary relocation; keep using the original URL. |
| 303 See Other | No | Forces a GET | Post/Redirect/Get — after a POST, redirect to a result page so a browser refresh won't re-submit the form. |
| 307 Temporary Redirect | No | Yes — method & body kept | Temporary redirect that must re-issue the same POST/PUT with its body intact. |
| 308 Permanent Redirect | Yes | Yes — method & body kept | Permanent move that must preserve a POST/PUT. |
Early clients, faced with a 301 or 302 on a POST, would silently re-issue it as a GET and drop the body — technically a spec violation, but so widespread it became expected behavior. That ambiguity ("will my POST survive this redirect?") is precisely why 307 and 308 were created: they guarantee the method and body are preserved. So if a redirected form submission mysteriously arrives with no data, you used 301/302 where you needed 308/307. Conversely, 303 deliberately forces a GET — which is exactly what you want for Post/Redirect/Get, so the user can safely refresh the result page.
// Permanent move that must keep the method/body -> 308
func oldRoute(w http.ResponseWriter, r *http.Request) {
	http.Redirect(w, r, "/person/"+r.PathValue("id"), http.StatusPermanentRedirect) // 308
}

// Post/Redirect/Get -> 303 so a browser refresh won't re-POST the form
func submitForm(w http.ResponseWriter, r *http.Request) {
	id := save(r)
	http.Redirect(w, r, "/results/"+id, http.StatusSeeOther) // 303 (forces GET)
}
from fastapi.responses import RedirectResponse

# Permanent move that must keep the method/body -> 308
@app.api_route("/user/{id}", methods=["GET", "POST"])
def old_route(id: int):
    return RedirectResponse(f"/person/{id}", status_code=308)

# Post/Redirect/Get -> 303 so a browser refresh won't re-POST the form
@app.post("/submit")
def submit_form():
    rid = save()
    return RedirectResponse(f"/results/{rid}", status_code=303)  # forces GET
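The method-preservation rules in the table can be condensed into one function. This sketch models the behavior browsers commonly implement (POST becomes GET on 301/302, anything but HEAD becomes GET on 303); `redirected_method` is a hypothetical name, and individual HTTP libraries may differ in the 301/302 case.

```python
def redirected_method(status: int, method: str) -> tuple:
    """Return (method for the follow-up request, whether the body is kept)."""
    method = method.upper()
    if status in (307, 308):
        return method, True  # guaranteed: same method, same body
    if status == 303:
        # See Other: always fetch the target with GET (HEAD stays HEAD).
        return ("HEAD" if method == "HEAD" else "GET"), False
    if status in (301, 302):
        if method == "POST":
            return "GET", False  # the historical rewrite that drops the body
        return method, True
    raise ValueError(f"{status} is not a redirect status handled here")
```

For example, `redirected_method(301, "POST")` yields `("GET", False)`, which is exactly the "form data mysteriously missing" symptom, while `redirected_method(308, "POST")` keeps both method and body.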
HTTP Caching
Caching is storing a copy of a response so a later request can be satisfied without re-fetching from the origin — cutting bandwidth, latency, and server load all at once. It's arguably the single biggest lever for web performance. HTTP caching operates on two distinct layers, and conflating them is a common source of confusion:
- Freshness — "Can I reuse my copy without even asking?" If a response is still fresh, the client serves it from cache with zero network calls.
- Validation — "My copy might be stale; has it actually changed?" This is a cheap round trip that returns either "no, reuse yours" or the new content.
Caches also live at multiple tiers: the private browser cache (just for one user), and
shared caches like CDNs and reverse proxies (serving many users). The Cache-Control directives
below decide which tiers may store a response.
Layer 1 — Freshness via Cache-Control
The server stamps a response with how long it stays usable, so the client can reuse it freely until then:
- max-age=3600 — fresh for one hour; reuse with no network until it expires.
- public vs private — may a shared cache (CDN/proxy) store it, or only the user's own browser? Use private for personalized or authenticated responses so a CDN doesn't serve one user's data to another.
- no-cache — counterintuitively, this means "you may store it, but you must revalidate with the origin before each use." It's "always check," not "never cache."
- no-store — the true "never cache anywhere" directive, for sensitive data.
- immutable — "this will never change during its freshness, so don't even revalidate." Perfect for content-hashed static assets (app.4f9a2c.js), where a change produces a new filename anyway.
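How a cache reads these directives can be sketched in a few lines. This is a deliberately reduced model: `parse_cache_control` and `cache_decision` are hypothetical names, the parser splits on commas (so quoted values containing commas would confuse it), and only the directives listed above are considered.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def cache_decision(value: str, age_seconds: int, shared_cache: bool) -> str:
    """Decide what a cache holding a response of the given age may do."""
    d = parse_cache_control(value)
    if "no-store" in d:
        return "do-not-store"
    if shared_cache and "private" in d:
        return "do-not-store"  # CDNs and proxies must not keep it
    if "no-cache" in d:
        return "revalidate"    # stored, but checked before every use
    max_age = d.get("max-age")
    if max_age is not None and max_age.isdigit() and age_seconds < int(max_age):
        return "serve-from-cache"  # still fresh: zero network calls
    return "revalidate"
```

The same header gives different answers per tier: `private, max-age=60` is served from a browser cache for a minute but is never stored by a shared cache.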
Layer 2 — Validation via ETag & Last-Modified
When freshness lapses, the client doesn't blindly re-download — it revalidates. The original response carried
an ETag (a fingerprint/hash of the content) and/or a Last-Modified timestamp. The
client sends these back as conditional headers, and if nothing changed the server returns a tiny
304 with no body, saving the entire payload.
- The first response carries ETag: "abc" + Cache-Control: max-age=10.
- Once that freshness lapses, the client revalidates with If-None-Match: "abc" (and/or If-Modified-Since).
ETags come in two flavors: a strong ETag ("abc") guarantees byte-for-byte identity, while a weak one (W/"abc") only promises semantic equivalence — useful when, say, a timestamp in the response differs but the meaningful content doesn't.
func getResource(w http.ResponseWriter, r *http.Request) {
	body := loadResource()
	sum := sha256.Sum256(body)
	etag := `"` + hex.EncodeToString(sum[:8]) + `"` // fingerprint of the content

	// Set the validators first: a 304 should carry the same ETag and
	// Cache-Control that the 200 would have carried.
	w.Header().Set("ETag", etag)
	w.Header().Set("Cache-Control", "max-age=10")

	if r.Header.Get("If-None-Match") == etag { // client already has this exact version
		w.WriteHeader(http.StatusNotModified) // 304, no body — payload saved
		return
	}
	w.Write(body) // 200 + body
}
import hashlib
from fastapi import Request, Response

@app.get("/resource")
def get_resource(request: Request):
    body = load_resource()
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'  # fingerprint
    headers = {"ETag": etag, "Cache-Control": "max-age=10"}  # sent on both 200 and 304
    if request.headers.get("if-none-match") == etag:  # already cached
        return Response(status_code=304, headers=headers)  # 304, no body — payload saved
    return Response(content=body, status_code=200, headers=headers)
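The handlers above compare the header to the ETag with plain string equality, which misses two cases a client may legitimately send: a comma-separated list of tags and the weak `W/` prefix. A hedged sketch of a more tolerant comparison follows; `if_none_match_matches` is a hypothetical helper, it applies the weak comparison that If-None-Match calls for, and it assumes tags do not themselves contain commas.

```python
def _opaque(tag: str) -> str:
    """Strip the weak prefix so W/"abc" and "abc" compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def if_none_match_matches(header_value: str, current_etag: str) -> bool:
    """True means the client's copy is current, so a GET can answer 304."""
    if header_value.strip() == "*":
        return True  # "*" matches any existing representation
    candidates = [_opaque(t) for t in header_value.split(",") if t.strip()]
    return _opaque(current_etag) in candidates
```

Weak comparison is acceptable here because a 304 only says "your copy is good enough to reuse"; write preconditions such as If-Match require the stricter strong comparison.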
Hand-managing ETags is error-prone — forget to bump one on a change and clients serve stale data
indefinitely. For static assets prefer immutable + content-hashed filenames; for API data,
client-side caches like React Query give finer control. And don't forget Vary: it tells caches
which request headers make the response differ (e.g. Vary: Accept-Encoding). Omit it
and a shared cache may serve a gzip-compressed body to a client that asked for none — or hand a French
response to an English speaker.
Conditional Requests & Optimistic Locking
The same validators that power caching (§12) solve a second, deeper backend problem: the lost update. Picture two editors who both open document v1. Alice saves her changes; Bob, still working from v1, saves a moment later — and Bob's save silently overwrites Alice's, with no error and no warning. Her work simply vanishes. Conditional requests prevent this through optimistic concurrency control: instead of locking the resource while someone edits (pessimistic), you let everyone edit freely and only detect a clash at save time.
The conditional headers
Each says "proceed only if a condition about the resource's current version holds." The naming is symmetric:
If-None-Match for reads (proceed if it changed → I need the new copy), If-Match for
writes (proceed if it's unchanged → safe to overwrite).
| Header | Proceed only if… | Typical use |
|---|---|---|
| If-None-Match | …the ETag does not match (the resource changed) | Caching on GET → returns 304 when unchanged |
| If-Match | …the ETag does match (unchanged since I read it) | Safe writes on PUT/PATCH/DELETE → 412 on conflict |
| If-Modified-Since | …it changed after this timestamp | Caching fallback when there's no ETag |
| If-Unmodified-Since | …it has not changed since this timestamp | Safe writes, timestamp-based |
| If-Range | …the resource is unchanged (else send the whole thing) | Resumable downloads (§15) |
The optimistic-locking flow
- The client sends GET /doc/7 → receives the body plus ETag: "v1".
- The client saves with PUT /doc/7 with If-Match: "v1" — "only save if it's still v1, i.e. nobody changed it under me."
- If the document is still at v1, the server applies the write and returns the new ETag: "v2"; otherwise it answers 412 and nothing is overwritten.
A pessimistic lock would hold a row locked the entire time a user stares at an edit form — potentially minutes — blocking everyone else and risking deadlocks if the user wanders off. Optimistic concurrency holds nothing: it lets all readers proceed and detects conflict only at the instant of writing, cheaply, by comparing a version tag. The cost is that an occasional save fails and must be retried, which is almost always a better trade. This is how collaborative editors, wikis, and well-designed REST APIs prevent silent overwrites.
func updateDoc(w http.ResponseWriter, r *http.Request) {
	doc := db.Get(r.PathValue("id"))
	want := r.Header.Get("If-Match") // the ETag the client last saw
	if want == "" {
		http.Error(w, "If-Match required", http.StatusBadRequest)
		return
	}
	if want != doc.ETag { // someone changed it first -> conflict
		http.Error(w, "version conflict", http.StatusPreconditionFailed) // 412
		return
	}
	// In production the check and the save must be one atomic step
	// (e.g. a conditional UPDATE), or two writers can both pass the check.
	doc.Apply(r.Body)
	doc.ETag = newETag() // bump the version
	db.Save(doc)
	w.Header().Set("ETag", doc.ETag)
	w.WriteHeader(http.StatusOK)
}
from fastapi import Request, Response, HTTPException, Header

@app.put("/doc/{id}")
def update_doc(id: int, request: Request, if_match: str = Header(default="")):
    doc = db.get(id)
    if not if_match:
        raise HTTPException(400, "If-Match required")
    if if_match != doc.etag:  # someone changed it first -> conflict
        raise HTTPException(412, "version conflict")  # 412
    doc.apply(request)
    doc.etag = new_etag()  # bump the version
    db.save(doc)
    return Response(status_code=200, headers={"ETag": doc.etag})
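The Alice-and-Bob scenario can be replayed without any HTTP at all, which makes the mechanism easier to see. In this sketch `DocumentStore` and `VersionConflict` are invented names, the version counter stands in for the ETag, and the lock is what makes the compare-and-write a single atomic step.

```python
import threading


class VersionConflict(Exception):
    """Raised where an HTTP handler would answer 412 Precondition Failed."""


class DocumentStore:
    def __init__(self, body: str):
        self._lock = threading.Lock()
        self._body = body
        self._version = 1

    def _etag(self) -> str:
        return f'"v{self._version}"'

    def get(self):
        with self._lock:
            return self._body, self._etag()

    def put(self, body: str, if_match: str) -> str:
        with self._lock:  # compare and write happen as one atomic step
            if if_match != self._etag():
                raise VersionConflict(f"expected {if_match}, found {self._etag()}")
            self._body = body
            self._version += 1
            return self._etag()


store = DocumentStore("draft")
_, alice_tag = store.get()  # both editors read v1
_, bob_tag = store.get()
store.put("Alice's edit", alice_tag)  # succeeds, document is now v2
try:
    store.put("Bob's edit", bob_tag)  # still holds v1: rejected
    bob_conflicted = False
except VersionConflict:
    bob_conflicted = True  # Bob must re-read, merge, and retry
```

Bob's write fails loudly instead of silently erasing Alice's, and nothing was locked while either of them was editing.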
Content Negotiation & Compression
One URL can have multiple representations — the same article as HTML or JSON, in English or
Spanish, compressed or not. Content negotiation is the mechanism by which client and server agree on the best
one. The dominant form is server-driven (proactive) negotiation: the client lists its
preferences in Accept* request headers, and the server picks the best match it can produce,
signalling its choice in the matching Content-* response header. (There's also
agent-driven negotiation, where the server returns a list of choices for the client to pick from, but
it's rarely used.)
The same three-step pattern (Accept* → server match → Content-*) drives format, language, and compression. Only the header name changes.
| Client says (request) | Negotiates | Server answers (response) |
|---|---|---|
| Accept: application/json | Media type / format | Content-Type |
| Accept-Language: es | Natural language | Content-Language |
| Accept-Encoding: gzip, br | Compression | Content-Encoding |
Preferences can be weighted with quality values from 0 to 1:
Accept: application/json;q=1.0, text/html;q=0.8, */*;q=0.1 reads as "strongly prefer JSON, accept
HTML, and as a last resort anything." The server honors the highest-weighted type it can produce. If it can
satisfy none of the acceptable types, the correct response is 406 Not
Acceptable rather than silently returning the wrong format.
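Quality values are simple to evaluate by hand, as this sketch shows. `parse_accept` and `negotiate` are hypothetical helpers, media-type parameters other than `q` are ignored, and a return value of `None` is where a server would send 406 Not Acceptable.

```python
def parse_accept(header: str) -> list:
    """Return [(media_range, q), ...] from an Accept header value."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


def _matches(media_range: str, candidate: str) -> bool:
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return candidate.split("/")[0] == media_range[:-2]
    return media_range == candidate


def _specificity(media_range: str) -> int:
    return 0 if media_range == "*/*" else 1 if media_range.endswith("/*") else 2


def negotiate(header: str, available: list):
    """Pick the available type with the highest q, or None (-> 406)."""
    prefs = parse_accept(header) or [("*/*", 1.0)]
    best, best_q = None, 0.0
    for candidate in available:
        matching = [(_specificity(r), q) for r, q in prefs
                    if _matches(r, candidate.lower())]
        if not matching:
            continue
        q = max(matching)[1]  # the most specific matching range sets the weight
        if q > best_q:
            best, best_q = candidate, q
    return best
```

With the header from the text, a server that can produce HTML and JSON picks JSON, one that can produce HTML and PNG picks HTML, and a type weighted `q=0` is treated as explicitly refused.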
Compression — the same negotiation, applied to size
Compression is just encoding negotiation. When the client advertises Accept-Encoding: gzip, the
server may compress a text response before sending — and the savings on JSON, HTML, and JS are dramatic, since
they're highly repetitive: a ~26 MB JSON payload can shrink to ~3.8 MB. The server sets
Content-Encoding: gzip, the browser transparently decompresses, and the application code on both
ends is none the wiser. Two rules matter: only compress when the client said it can decode (else you send
garbage), and always pair with Vary: Accept-Encoding so a shared cache doesn't
serve the compressed body to a client that can't handle it (§12).
func handler(w http.ResponseWriter, r *http.Request) {
	body, _ := json.Marshal(bigPayload)
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("Vary", "Accept-Encoding") // tell caches the body varies by encoding
	if strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
		w.Header().Set("Content-Encoding", "gzip")
		gz := gzip.NewWriter(w)
		defer gz.Close()
		gz.Write(body)
		return
	}
	w.Write(body) // uncompressed fallback for clients that can't decode gzip
}
from fastapi.middleware.gzip import GZipMiddleware
from fastapi import Header

# gzip responses for capable clients, but only above a size threshold
# (compressing tiny payloads costs more CPU than it saves bytes).
app.add_middleware(GZipMiddleware, minimum_size=1000)

@app.get("/greeting")
def greeting(accept_language: str = Header(default="en")):
    lang = "es" if accept_language.startswith("es") else "en"
    return {"en": "Hello", "es": "Hola"}[lang]
Range Requests
A range request asks for only a slice of a resource rather than the whole thing. This solves three real problems: resuming a download that failed at 80% without restarting from zero, seeking in audio/video (jumping to the 3-minute mark fetches only the bytes around it), and parallelizing a large download into chunks fetched simultaneously. Without ranges, every interruption means starting over and every seek means downloading everything before the target.
The exchange, step by step
- The server signals it supports ranges by including Accept-Ranges: bytes on a normal response.
- The client requests a slice with the Range header: Range: bytes=0-1023 (the first kilobyte), or bytes=1024- (from byte 1024 to the end — exactly what a resumed download sends).
- The server replies 206 Partial Content with a Content-Range: bytes 1024-50000/50001 header (which slice, of what total) and only those bytes in the body.
- If the requested range is invalid (e.g. starts past the end of the file), the server returns 416 Range Not Satisfiable.
- For a resume, the file might have changed since the partial download began, so the client adds If-Range: "<etag>": the server sends the partial only if the resource is unchanged, otherwise it returns the full 200 so the client restarts cleanly rather than stitching together incompatible halves.
# client resumes a download
GET /big.zip HTTP/1.1
Range: bytes=1024-
# server returns only the remaining slice
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Range: bytes 1024-50000/50001
Content-Length: 48977
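The arithmetic behind that exchange fits in a short function. This sketch handles only a single `bytes` range; `resolve_range` and `content_range` are hypothetical names, a return value of `None` corresponds to 416, and a `ValueError` marks a header a server would simply ignore (serving the full 200 instead).

```python
def resolve_range(header: str, size: int):
    """Return inclusive (start, end) for one byte range, or None if unsatisfiable."""
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, sep, last = spec.strip().partition("-")
    if not sep:
        raise ValueError("malformed range")
    if first == "":  # suffix form: bytes=-500 means the final 500 bytes
        length = int(last)
        if length == 0 or size == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    if start >= size:
        return None  # starts past the end -> 416 Range Not Satisfiable
    end = min(int(last), size - 1) if last else size - 1
    if end < start:
        raise ValueError("malformed range")
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    return f"bytes {start}-{end}/{size}"


start, end = resolve_range("bytes=1024-", 50001)
length = end - start + 1  # both ends are inclusive, hence the +1
```

For the resumed download above this gives the slice 1024 to 50000 of a 50001-byte file, and the inclusive bounds explain why the Content-Length is 48977 rather than 48976.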
The good news for implementers: the standard file-serving helpers in both ecosystems handle all of this —
range parsing, 206, 416, and If-Range — automatically. You rarely hand-roll it.
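// http.ServeContent handles Range, 206, 416 and If-Range for you,
// driven by the file's modtime and an optional ETag.
func download(w http.ResponseWriter, r *http.Request) {
	f, err := os.Open("big.zip")
	if err != nil {
		http.Error(w, "not found", http.StatusNotFound) // 404
		return
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError) // 500
		return
	}
	http.ServeContent(w, r, "big.zip", info.ModTime(), f)
}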
# Starlette's FileResponse honors Range requests and emits 206 in recent
# releases (0.39 and later); older versions always send the full file.
from fastapi.responses import FileResponse

@app.get("/big.zip")
def download():
    return FileResponse("big.zip", media_type="application/zip")
Persistent Connections & Keep-Alive
Establishing a TCP connection isn't free — it costs a full round trip for the handshake (and more if TLS is involved). HTTP/1.0 paid that toll on every single request, opening and tearing down a connection per resource, which made loading a page of dozens of assets painfully slow. HTTP/1.1 fixed this with persistent connections: one TCP connection is held open and reused for many request/response cycles, so the handshake cost is paid once and amortized across everything that follows.
The mechanics and the knobs
- In HTTP/1.1, connections are persistent by default — you get the benefit with zero configuration.
- Connection: keep-alive explicitly requests the connection stay open, and the Keep-Alive header can attach parameters: a timeout (how long to hold the connection idle before closing) and a max (how many requests to serve before closing it).
- Connection: close asks to close the connection after this response — the HTTP/1.0 default, still settable in 1.1 when you want a one-shot exchange.
- Pipelining — sending several requests back-to-back without waiting for each response — was specified in 1.1, but it suffered head-of-line blocking (a slow first response stalls the rest on that connection) and was so buggy across implementations that it's effectively dead. HTTP/2 multiplexing (§2) is its proper replacement.
You rarely set these headers by hand — the defaults are right for almost everything. Persistent connections become a real concern when you're tuning server timeouts: hold connections too long and you leak idle ones and exhaust the connection pool under load; close them too eagerly and you lose the latency benefit. The timeouts below also double as a defense — a too-generous read timeout lets a Slowloris attacker tie up connections by sending headers one byte at a time.
srv := &http.Server{
	Addr:              ":8080",
	Handler:           mux,
	ReadTimeout:       5 * time.Second,
	WriteTimeout:      10 * time.Second,
	IdleTimeout:       60 * time.Second, // how long to hold an idle keep-alive conn
	ReadHeaderTimeout: 2 * time.Second,  // mitigates Slowloris (slow-header attacks)
}
// srv.SetKeepAlivesEnabled(false) // force Connection: close if ever needed
log.Fatal(srv.ListenAndServe()) // blocks until the server stops, then reports why
# Keep-alive is owned by the ASGI server (uvicorn), not your app code:
# uvicorn main:app --timeout-keep-alive 60 --workers 4
import uvicorn
uvicorn.run("main:app", host="0.0.0.0", port=8080,
timeout_keep_alive=60, workers=4)
Multipart Uploads & Streamed Responses
JSON is a poor fit for binary data — you'd have to base64-encode it, inflating size by ~33% and forcing the whole thing into memory. Large transfers therefore use purpose-built mechanisms, and which one you reach for depends on the direction of the large data.
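The ~33% figure follows from how base64 works: every 3 input bytes become 4 ASCII characters. A quick check, using an arbitrary 3000-byte payload chosen only for illustration:

```python
import base64

raw = bytes(3000)                # 3000 bytes of stand-in binary data
encoded = base64.b64encode(raw)  # 3 input bytes -> 4 output characters

overhead = len(encoded) / len(raw) - 1
print(len(raw), len(encoded), f"{overhead:.0%}")  # 3000 4000 33%
```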
Large uploads (client → server) → multipart/form-data
The client splits the request body into discrete parts — each can be a form field or a file
— separated by a unique delimiter string declared in the boundary parameter of the
Content-Type. The boundary appears before each part and once more, with a trailing
--, to mark the very end. This lets the server stream-parse the body, peeling off each part (with
its own Content-Disposition naming it and optional per-part Content-Type) without
loading everything into memory at once. The unique boundary is what makes it safe to embed arbitrary binary
bytes — the parser scans for the delimiter rather than relying on length alone.
POST /upload HTTP/1.1
Content-Type: multipart/form-data; boundary=----X7gQ2

------X7gQ2
Content-Disposition: form-data; name="file"; filename="cat.png"
Content-Type: image/png

(binary image bytes)
------X7gQ2--        # closing boundary marks the end
Large downloads (server → client) → streaming
To send a large or open-ended response without buffering it all in memory (or timing out while you build it),
the server streams it in pieces. Two related techniques:
Transfer-Encoding: chunked sends the body as a series of self-describing chunks when the total
size isn't known up front (so there's no Content-Length); and
Content-Type: text/event-stream implements Server-Sent Events, a long-lived
response over which the server pushes labelled data: events as they occur. The connection is held
open with keep-alive, and the client appends each chunk as it arrives — ideal for progress updates, log tails,
or token-by-token LLM output.
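The "self-describing chunks" are simple to see on the wire: each chunk is its length in hexadecimal, a CRLF, the data, and another CRLF, with a zero-length chunk ending the body. The sketch below is a simplified illustration (it ignores trailers and does no error handling); the function names are hypothetical, and in practice your server or HTTP library does this framing for you.

```python
def encode_chunked(chunks) -> bytes:
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the terminator
        out += f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk: end of body
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the payload from a complete chunked body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]  # drop any chunk extensions
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


wire = encode_chunked([b"Hello, ", b"world"])
print(wire)
print(decode_chunked(wire))
```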
// 1. Receive a multipart upload
func upload(w http.ResponseWriter, r *http.Request) {
	// Up to 32 MB is held in memory; anything beyond that spills to temp files on disk.
	if err := r.ParseMultipartForm(32 << 20); err != nil {
		http.Error(w, "bad multipart body", http.StatusBadRequest)
		return
	}
	file, header, err := r.FormFile("file")
	if err != nil {
		http.Error(w, "no file", http.StatusBadRequest)
		return
	}
	defer file.Close()
	fmt.Fprintf(w, "received %s", header.Filename)
}

// 2. Stream a response in chunks (Server-Sent Events)
func stream(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher) // Flush() pushes each chunk to the client immediately
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Connection", "keep-alive")
	for i := 0; i < 5; i++ {
		fmt.Fprintf(w, "data: chunk %d\n\n", i)
		flusher.Flush()
		time.Sleep(time.Second)
	}
}
import asyncio
from fastapi import UploadFile
from fastapi.responses import StreamingResponse

# 1. Receive a multipart upload
@app.post("/upload")
async def upload(file: UploadFile):
    # UploadFile spools to a temp file under the hood; read it in chunks
    # so a large upload is never held in memory all at once.
    size = 0
    while chunk := await file.read(1024 * 1024):
        size += len(chunk)
    return {"received": file.filename, "size": size}

# 2. Stream a response in chunks (Server-Sent Events)
@app.get("/stream")
async def stream():
    async def gen():
        for i in range(5):
            yield f"data: chunk {i}\n\n"
            await asyncio.sleep(1)
    return StreamingResponse(gen(), media_type="text/event-stream")
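On the receiving side, the event-stream format is line-oriented: fields such as data: and event: accumulate until a blank line dispatches the event. The following is a simplified sketch of that parsing, assuming the whole stream is already in a string; it skips id: and retry: handling, and parse_sse is a hypothetical name. Browsers do this for you via EventSource.

```python
def parse_sse(stream: str) -> list[dict[str, str]]:
    """Split a text/event-stream payload into events (simplified)."""
    events = []
    event, data = "message", []  # "message" is the default event type
    for line in stream.splitlines():
        if line == "":
            if data:  # a blank line dispatches the accumulated event
                events.append({"event": event, "data": "\n".join(data)})
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a heartbeat
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if field == "data":
                data.append(value)
            elif field == "event":
                event = value
    return events


print(parse_sse("data: chunk 0\n\ndata: chunk 1\n\n"))
```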
Sending big data up → multipart/form-data. Receiving big data
down → chunked transfer or text/event-stream with keep-alive. Partial
/ resumable transfers → range requests (§15). Two-way realtime →
WebSockets (§19).
Proxies, Tunneling & Forwarded Headers
In production your application almost never faces the internet directly. It sits behind a reverse proxy or load balancer — nginx, Envoy, HAProxy, a cloud load balancer — that terminates TLS, distributes traffic across instances, caches, and routes. This is excellent for operations but it quietly changes what your handler perceives about the request, and not accounting for that causes a recurring family of bugs.
This is why the X-Forwarded-* headers below exist, and why trusting them blindly is dangerous.
Forward vs reverse proxy: opposite ends
- A forward proxy sits in front of clients and makes requests on their behalf: think corporate egress gateways, content filters, or anonymizers. The destination server sees the proxy, not the user. HTTPS through a forward proxy uses the CONNECT method (§6) to open an opaque encrypted tunnel.
- A reverse proxy sits in front of servers; clients believe they're talking directly to it. It fronts your backend, handling TLS termination, load balancing, caching, compression, and path-based routing. This is the one your code lives behind, and a CDN is essentially a globally distributed reverse-proxy cache.
The consequence: you lose the real client
Once a reverse proxy forwards a request to your app, the TCP connection your handler sees originates from the
proxy's IP and (typically) over plain HTTP, because TLS was terminated at the edge. So
r.RemoteAddr is the proxy, and the scheme looks like http even though the user is on https. To
preserve the originals, proxies inject forwarding headers:
| Header | Carries |
|---|---|
| X-Forwarded-For | The original client IP, plus the chain of proxies it passed through. |
| X-Forwarded-Proto | The original scheme — https — even though the proxy→app hop is plain http. |
| X-Forwarded-Host | The Host the client originally requested. |
| Forwarded | The standardized single header (RFC 7239) combining the above. |
| Via | The proxies/gateways a message traversed, for loop detection and tracing. |
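For a feel of the standardized Forwarded header from the table, here is a deliberately simplified parser. It assumes no commas or semicolons appear inside quoted values, which the full RFC 7239 grammar does allow, and parse_forwarded is a hypothetical name. The addresses are documentation-range examples.

```python
def parse_forwarded(value: str) -> list[dict[str, str]]:
    """Parse a Forwarded header into one dict per hop (simplified)."""
    hops = []
    for element in value.split(","):      # one element per proxy hop
        hop = {}
        for pair in element.split(";"):   # for=, proto=, host=, by=
            name, sep, raw = pair.strip().partition("=")
            if sep:
                hop[name.lower()] = raw.strip().strip('"')
        if hop:
            hops.append(hop)
    return hops


print(parse_forwarded("for=192.0.2.60;proto=https;host=example.com, for=198.51.100.17"))
```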
Spoofing: X-Forwarded-For is just a header a client can set, so if you use it for rate-limiting, geolocation, or audit logs, you must only trust it when the request arrived from a known, trusted proxy IP. Otherwise an attacker can forge any IP they like.
Wrong-scheme URLs: if you build absolute or redirect URLs from the local connection's scheme (which is http behind the proxy) instead of honoring X-Forwarded-Proto, you'll generate http:// links on an HTTPS site, causing mixed-content warnings and redirect loops. Most frameworks ship "trusted proxy" middleware that rewrites RemoteAddr and scheme correctly from these headers; enable it rather than reading the headers raw.
Protocol Upgrade & WebSockets
HTTP's request/response, client-initiated shape is a poor fit for workloads that need the server to push data the moment something happens — live notifications, chat, multiplayer game state, collaborative cursors, trading tickers. Historically these were faked with polling (ask repeatedly) or long-polling (hold a request open). HTTP/1.1's Upgrade mechanism offers a clean exit: an established HTTP connection can switch protocols mid-stream, most famously into a WebSocket.
The WebSocket handshake
What's elegant is that it starts as ordinary HTTP, so it traverses the same ports and proxies, then upgrades:
- The client sends a normal GET carrying Upgrade: websocket, Connection: Upgrade, a random Sec-WebSocket-Key, and a version.
- The server agrees with 101 Switching Protocols and a Sec-WebSocket-Accept computed from the client's key (proving it understood the handshake, not just echoed it).
- From that point the TCP connection is no longer HTTP: it's a persistent, full-duplex WebSocket carrying lightweight message frames. Either side can send at any time, with no per-message HTTP overhead and no polling.
GET /ws HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
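The Sec-WebSocket-Accept value in the exchange above is not arbitrary. Per RFC 6455, the server appends a fixed GUID to the client's key, takes the SHA-1 digest, and base64-encodes it. A short sketch (the function name websocket_accept is hypothetical) reproduces the value from the example:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

SHA-1 here is not a security measure; it only shows that the peer actually speaks WebSocket rather than being a cache or plain HTTP server echoing headers back.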
Don't reach for WebSockets reflexively. For one-way server→client streams — notifications, progress bars, log tails, streamed LLM tokens — Server-Sent Events (§17) are simpler: they're plain HTTP, auto-reconnect, and need no special protocol. Use WebSockets only when you genuinely need bidirectional, low-latency messaging (chat, games, collaborative editing). And for ordinary request/response APIs, neither — plain HTTP with good caching is the right answer.
SSL, TLS & HTTPS
This is the layer that turns HTTP into HTTPS. You rarely implement it yourself, but you must understand what it guarantees and where it runs.
- SSL (Secure Sockets Layer) — the original protocol for securing client–server communication. All its versions are now obsolete and disabled due to discovered vulnerabilities; the name persists only colloquially.
- TLS (Transport Layer Security) — SSL's modern successor. It does two things: encrypts data in transit so an eavesdropper can't read it and a tamperer can't silently alter it, and uses certificates to authenticate that the server really is who it claims to be. The current recommended version is TLS 1.3, which is both faster and stripped of legacy weak ciphers.
- HTTPS — simply HTTP carried inside a TLS connection. Identical protocol, identical semantics; everything in this manual applies unchanged, just with encryption underneath.
What the handshake actually achieves
Without drowning in cryptography, the TLS handshake establishes three things before any HTTP flows:
- Authentication: the server presents a certificate signed by a Certificate Authority (CA) the client already trusts, cryptographically proving it owns example.com. This is what stops you connecting to an impostor. (Certificates chain up to a small set of root CAs baked into your OS/browser.)
- Key exchange: using asymmetric (public-key) cryptography, both sides agree on a shared symmetric session key without ever transmitting it in the clear. Asymmetric crypto is slow, so it's used only to bootstrap; the fast symmetric key encrypts the actual traffic.
- Encryption + integrity: from then on every byte is encrypted and carries a tamper-evidence check, so an attacker can neither read nor modify the conversation undetected.
A practical detail worth knowing: SNI (Server Name Indication) is the part of the handshake
where the client states which hostname it wants, so one IP serving many HTTPS sites can present the correct
certificate — the TLS-layer analogue of the Host header.
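Seen from the client side, these guarantees map onto a handful of settings in Python's standard ssl module. The sketch below is illustrative only: the function names are hypothetical, the TLS 1.2 floor is an assumed policy choice, and negotiated_tls needs network access to a real HTTPS host, so it is defined but not called here.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that verifies the server certificate and hostname."""
    ctx = ssl.create_default_context()            # loads the system's trusted root CAs
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy: refuse older protocol versions
    return ctx


def negotiated_tls(host: str, port: int = 443) -> tuple[str, str]:
    """Connect and report the negotiated protocol version and cipher name."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # server_hostname is sent as SNI and is also the name the
        # certificate is checked against.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


ctx = make_client_context()
print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_REQUIRED)
```

If certificate verification or the hostname check fails, wrap_socket raises an ssl.SSLError subclass before any HTTP bytes are exchanged, which is the authentication step doing its job.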
You almost never implement TLS in application code. It's terminated at the edge — by your hosting
platform, load balancer, or reverse proxy (nginx, Caddy, a cloud LB), frequently with certificates
auto-provisioned and renewed via Let's Encrypt. Your app then receives plain HTTP behind that boundary. This
is exactly why X-Forwarded-Proto (§18) matters — it's how your app learns the user was on HTTPS
— and why you set Strict-Transport-Security (§5) at the edge to force HTTPS in the first place.
// Serve HTTPS directly with a certificate + private-key pair.
err := http.ListenAndServeTLS(":443", "server.crt", "server.key", mux)
if err != nil {
	log.Fatal(err)
}
// Production note: usually you terminate TLS at a proxy/LB and run plain HTTP behind it.
# uvicorn can serve TLS directly with a cert + key pair:
# uvicorn main:app --port 443 --ssl-certfile server.crt --ssl-keyfile server.key
import uvicorn
uvicorn.run("main:app", host="0.0.0.0", port=443,
ssl_certfile="server.crt", ssl_keyfile="server.key")
# Production note: usually terminate TLS at a proxy/LB and run plain HTTP behind it.
Debugging Cheat-Sheet
The payoff of understanding the machinery is fast triage: map a symptom straight to the subsystem and the headers that govern it.
| Symptom | Check |
|---|---|
| CORS error in the console (but server logs show 200) | Compare Origin vs Access-Control-Allow-Origin; verify the preflight OPTIONS returns the right Access-Control-Allow-Methods / -Headers; for cookies, check Access-Control-Allow-Credentials and that the allowed origin isn't *. (§9) |
| Stale data being served | Inspect Cache-Control, ETag/If-None-Match, Last-Modified; is the ETag bumped on every change? Is Vary set correctly? (§12) |
| Lost updates / silent overwrites | Add optimistic locking: require If-Match with the ETag, reject mismatches with 412. (§13) |
| Duplicate records on retry | POST is non-idempotent — require an Idempotency-Key and dedupe on it. (§7) |
| Redirected POST loses its body | You used 301/302 (clients may switch to GET); use 307/308 to preserve method & body. (§11) |
| Wrong format / language returned | Compare Accept, Accept-Language, Accept-Encoding against the server's Content-*; consider 406. (§14) |
| 401 vs 403 confusion | 401 = not authenticated (send credentials); 403 = authenticated but not permitted. (§10) |
| 400 vs 422 | 400 = malformed syntax; 422 = valid syntax, failed business validation. (§10) |
| 502 / 503 / 504 | Infrastructure, not your handler — upstream unreachable / down / slow. Check the proxy and upstream health. (§10, §18) |
| A header you set isn't sent | You set it after writing the status or body. Always set headers first. (§3) |
| Real client IP shows as the proxy | Read X-Forwarded-For — but only when the request came from a trusted proxy. (§18) |
| HTTPS site generates http:// links | Build absolute URLs from X-Forwarded-Proto, not the local (proxied) scheme. (§18) |
| Connections exhausted under load | Tune keep-alive idle timeout and connection limits; watch for Slowloris via read-header timeout. (§16) |
Headers worth memorizing, grouped by job
- Negotiation: Accept, Accept-Language, Accept-Encoding ↔ Content-Type, Content-Language, Content-Encoding
- Representation: Content-Type, Content-Length, Content-Range
- Identity / auth: Authorization, WWW-Authenticate, Cookie, Set-Cookie, Host, User-Agent
- Caching / conditional: Cache-Control, ETag, Last-Modified, If-None-Match, If-Match, If-Modified-Since, Vary
- CORS: Origin, Access-Control-Allow-Origin / -Methods / -Headers / -Credentials / -Max-Age
- Security: Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, X-Content-Type-Options
- Proxy: X-Forwarded-For / -Proto / -Host, Forwarded, Via
- Connection / transfer: Connection, Keep-Alive, Upgrade, Transfer-Encoding, Range, Accept-Ranges
The whole flow in one breath: a client opens a connection — reusable (§16), usually TLS-encrypted (§20), often through a reverse proxy (§18) — and sends a self-contained request (§1): a method with safe/idempotent/cacheable semantics (§6), a URL, headers (§4–5), and an optional body (§3). The stateless server re-establishes identity (§8), checks authorization and preconditions (§10, §13), negotiates the best representation (§14), consults caches (§12), performs the work, and returns a status-coded response (§10) — possibly a redirect (§11), a partial range (§15), or a stream (§17). Internalize that loop and the headers that steer each step, and you can reason about and debug the overwhelming majority of backend HTTP you'll ever meet.
Grounded in MDN (developer.mozilla.org/en-US/docs/Web/HTTP) & RFC 9110 · Go 1.22+ / Python 3.11+ examples.