Content Delivery Network

Geographically distributed network of edge servers that caches and serves content from locations physically close to end users, reducing latency and offloading origin server traffic.

When NOT to Use

  • Highly personalised responses that cannot be cached — every user's response is unique (e.g., authenticated dashboards with personalised data); CDN hit rate approaches 0%; the added hop latency through the CDN degrades performance without any caching benefit
  • Internal services with no geographic users — services accessed only within a corporate network or single datacenter have no geographic distribution need; CDN routing overhead (DNS resolution, anycast routing) is pure cost with no latency benefit
  • Rapidly mutating data (sub-second update frequency) — TTL-based caching will always be stale; the invalidation churn (purge API calls on every update) exceeds the benefit; serve this content directly from origin
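The hit-rate argument in the list above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a simple two-hop latency model; the function name and all millisecond figures are hypothetical and chosen only for illustration.

```python
def expected_latency_ms(hit_rate: float, edge_ms: float, edge_to_origin_ms: float) -> float:
    """Mean latency through a CDN under a simple model (an assumption):
    every request pays the user-to-edge cost, and misses additionally
    pay the edge-to-origin fetch."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return edge_ms + (1.0 - hit_rate) * edge_to_origin_ms


# Hypothetical figures: user->edge 20 ms, edge->origin 90 ms,
# user->origin directly (no CDN) 100 ms.
DIRECT_MS = 100.0

uncacheable = expected_latency_ms(0.0, 20.0, 90.0)   # personalised dashboard
well_cached = expected_latency_ms(0.95, 20.0, 90.0)  # static assets

print(f"0% hit rate:  {uncacheable:.1f} ms via CDN vs {DIRECT_MS:.1f} ms direct")
print(f"95% hit rate: {well_cached:.1f} ms via CDN vs {DIRECT_MS:.1f} ms direct")
```

Under these assumed numbers, a 0% hit rate makes the CDN path slower than going direct, while a high hit rate makes it several times faster.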

Core Mechanism

Request routing and edge caching flow:

User → DNS/Anycast → Nearest Edge PoP
         ↓ (cache hit)
         Return cached content                     [fast path — no origin traffic]
         ↓ (cache miss, no origin shielding)
         Edge PoP → Origin Server → cache + return [cold start or expired content]
         ↓ (cache miss, origin shielding enabled)
         Edge PoP → Shield PoP → Origin → cache at both → return

A user request is routed to the nearest CDN edge Point of Presence (PoP) via DNS-based or anycast routing. On a cache hit, the edge serves the cached response directly. On a cache miss, the edge fetches from origin (or a shield PoP if origin shielding is enabled), caches the response according to the configured TTL and cache-control headers, and returns it to the user.
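The hit/miss flow described above can be sketched as a small in-memory cache. This is an illustrative model rather than any CDN's actual implementation: the `EdgeCache` class, the `fetch_upstream` callback, and the injected clock are all hypothetical names, and the TTL is assumed to come back from the upstream alongside the body.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedResponse:
    body: bytes
    expires_at: float  # clock value after which the entry is stale


class EdgeCache:
    """Toy model of one edge PoP: serve from cache, else fetch upstream."""

    def __init__(self, fetch_upstream: Callable[[str], Tuple[bytes, float]],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_upstream = fetch_upstream  # returns (body, ttl_seconds)
        self._clock = clock
        self._entries: Dict[str, CachedResponse] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry.expires_at > now:
            return entry.body, "HIT"           # fast path, no upstream traffic
        body, ttl = self._fetch_upstream(url)  # cold start or expired entry
        if ttl > 0:
            self._entries[url] = CachedResponse(body, now + ttl)
        return body, "MISS"


# Usage with a fake origin and a controllable clock.
origin_calls = []

def origin(url: str) -> Tuple[bytes, float]:
    origin_calls.append(url)
    return b"<html>home</html>", 60.0  # cache for 60 seconds

now = [0.0]
edge = EdgeCache(origin, clock=lambda: now[0])
first = edge.get("/index.html")   # cold cache
second = edge.get("/index.html")  # served from the edge
now[0] = 61.0                     # TTL has elapsed
third = edge.get("/index.html")   # refetched from origin
print(first[1], second[1], third[1], len(origin_calls))
```

With origin shielding enabled, `fetch_upstream` would point at a shield PoP's cache instead of the origin itself.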

Component Diagram

Content-Delivery-Network-diagram.excalidraw

Key Variants

Pull CDN vs Push CDN

| Model | Content reaches the CDN by | Origin traffic | Cache population | Best for |
| --- | --- | --- | --- | --- |
| Pull (origin pull) | CDN fetching from origin on the first request miss | On every cache miss | Lazy (on first user request) | Dynamic content, infrequently updated assets, developer simplicity |
| Push | Operator explicitly publishing to the CDN at deploy time | None after push (zero origin traffic) | Eager (before user requests arrive) | Large static assets (video, software downloads), predictable content, expensive origin bandwidth |

Pull CDN is the default for most web applications: simpler to operate, no explicit push step, but the first user per PoP experiences full origin latency (cold start miss). Suitable for HTML, CSS, JavaScript, API responses with cache-control headers.

Push CDN requires the operator to push content to all edge nodes at publish time. No cold start; no origin traffic. Suitable for binary assets (video files, OS images, large software downloads) where origin bandwidth is expensive and content is known before user requests arrive.
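The difference in origin traffic between the two models can be illustrated with a toy simulation. This is a sketch under simplifying assumptions (no TTLs, one object, three PoPs); the `Cdn` class and the PoP names are hypothetical.

```python
from typing import Dict, List


class Cdn:
    """Toy CDN with one in-memory cache per PoP and no expiry."""

    def __init__(self, pops: List[str], origin: Dict[str, bytes]):
        self.origin = origin
        self.caches: Dict[str, Dict[str, bytes]] = {pop: {} for pop in pops}
        self.origin_fetches = 0

    def push(self, path: str, body: bytes) -> None:
        """Push model: the operator publishes to every PoP at deploy time."""
        for cache in self.caches.values():
            cache[path] = body

    def request(self, pop: str, path: str) -> bytes:
        """Pull model: a PoP fetches from origin the first time it misses."""
        cache = self.caches[pop]
        if path not in cache:
            self.origin_fetches += 1
            cache[path] = self.origin[path]
        return cache[path]


pops = ["fra", "iad", "sin"]

# Pull: each PoP pays one cold-start fetch.
pull = Cdn(pops, origin={"/video.mp4": b"video-bytes"})
for pop in pops:
    pull.request(pop, "/video.mp4")
pull_fetches = pull.origin_fetches

# Push: content is at every PoP before the first request arrives.
push = Cdn(pops, origin={})
push.push("/video.mp4", b"video-bytes")
for pop in pops:
    push.request(pop, "/video.mp4")
push_fetches = push.origin_fetches

print(pull_fetches, push_fetches)
```

In this model the pull CDN makes one origin fetch per PoP, while the push CDN makes none, which mirrors the cold-start tradeoff described above.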

Design Decisions

Cache Invalidation Strategies

How to keep cached content fresh, either by letting it expire on its TTL or by invalidating (purging) it before the TTL runs out:

  • TTL-based expiry: simplest strategy; set Cache-Control: max-age=N on origin responses; CDN edges serve cached content until TTL expires. Stale content is possible during the TTL window after a change. Use for content that changes on a known schedule or where brief staleness is acceptable.

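  • Versioned URLs: embed a content hash in the URL (e.g., /app.a3f9c2.js, /styles.b7d1e4.css); cached entries never need invalidation because the URL changes with every content change, so they can be served with a very long TTL (e.g., Cache-Control: max-age=31536000, immutable). No stale-asset risk, provided the HTML that references the hashed URLs is itself kept fresh. Requires an asset pipeline that generates hashed URLs at build time. The gold standard for static assets.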

  • Purge API: most CDNs provide an API to immediately invalidate a specific URL or URL pattern across all edge nodes globally. Latency for global propagation is typically seconds to a few minutes. Use for emergency invalidation (security incident, incorrect content published). Cannot be used for every content update at high frequency — API rate limits apply and propagation is not instantaneous.

  • Surrogate keys (cache tags): tag cached responses with logical keys at origin (e.g., Surrogate-Key: product-123 category-electronics); send a single purge request for the logical key to invalidate all cached responses carrying that tag simultaneously. Powerful for content with complex dependencies. Not universally supported: Cloudflare's Cache-Tag header, Varnish xkey, and Fastly surrogate keys are among the main implementations.
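The surrogate-key strategy above relies on a reverse index from logical key to cached URLs. The following is a minimal sketch of that idea, not any vendor's implementation: `TaggedCache` and its methods are hypothetical names, and the header value is assumed to be a space-separated list of keys as in the example above.

```python
from collections import defaultdict
from typing import Dict, Set


class TaggedCache:
    """Toy cache that can purge every entry carrying a given surrogate key."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}
        self._urls_by_key: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key_header: str) -> None:
        self._bodies[url] = body
        for key in surrogate_key_header.split():
            self._urls_by_key[key].add(url)

    def purge_key(self, key: str) -> int:
        """Drop every cached URL tagged with `key`; return how many were removed."""
        urls = self._urls_by_key.pop(key, set())
        purged = 0
        for url in urls:
            # A URL may already be gone via another key; count real removals only.
            if self._bodies.pop(url, None) is not None:
                purged += 1
        return purged

    def has(self, url: str) -> bool:
        return url in self._bodies


cache = TaggedCache()
cache.store("/products/123", b"tv", "product-123 category-electronics")
cache.store("/categories/electronics", b"listing", "category-electronics")
cache.store("/products/456", b"spade", "product-456 category-garden")

# One purge request invalidates both pages that depend on the category.
purged = cache.purge_key("category-electronics")
print(purged, cache.has("/products/123"), cache.has("/products/456"))
```

A single logical purge removes the product page and the category listing together, while unrelated entries stay cached.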

Origin Shielding

Without shielding, every edge PoP independently fetches from origin on a cache miss. With hundreds of edge PoPs, a high-traffic cache miss event results in hundreds of parallel requests to origin.

Origin shielding adds an intermediate CDN tier (shield PoP) between edge nodes and origin:

Edge PoP (N) → Shield PoP (1 per region) → Origin

Edge cache misses are forwarded to the shield PoP. If the shield PoP has the content cached, it serves the edge without reaching origin. Only shield PoP misses reach origin. This collapses parallel cache misses from multiple edge nodes into a single request to origin, dramatically reducing origin load.

Tradeoff: origin shielding adds one extra network hop (edge → shield → origin) on cold cache misses. Worth it when origin is under-provisioned or when CDN traffic volume is high enough that unshielded miss storms would overwhelm origin.
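The effect of the shield tier on origin load can be sketched by counting origin fetches with and without it. This is a simplified, sequential model with hypothetical helper names; it does not model TTLs or the coalescing of concurrent in-flight misses.

```python
from typing import Callable, Dict

Fetch = Callable[[str], bytes]


def caching_tier(upstream: Fetch) -> Fetch:
    """Wrap an upstream fetcher with a simple in-memory cache (no TTL, for brevity)."""
    cache: Dict[str, bytes] = {}

    def fetch(url: str) -> bytes:
        if url not in cache:
            cache[url] = upstream(url)  # miss: go one tier up
        return cache[url]

    return fetch


def count_origin_fetches(edge_count: int, shielded: bool) -> int:
    """Every edge PoP requests the same cold URL once; count requests reaching origin."""
    origin_hits = 0

    def origin(url: str) -> bytes:
        nonlocal origin_hits
        origin_hits += 1
        return b"payload for " + url.encode()

    # With shielding, all edges share one intermediate cache in front of origin.
    upstream = caching_tier(origin) if shielded else origin
    edges = [caching_tier(upstream) for _ in range(edge_count)]
    for edge in edges:
        edge("/popular.js")
    return origin_hits


unshielded = count_origin_fetches(200, shielded=False)
shielded = count_origin_fetches(200, shielded=True)
print(unshielded, shielded)
```

In this model 200 cold edges produce 200 origin requests without a shield and a single origin request with one, at the cost of the extra hop on each edge miss.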

Pitfalls

Edge computing cold start latency

Compute at edge nodes (Cloudflare Workers, Lambda@Edge, Fastly Compute) reduces round-trip latency for personalisation, authentication checks, A/B testing decisions, and request manipulation. Constraint: the edge execution environment is limited — no persistent state, restricted runtime, memory limits, and CPU time limits. Tradeoff: cold start latency (first invocation of an edge function on a given PoP) vs reduced origin round-trip. Do not move complex stateful logic to edge; the execution constraints will bite in production.

Cache invalidation propagation delay

Purge API calls do not propagate instantaneously to all edge nodes globally. During the propagation window (seconds to minutes), edge nodes continue to serve the stale version. For content correctness requirements that cannot tolerate any stale window, versioned URLs are the only safe approach — they avoid invalidation entirely by deploying new URLs for every content change. The Purge API is appropriate for emergency cases, not routine deployment workflows.
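Versioned URLs sidestep the propagation window because a changed file is published under a new name. A build step could derive such names roughly as follows; this is a sketch, with `versioned_path` a hypothetical helper and the choice of SHA-256 truncated to 8 hex characters an assumption.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the file extension: /app.js -> /app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = versioned_path("/static/app.js", b"console.log('v1');")
new_url = versioned_path("/static/app.js", b"console.log('v2');")
same_url = versioned_path("/static/app.js", b"console.log('v1');")

# Different content yields a different URL, so no edge can hold a stale copy
# of the new URL; identical content maps back to the same cacheable URL.
print(old_url != new_url, old_url == same_url)
```

Edges that still hold the old URL keep serving it only to pages that reference the old name, so no purge is needed for the new release.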

Existing Pattern Connections

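  • Consistent-Hashing — cache-node selection inside a PoP (which server holds a given URL) and edge-to-shield routing often use consistent hashing for request distribution; PoP selection itself is handled by DNS-based or anycast routing, as described under Core Mechanism; the same ring mechanics from Phase 28 apply at the CDN routing layer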
  • Distributed-Cache — a CDN is a geographically distributed cache specialised for static and semi-static content served to end users; Distributed-Cache covers in-datacenter cache tiers for application data; they operate at different layers of the same latency-reduction strategy
  • CAP-Theorem — CDN edge nodes serve eventually consistent content; the tradeoff between serving only fresh content (consistency) and continuing to serve cached content when origin is unreachable (availability) maps directly to the CP vs AP choice; CAP-Theorem provides the formal framework for reasoning about this tradeoff