News Feed Design

System that aggregates, ranks, and delivers a personalised timeline of posts from followed accounts, handling celebrity fan-out and cache layering at scale.

Clarify First

Before designing, lock these assumptions with the interviewer:

  1. Scale target? — 500M DAU social platform (Twitter/Facebook scale) vs 10M DAU niche network; drives sharding, fan-out worker count, and cache tier sizing.
  2. Feed ranking? — Reverse chronological (simpler; no ML pipeline) vs algorithmic relevance (ranking model required; adds latency); most production systems use algorithmic.
  3. Content types? — Text only vs text + images + video + reshares; media storage and CDN requirements diverge significantly.
  4. Celebrity threshold? — Define the follower count above which fan-out-on-write becomes impractical (typically 10K–10M); below this threshold, fan-out-on-write is the default path.
  5. Consistency requirement? — Is eventual consistency acceptable for feed freshness? A post may take seconds to appear in all followers' feeds — confirm this tradeoff with the interviewer.

Capacity Estimation

Derivation chain for a large-scale social platform (2026):

Assumption: 500M DAU (Twitter/Facebook scale, 2026 estimate)

Write path:
  average_posts_per_DAU_per_day = 0.1 (most users are readers, not posters)
  daily_posts = 500M x 0.1 = 50M posts/day
  write_QPS = 50M / 86,400 ~ 580 write QPS

Fan-out amplification (regular users):
  average_followers = 200 (median, not mean -- power law distribution)
  fan_out_writes = 580 x 200 = 116,000 cache write QPS
  -> Manageable; pre-built timeline cache is effective for regular users

Celebrity edge case:
  celebrity_followers = 10M
  celebrity_post_fan_out = 1 post x 10M writes = 10M cache writes per post
  at 1 post/hour: 2,778 fan-out writes/sec from ONE celebrity
  -> Breaks fan-out-on-write; requires pull path for celebrities

Cross-reference: Capacity-Estimation for the shared DAU-to-QPS-to-storage methodology.

Conclusion: Regular fan-out (116K cache write QPS) is manageable with a horizontal fan-out worker fleet. Celebrity fan-out (10M writes per post) is not — it requires a separate read path.
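
The derivation chain above can be reproduced as a short calculation. This is a minimal sketch under the stated assumptions; the function names are illustrative, and the unrounded results differ slightly from the rounded figures in the text (about 579 vs 580 write QPS).

```python
SECONDS_PER_DAY = 86_400


def regular_fan_out(dau, posts_per_user_per_day, median_followers):
    """Return (write_qps, fan_out_qps) for the regular-user push path."""
    daily_posts = dau * posts_per_user_per_day
    write_qps = daily_posts / SECONDS_PER_DAY
    fan_out_qps = write_qps * median_followers
    return write_qps, fan_out_qps


def celebrity_writes_per_second(followers, window_seconds):
    """Cache writes per second needed to deliver one post within the window."""
    return followers / window_seconds


# Inputs taken from the assumptions in this section.
write_qps, fan_out_qps = regular_fan_out(500_000_000, 0.1, 200)
hourly_average = celebrity_writes_per_second(10_000_000, 3_600)

print(f"write QPS: {write_qps:,.0f}")
print(f"regular fan-out cache write QPS: {fan_out_qps:,.0f}")
print(f"one celebrity, averaged over an hour: {hourly_average:,.0f} writes/sec")
```

Changing `window_seconds` shows how quickly the required write rate grows when the same 10M writes must land within a short freshness window instead of being averaged over an hour.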

Central Technical Problem

The celebrity problem: fan-out-on-write generates O(followers) write amplification.

For a user with 10M+ followers, a single post triggers 10M cache write operations. Averaged over an hour at 1 post per hour, that is about 2,778 cache writes per second from one account, but the writes are not spread evenly: they arrive as a burst at post time, and delivering them within 10 seconds would take roughly 1M writes per second, several times the 116,000 steady-state fan-out QPS of all regular users combined. In a system with hundreds of celebrity accounts, these bursts can saturate the cache tier, causing write storms and unacceptable p99 latency across the entire platform.

Three approaches address this:

Approach 1: Fan-out-on-write (push model)

At post time, write the post to every follower's feed cache. Timeline reads are O(1) — the pre-built feed is served directly from cache.

  • Write cost: O(followers) per post
  • Read cost: O(1) per timeline read
  • Breaks at: 10M+ followers — 10M cache writes per post creates a write storm
  • Suitable for: Regular users with < 10K followers where write amplification is bounded
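
A minimal in-memory sketch of the push model, assuming plain dictionaries stand in for the follower graph and the feed cache. `FEED_LIMIT`, `follow`, `publish`, and `read_feed` are hypothetical names for illustration, not part of any real system described here.

```python
from collections import defaultdict, deque

FEED_LIMIT = 800  # assumed cap on cached entries per user

followers = defaultdict(set)  # author_id -> set of follower ids
# user_id -> newest-first post ids; the oldest entry is dropped when full
feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def publish(author_id, post_id):
    """Push the post into every follower's cached feed: O(followers) writes."""
    writes = 0
    for follower_id in followers[author_id]:
        feed_cache[follower_id].appendleft(post_id)
        writes += 1
    return writes


def read_feed(user_id, limit=20):
    """Serve the pre-built feed directly from the cache."""
    return list(feed_cache[user_id])[:limit]
```

The return value of `publish` makes the write amplification visible: it equals the author's follower count, which is the quantity that becomes unmanageable for celebrity accounts.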

Approach 2: Fan-out-on-read (pull model)

At read time, fetch posts from all accounts the viewer follows and merge them into a timeline. No write amplification; read latency is the cost.

  • Write cost: O(1) per post (write to poster's own post store only)
  • Read cost: O(following_count) per timeline read — must query each followed account's post store and merge results
  • Breaks at: High following counts (e.g., following 5,000 accounts generates 5,000 queries on every timeline open)
  • Suitable for: Networks where following counts are small and bounded, since read cost scales with the number of accounts the viewer follows rather than with follower counts
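
A minimal sketch of the pull model, assuming each author's posts are already stored newest first as `(created_at, post_id)` tuples. The `post_store` and `following` dictionaries are stand-ins for real stores and their contents are made up for illustration.

```python
import heapq
from itertools import islice

# author_id -> posts as (created_at, post_id), sorted newest first
post_store = {
    "alice": [(105, "a2"), (100, "a1")],
    "bob": [(103, "b1")],
}
# viewer_id -> accounts the viewer follows
following = {"carol": ["alice", "bob"]}


def read_feed(viewer_id, limit=20):
    """Query every followed author's posts and k-way merge them by recency."""
    per_author = [post_store.get(a, []) for a in following.get(viewer_id, [])]
    merged = heapq.merge(*per_author, reverse=True)  # lazy, newest first
    return [post_id for _, post_id in islice(merged, limit)]
```

The list comprehension that builds `per_author` is the O(following_count) cost: one lookup per followed account on every timeline open.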

Approach 3: Hybrid strategy (production resolution)

Fan-out-on-write for regular users (below threshold, typically 10K followers). Fan-out-on-read for celebrity accounts (above threshold). Timeline delivery merges the pre-built cache feed with a real-time pull from a celebrity post store at read time.

  • Regular user posts: Fanned out via worker queue to all follower feed caches — see Message-Queue for async fan-out mechanics
  • Celebrity posts: Written only to a celebrity post store (sharded by celebrity user_id)
  • Timeline read: Serve the pre-built feed cache and merge in recent posts from the N celebrities the viewer follows; N is bounded because most users follow < 10 celebrities
  • Why this works: Celebrity follow counts are high (many followers), but viewer celebrity follows are low (few celebrities per viewer) — merging at read time is cheap for the viewer; write amplification is eliminated for the celebrity
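
The read-time merge can be sketched as follows. This is an illustration under assumptions: the 10K threshold is the starting point named in the text, the use of `>=` at the boundary is arbitrary, and `read_timeline` with its parameters is a hypothetical interface.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # followers; starting point suggested in the text


def is_celebrity(follower_count, threshold=CELEBRITY_THRESHOLD):
    """Accounts at or above the threshold take the pull path."""
    return follower_count >= threshold


def read_timeline(cached_feed, celebrity_posts, followed_celebrities, limit=20):
    """Merge the pre-built feed with posts pulled from followed celebrities.

    Every input list holds (created_at, post_id) tuples sorted newest first.
    """
    sources = [cached_feed]
    sources += [celebrity_posts.get(c, []) for c in followed_celebrities]
    merged = heapq.merge(*sources, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

The number of merged sources is one plus the count of followed celebrities, so the read cost stays small as long as viewers follow few celebrity accounts.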

The timeline cache is a CQRS read model — post events project into per-user feed tables. See CQRS-Pattern for the read model pattern. Feed cache partitions across nodes using Consistent-Hashing to avoid hotspot routing. Post storage is sharded by user_id — see Database-Sharding for sharding strategy.
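
To illustrate how feed cache keys might map to cache nodes, here is a minimal consistent-hash ring. It is a sketch, not the scheme any particular cache uses: the node names, the virtual-node count, and the choice of SHA-256 are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        """Return the first node clockwise from the key's position."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-0", "cache-1", "cache-2"])
print(ring.node_for("feed:42"))
```

Adding a node only inserts new points on the ring, so a key either stays on its current node or moves to the new one; the other nodes never exchange keys among themselves.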

Feed update propagation across the fan-out worker fleet follows an event-driven choreography — see Choreography-Saga-Pattern for the event propagation model. Cache layer details (cache-aside strategy, TTL alignment) are covered in Distributed-Cache.

Component Design

[Client] --> [Load Balancer] --> [API Server] --> [Feed Cache (per user)]
                                      |                 |
                                      v                 v
                               [Message Queue]    [Post Store (sharded by user_id)]
                                      |
                                      v
                               [Fan-out Workers] --> [Feed Cache (regular users)]
                                                           |
                               [Celebrity Post Store] <----+
                               (queried at read time)

Component responsibilities:

  • Load Balancer — distributes read and write traffic across stateless API server replicas
  • API Server — handles post creation (write path) and feed fetch (read path); stateless; scales horizontally
  • Message Queue — receives post-created events; fan-out workers subscribe and write to follower feed caches; decouples post write latency from fan-out latency
  • Fan-out Workers: consume post events; write to each follower's feed cache entry; skip posts authored by celebrity accounts (above threshold), which are served from the celebrity post store at read time
  • Feed Cache (per user) — pre-built per-user timeline; O(1) read; key: feed:{user_id}; TTL aligned with feed retention window
  • Post Store (sharded) — durable post storage sharded by user_id; read by fan-out workers; celebrity post store is a separate read path queried at timeline read time
  • Celebrity Post Store — dedicated store for posts from celebrity accounts; queried at read time and merged into the timeline response
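
A minimal sketch of the fan-out worker loop, assuming an in-process `queue.Queue` stands in for the message queue and a `None` sentinel signals shutdown. The event shape (`author_id`, `post_id`) and the function name are hypothetical; the sketch also assumes the celebrity post store is written elsewhere on the write path.

```python
import queue
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 10_000  # followers; assumed value from the text


def run_fan_out_worker(events, followers_of, feed_cache,
                       threshold=CELEBRITY_THRESHOLD):
    """Drain post-created events until a None sentinel; return cache writes."""
    writes = 0
    while True:
        event = events.get()
        if event is None:  # shutdown signal
            break
        audience = followers_of.get(event["author_id"], ())
        if len(audience) >= threshold:
            continue  # celebrity author: handled by the pull path at read time
        for follower_id in audience:
            feed_cache[follower_id].appendleft(event["post_id"])
            writes += 1
    return writes
```

Because the worker only reads from the queue, the API server can acknowledge a post as soon as the event is enqueued, which is the decoupling of post write latency from fan-out latency described above.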

System Diagram

News-Feed-Design-diagram.excalidraw

Alternatives Considered

| Decision | Alternative A | Alternative B | Why Hybrid Wins |
|---|---|---|---|
| Fan-out strategy | Fan-out-on-write (all users) | Fan-out-on-read (all users) | Write-only fails at 10M+ followers; read-only fails at high following counts; hybrid bounds both costs |
| Cache strategy for timeline | Cache-aside (read path only) | Write-through (populate at post time) | Write-through is fan-out-on-write; cache-aside is fan-out-on-read; hybrid uses both based on account type |
| Feed consistency | Strong consistency (synchronous fan-out) | Eventual consistency (async fan-out via queue) | Synchronous fan-out blocks the post write path; async fan-out decouples write latency from fan-out latency; eventual consistency is acceptable for feed freshness |
| Celebrity threshold | 10K followers | 1M followers | Lower threshold protects the cache tier more aggressively; higher threshold reduces read-time merge cost; 10K is a common production starting point |

The feed system is an AP system under CAP-Theorem — it prioritises write availability and partition tolerance over strong consistency. Users may see slightly stale feeds during network partitions; this is acceptable for social content.

Likely Follow-Up Questions

  1. How do you handle a viewer unfollowing a celebrity? When a viewer unfollows a celebrity, the viewer's followed-celebrity set is updated, and the next timeline read omits that celebrity's posts from the merge step. No feed cache invalidation is needed, because celebrity posts are never written to the viewer's feed cache.
  2. How do you rank posts (chronological vs ML relevance)? — Chronological ranking is a sorted merge on created_at. Algorithmic ranking requires a ranking service that scores each candidate post against a viewer affinity model; feed cache stores candidate sets, and the ranking service applies scores at read time.
  3. What happens when the cache layer fails (cache stampede)? — Cache miss on all feed requests simultaneously saturates the post store. Mitigation: mutex-based cache warming (only one request rebuilds a cold cache entry; others wait), probabilistic early expiration, or a separate feed builder that proactively warms caches.
  4. How do you support reshares and comment aggregation? — Reshares are new post events with a reshare_of reference; they fan out like regular posts. Comment counts are aggregated as counters on the post record; heavy comment threads may use a separate comment service.
  5. How would you extend to support stories (ephemeral content with TTL)? — Stories have a 24-hour TTL; feed cache entries for stories use a shorter TTL aligned with story expiry; story fan-out follows the same push/pull hybrid as regular posts.
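  6. How do you handle feed pagination? Cursor-based pagination: the client sends back the cursor it received with the previous page; the cursor encodes the position of the last seen post (created_at, with post_id as a tie-breaker); the server returns the next N posts that sort strictly before that position; the cursor is an opaque string so that clients cannot depend on post ID implementation details.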
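
The cursor scheme from the last question can be sketched as below. The encoding (base64 over a JSON pair of `created_at` and `post_id`) is an assumption chosen for illustration, and `get_page` filters an in-memory list where a real system would issue a range query.

```python
import base64
import json


def encode_cursor(created_at, post_id):
    """Pack the last seen position into an opaque URL-safe string."""
    raw = json.dumps([created_at, post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    created_at, post_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, post_id


def get_page(feed, cursor=None, limit=2):
    """Return (page, next_cursor); feed is (created_at, post_id), newest first."""
    if cursor is not None:
        position = decode_cursor(cursor)
        # Keep only posts that sort strictly before the last seen position.
        feed = [item for item in feed if item < position]
    page = feed[:limit]
    next_cursor = encode_cursor(*page[-1]) if page else None
    return page, next_cursor
```

Including `post_id` in the cursor keeps pagination stable when several posts share the same `created_at`; a cursor built from the timestamp alone would skip or repeat such posts.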

Existing Pattern Connections

| Design Decision | Existing Pattern | Relationship |
|---|---|---|
| Feed cache as pre-built timeline | CQRS-Pattern | Feed cache is a CQRS read model; post events project into per-user feed tables via fan-out workers |
| Async fan-out via worker queue | Message-Queue | Post events enqueued; fan-out workers consume and write to follower caches; decouples post write from fan-out latency |
| Celebrity post merged at read time | Observer-Pattern | Celebrity account is an observable; followers pull on demand rather than being pushed; hybrid of push (regular) and pull (celebrity) |
| Feed pagination with cursor | CQRS-Pattern | Read model pagination: cursor points to last seen post_id; query projects forward from that position |
| Eventual consistency on feed updates | CAP-Theorem | Feed system chooses AP: users may see slightly stale feeds; write availability and partition tolerance prioritised over strong consistency |