Capacity Estimation

Back-of-envelope estimation methodology for deriving QPS, storage, and bandwidth requirements from user-level assumptions — the first step of every system design interview.

When NOT to Use

Not a production capacity planning tool: use load testing and real traffic data for production sizing; back-of-envelope estimates cannot substitute for profiling under real traffic
Estimates are order-of-magnitude: an error of up to 10x is acceptable in interviews; precise per-component sizing requires infrastructure telemetry and benchmarking
Do not carry capacity numbers from this note verbatim into case study notes — derive numbers from DAU for each specific system; the latency reference table and power-of-two table are shared, but QPS and storage numbers are system-specific

Core Mechanism

DAU-to-QPS Derivation Chain

The fundamental formula for translating a user-level assumption into a server-level throughput target:

QPS = DAU * avg_requests_per_user / 86_400

Where:

DAU — Daily Active Users (the user-level anchor; get this number first)
avg_requests_per_user — average API calls per user per day across all actions (read, write, search, like, etc.)
86_400 — seconds in a day (60 * 60 * 24); the constant denominator for any per-day-to-per-second conversion
Peak QPS multiplier: typically 2x–3x average; systems must handle bursts, not just averages
Read/write ratio heuristics: typical systems are 10:1 to 100:1 read-heavy; the ratio drives architectural decisions (read replicas, CQRS read model, cache tier sizing)

Worked example: social media platform with 300M DAU (2024 estimate)

// Step 1: requests per user per day
avg_requests = 10
// Reasoning: read timeline 3x + post 0.5x + like 3x + search 3.5x
// = 10 actions/day; this is a judgment call, not a fact — state assumptions

// Step 2: total requests per day
requests_per_day = 300M * 10 = 3B
// = 3,000,000,000 requests per day across all users

// Step 3: average QPS
avg_QPS = 3B / 86_400 ≈ 35,000 QPS
// = 3,000,000,000 / 86,400 = 34,722; round to ~35,000 QPS
// Use this as the baseline; do not skip showing the division

// Step 4: peak QPS (assume 2x average for a social feed; 3x for live events)
peak_QPS = 35,000 * 2 = 70,000 QPS
// The multiplier accounts for primetime traffic bursts; state the multiplier explicitly

// Step 5: read/write split (typical: 90% reads for a social timeline)
read_QPS  = 70,000 * 0.9 = 63,000 QPS
write_QPS = 70,000 * 0.1 = 7,000 QPS
// The 90/10 split motivates separate read and write scaling paths
// → read_QPS drives cache sizing and read replica count
// → write_QPS drives write throughput requirements and leader node sizing
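The derivation chain above can be sketched as a small Python function. This is a minimal sketch, not a sizing tool: the function name `estimate_qps` and its parameter names are hypothetical, and the default multiplier and read fraction are the same assumptions stated in the worked example.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86_400


def estimate_qps(dau, avg_requests_per_user, peak_multiplier=2.0, read_fraction=0.9):
    """Derive average, peak, and read/write QPS from user-level assumptions.

    peak_multiplier and read_fraction are assumptions to state out loud,
    not measured values.
    """
    requests_per_day = dau * avg_requests_per_user
    avg_qps = requests_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_multiplier
    return {
        "requests_per_day": requests_per_day,
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "read_qps": peak_qps * read_fraction,
        "write_qps": peak_qps * (1 - read_fraction),
    }


# Same inputs as the worked example: 300M DAU, 10 requests per user per day.
estimate = estimate_qps(dau=300_000_000, avg_requests_per_user=10)
for name, value in estimate.items():
    print(f"{name}: {value:,.0f}")
```

The code does not round intermediate values, so it reports a peak of about 69,444 QPS where the worked example shows 70,000 (the example rounds the average to 35,000 before applying the multiplier). At back-of-envelope precision the two are equivalent.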

Bare numbers without reasoning chain

Never state "12K QPS" without showing the derivation: "10M DAU * 100 req/day / 86,400s = 11,574 QPS, round to ~12K QPS." Bare numbers cannot be audited or adjusted when assumptions change. In a system design interview, showing the derivation demonstrates that you understand the estimation methodology, not just the answer.

Storage and Bandwidth Formulas

Storage formula:

total_storage = storage_per_record * records_per_day * retention_days

Where retention_days is a product requirement (e.g., 5 years = 1,825 days) and storage_per_record is the average serialised size of one entity including metadata.

Bandwidth formula:

bandwidth = throughput * message_size

Where throughput is the write QPS (for ingress bandwidth) or the read QPS (for egress bandwidth) and message_size is the average payload size: the request payload for ingress, the response payload for egress.

Worked example:

// If each record is 1 KB, 100K writes/day, 5-year retention:
total_storage = 1 KB * 100,000 * 1,825 days
             = 1,000 bytes * 100,000 * 1,825
             = 182,500,000,000 bytes
             ≈ 182 GB
// Round to ~200 GB accounting for metadata and index overhead (10–20%)
// At this scale a single PostgreSQL primary is sufficient; sharding is premature
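Both formulas can be checked with a few lines of Python. The sketch below reuses the storage inputs from the worked example. The bandwidth inputs are assumptions chosen for illustration only: the 63,000 read QPS comes from the earlier QPS example, and the 1 KB response size is hypothetical. The function names are also hypothetical.

```python
def total_storage_bytes(bytes_per_record, records_per_day, retention_days):
    """total_storage = storage_per_record * records_per_day * retention_days"""
    return bytes_per_record * records_per_day * retention_days


def bandwidth_bytes_per_sec(qps, message_bytes):
    """bandwidth = throughput * message_size"""
    return qps * message_bytes


# Storage: 1 KB records, 100K writes/day, 5-year retention (from the example above).
raw_storage = total_storage_bytes(1_000, 100_000, 5 * 365)
# Assumed 10% metadata and index overhead (low end of the 10-20% range).
storage_with_overhead = raw_storage * 1.10

# Egress: assumed 63,000 read QPS and an assumed 1 KB average response.
egress = bandwidth_bytes_per_sec(63_000, 1_000)

print(f"raw storage:   {raw_storage / 1e9:.1f} GB")
print(f"with overhead: {storage_with_overhead / 1e9:.1f} GB")
print(f"egress:        {egress / 1e6:.0f} MB/s ({egress * 8 / 1e6:.0f} Mbit/s)")
```

Keeping bytes as the base unit and converting only when printing avoids mixing KB, MB, and bits mid-calculation, which is a common source of 8x or 1000x slips.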

Reference Tables

Power-of-Two Table

These values are the common currency of back-of-envelope storage estimates. Memorize all quantities in this table; an interviewer expects instant recall.

Power	Approximate Value	Full Name	Short Name
2^10	1 Thousand	1 Kilobyte	1 KB
2^20	1 Million	1 Megabyte	1 MB
2^30	1 Billion	1 Gigabyte	1 GB
2^40	1 Trillion	1 Terabyte	1 TB
2^50	1 Quadrillion	1 Petabyte	1 PB
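The "Approximate Value" column treats each power of two as a power of ten. The short Python sketch below shows how far apart the two are at each row. The variable names are illustrative.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB"]

# Percentage by which 2^(10*i) exceeds 10^(3*i) for each unit.
errors = {}
for i, unit in enumerate(UNITS, start=1):
    exact = 2 ** (10 * i)    # binary value, e.g. 2^10 = 1024
    approx = 10 ** (3 * i)   # decimal approximation, e.g. 10^3 = 1000
    errors[unit] = (exact - approx) / approx * 100
    print(f"2^{10 * i:<2} = {exact:>25,}  ~ 10^{3 * i:<2}  (+{errors[unit]:.1f}%)")
```

The gap grows from about 2.4% at the kilobyte row to roughly 12.6% at the petabyte row, well inside the order-of-magnitude tolerance of these estimates.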

Latency Numbers to Know (2024 reference)

Use these to reason about what a system "can" and "cannot" do within a given SLA. The order-of-magnitude differences (ns vs ms) are the critical insight: memory access is 6 orders of magnitude faster than a cross-datacenter round trip.

Operation	Latency
L1 cache reference	0.5 ns
L2 cache reference	7 ns
Main memory reference	100 ns
SSD random read (4KB)	100 microseconds
Network round trip (same DC)	500 microseconds
Disk seek	10 ms
Network round trip (cross-DC)	150 ms
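One way to use the table is to convert every entry to nanoseconds and compare ratios or count how many sequential operations fit in a latency budget. The Python sketch below does both. The 100 ms budget is an assumed SLA for illustration, and the helper names are hypothetical.

```python
import math

# Latency table above, converted to nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "L2 cache reference": 7,
    "Main memory reference": 100,
    "SSD random read (4KB)": 100_000,
    "Network round trip (same DC)": 500_000,
    "Disk seek": 10_000_000,
    "Network round trip (cross-DC)": 150_000_000,
}


def orders_of_magnitude(slow, fast):
    """log10 of the latency ratio between two operations in the table."""
    return math.log10(LATENCY_NS[slow] / LATENCY_NS[fast])


def max_sequential_ops(budget_ms, operation):
    """How many back-to-back operations fit in a latency budget."""
    budget_ns = budget_ms * 1_000_000
    return int(budget_ns // LATENCY_NS[operation])


gap = orders_of_magnitude("Network round trip (cross-DC)", "Main memory reference")
print(f"cross-DC vs memory: {gap:.1f} orders of magnitude")

# Assumed SLA budget of 100 ms.
same_dc_trips = max_sequential_ops(100, "Network round trip (same DC)")
cross_dc_trips = max_sequential_ops(100, "Network round trip (cross-DC)")
print(f"same-DC round trips in 100 ms:  {same_dc_trips}")
print(f"cross-DC round trips in 100 ms: {cross_dc_trips}")
```

Under these numbers, a 100 ms budget fits 200 sequential same-DC round trips but not one full cross-DC round trip, which is the kind of constraint the table is meant to surface early.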

Stale reference numbers

Latency numbers shift with hardware generations. Annotate with year (e.g., "2024 reference"). SSD random read was 150 microseconds a decade ago; current NVMe drives approach 10 microseconds for sequential reads. The table above reflects current NAND SSD random read; NVMe sequential read is significantly faster. When using numbers in a design, state the year so estimates can be updated as hardware improves.

Component Diagram

Capacity-Estimation-diagram.excalidraw

Pitfalls

The two warnings above cover the primary pitfalls (bare numbers without a reasoning chain, and stale reference numbers). Two secondary pitfalls:

Confusing average QPS with peak QPS: Sizing systems for average throughput means they fail under peak load. Always derive peak QPS explicitly and size for peak, not average. State the peak multiplier assumption (2x for steady daily traffic cycles such as a social feed, 3x for bursty traffic patterns like live events or flash sales).
Ignoring compression and encoding: Real payloads are compressed (gzip, zstd) and encoded (protobuf vs JSON). A 1 KB JSON record may compress to 300 bytes. For bandwidth estimates, apply a compression ratio appropriate to the content type; images and video have minimal compressibility, structured JSON compresses well (3–5x).
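A compression ratio can be measured on a representative payload rather than guessed. The Python sketch below compresses a synthetic JSON batch with the standard-library `gzip` module and applies the measured ratio to a bandwidth figure. The record shape and the 63 MB/s raw egress figure are assumptions for illustration, and a synthetic, repetitive payload will compress better than most real traffic.

```python
import gzip
import json

# Hypothetical payload: 200 small, similarly shaped event records.
records = [
    {"user_id": i, "action": "like", "post_id": 1000 + i, "ts": 1_700_000_000 + i}
    for i in range(200)
]

raw = json.dumps(records).encode("utf-8")
compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)

# Assumed uncompressed egress of 63 MB/s, scaled by the measured ratio.
raw_egress_bytes_per_sec = 63_000_000
compressed_egress = raw_egress_bytes_per_sec / ratio

print(f"raw: {len(raw):,} bytes, gzip: {len(compressed):,} bytes, ratio: {ratio:.1f}x")
print(f"egress after compression: {compressed_egress / 1e6:.1f} MB/s")
```

Running the same measurement on sampled production responses gives a ratio that can be stated as an explicit assumption alongside the QPS and payload-size inputs.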

Existing Pattern Connections

CQRS-Pattern — read/write ratio analysis (10:1 to 100:1) directly motivates CQRS's separate read model; the estimation step quantifies the read/write asymmetry that justifies the pattern: when write_QPS is 7K and read_QPS is 63K, a shared read/write data model creates contention that a separate read model (and read replica or cache tier) resolves
API-Protocol-Selection-MOC — HTTP request sizing (JSON vs protobuf vs GraphQL response shape) informs bandwidth estimates; protocol choice affects per-request byte cost; a protobuf-encoded payload is typically 3–5x smaller than the equivalent JSON, which directly reduces egress bandwidth and shifts the read bandwidth calculation
Operational-API-Patterns — pagination and rate limiting strategies affect per-request payload size and throughput calculations; a paginated endpoint returning 20 items per page produces very different storage and bandwidth numbers than one returning 200 items; rate limiting constrains write_QPS at the API layer, which must be accounted for when sizing write throughput
