Unique ID Generator Design

System that generates globally unique identifiers across distributed nodes without central coordination bottlenecks.

Clarify First

Before designing, lock these assumptions with the interviewer:

  1. Globally unique or within-cluster? — Cross-datacenter uniqueness requires more coordination (or probabilistic guarantees); within-service unique is easier but limits future scaling.
  2. K-sortable (time-ordered)? — Snowflake IDs are time-ordered to the millisecond; UUID v4 is random and not sortable; this drives database index efficiency and range query capability.
  3. Scale? — At 1K IDs/sec, any approach works. At 10K+/sec, DB auto-increment becomes a single-writer bottleneck. At 100K+/sec, Snowflake or UUID v7 with no coordination is required.
  4. Unpredictable? — Sequential IDs enable enumeration attacks on user-facing resources. Snowflake IDs are partially predictable (time-ordered); UUID v4 is fully random.
  5. Maximum ID length? — A 64-bit integer (Snowflake) fits in a standard bigint column. A 128-bit UUID needs a native UUID column type, a 16-byte binary column, or a 36-character string column. JavaScript and some mobile clients lose precision on integers above Number.MAX_SAFE_INTEGER (2^53 - 1), so 64-bit IDs are usually sent to them as strings.
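
To make the 64-bit concern in item 5 concrete, the minimal Python sketch below checks whether an ID survives a round trip through an IEEE 754 double, which is how a JavaScript Number stores integers. The helper names (`is_js_safe`, `to_wire`) and the sample ID are illustrative assumptions, not part of any library.

```python
# JavaScript's Number.MAX_SAFE_INTEGER: every integer up to this magnitude
# is exactly representable as an IEEE 754 double.
MAX_SAFE_INTEGER = 2**53 - 1


def is_js_safe(value: int) -> bool:
    """True if a JavaScript Number can hold this integer without precision loss."""
    return -MAX_SAFE_INTEGER <= value <= MAX_SAFE_INTEGER


def to_wire(value: int) -> str:
    """Serialize an ID as a decimal string so clients never parse it as a double."""
    return str(value)


# Hypothetical Snowflake-style ID: timestamp offset of about one year in ms,
# machine ID 5, sequence 7, using the 41/10/12 layout.
sample_id = (31_536_000_000 << 22) | (5 << 12) | 7

# Python floats are IEEE 754 doubles, so this mimics what a JS client would see.
as_double = int(float(sample_id))

print(is_js_safe(sample_id))                 # False
print(as_double == sample_id)                # False: low bits were rounded away
print(int(to_wire(sample_id)) == sample_id)  # True: the string form is lossless
```

With 22 bits below the timestamp, IDs in this layout pass 2^53 once the timestamp offset exceeds 2^31 ms, which is roughly 25 days after the epoch, so the string encoding is needed almost immediately.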

Capacity Estimation

Derivation chain, assuming a high-throughput e-commerce platform in 2026:

Peak order events: 100K orders/second (Black Friday peak; matches
  Alibaba 11.11 2023 peak range)
ID generation target: 100K IDs/second sustained, 500K/second burst

Snowflake 64-bit anatomy:
  [1 sign bit][41-bit timestamp ms][10-bit machine ID][12-bit sequence]

  41 bits timestamp (millisecond precision):
    range = 2^41 ms ≈ 69.7 years from the chosen epoch
    → with a custom epoch of 2026, IDs roll over around 2095
  10 bits machine/datacenter ID:
    2^10 = 1024 distinct worker nodes (IDs 0-1023)
  12 bits sequence number:
    2^12 = 4096 IDs per millisecond per node (sequence values 0-4095)
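
The layout above can be sketched as a pair of pack/unpack helpers. This is a minimal illustration of the bit arithmetic only; the function names and the range checks are assumptions, and `timestamp_ms` here means milliseconds since the custom epoch.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MACHINE_SHIFT = SEQUENCE_BITS                   # 12
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS  # 22

MAX_TIMESTAMP = (1 << TIMESTAMP_BITS) - 1
MAX_MACHINE = (1 << MACHINE_BITS) - 1    # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095


def pack(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Combine the three fields into one 63-bit value (the sign bit stays 0)."""
    if not 0 <= timestamp_ms <= MAX_TIMESTAMP:
        raise ValueError("timestamp_ms out of range")
    if not 0 <= machine_id <= MAX_MACHINE:
        raise ValueError("machine_id out of range")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence out of range")
    return (timestamp_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def unpack(snowflake_id: int) -> tuple[int, int, int]:
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    timestamp_ms = snowflake_id >> TIMESTAMP_SHIFT
    machine_id = (snowflake_id >> MACHINE_SHIFT) & MAX_MACHINE
    sequence = snowflake_id & MAX_SEQUENCE
    return timestamp_ms, machine_id, sequence


print(pack(1, 1, 1))                          # 4198401
print(unpack(pack(123_456_789, 513, 42)))     # (123456789, 513, 42)
```

Because the timestamp occupies the highest bits, comparing two IDs as integers compares their timestamps first, which is what makes the IDs k-sortable.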

Single node throughput:
  4096 IDs/ms × 1000 ms/sec = 4.096M IDs/second per worker node
  → single Snowflake worker exceeds our 500K/sec burst requirement
  → no coordination needed at runtime; each worker generates independently

Cross-reference: Capacity-Estimation for the shared estimation methodology.

Conclusion: One Snowflake worker node handles the full 500K/sec burst. For redundancy, deploy 2-4 workers with distinct machine IDs. Total capacity: roughly 8-16M IDs/sec, which is 16-32× the 500K/sec burst requirement.
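
The arithmetic in this section can be re-derived with a few lines of Python. The variable and function names are illustrative, and the year length of 365.25 days is an assumption used only for the rollover estimate.

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

timestamp_years = 2**41 / MS_PER_YEAR   # lifetime of the 41-bit timestamp
ids_per_ms = 2**12                      # sequence values per millisecond
ids_per_sec_per_node = ids_per_ms * 1000
BURST_IDS_PER_SEC = 500_000             # burst target from the estimate above


def fleet_capacity(workers: int) -> int:
    """Aggregate IDs/sec for a fleet of independent workers."""
    return workers * ids_per_sec_per_node


def burst_headroom(workers: int) -> float:
    """How many times the burst target the fleet can absorb."""
    return fleet_capacity(workers) / BURST_IDS_PER_SEC


print(f"{timestamp_years:.1f} years")       # 69.7 years
print(ids_per_sec_per_node)                 # 4096000
for workers in (1, 2, 4):
    print(workers, fleet_capacity(workers), f"{burst_headroom(workers):.1f}x")
```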

Central Technical Problem

Clock skew in distributed Snowflake ID generation.

NTP (Network Time Protocol) adjustments on a Snowflake worker node can cause the system clock to move backward by milliseconds. If the current millisecond timestamp is less than the last recorded timestamp, Snowflake has a problem: generating an ID with the same (or earlier) timestamp and the same sequence number would produce a duplicate.

Snowflake must halt generation until the clock catches up:

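// Snowflake clock skew guard (pseudocode)
if (currentMs < lastMs) {
  skewMs = lastMs - currentMs;
  if (skewMs > MAX_SKEW_MS) {      // threshold, e.g., 10 seconds
    alertAndHalt(skewMs);          // fatal: requires operator intervention
  }
  // Small skew: block until the clock has caught up to lastMs
  currentMs = waitUntilMs(lastMs);
}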

Small skew (<1 second): wait (spin or sleep) until currentMs >= lastMs. This is acceptable because a healthy NTP setup normally keeps corrections to a few milliseconds, so the pause is brief.

Large skew (>10 seconds): alert and halt. This indicates a deeper time synchronization failure. A clock that has jumped this far cannot be trusted, so waiting it out does not rule out duplicate IDs. Production Snowflake implementations treat this as a fatal operational condition requiring manual intervention. Skew between the two thresholds is a policy choice: keep waiting and accept the added latency, or reject requests until the clock catches up.

This is the primary operational failure mode of Snowflake. Systems using Snowflake IDs must monitor clock skew on every worker node as part of their SLO.
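
The guard described in this section can be sketched as a small, testable generator. This is a minimal sketch under stated assumptions, not a production implementation: the class and exception names, the single `max_wait_ms` threshold (wait below it, fail above it), the injectable `clock` and `sleep` parameters, and the 2026-01-01 epoch are all choices made for illustration.

```python
import threading
import time

MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE = (1 << MACHINE_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1_767_225_600_000  # assumed custom epoch: 2026-01-01T00:00:00Z


class ClockMovedBackwardsError(RuntimeError):
    """Backward skew is larger than the generator is willing to wait out."""


def system_clock_ms() -> int:
    return time.time_ns() // 1_000_000


class SnowflakeGenerator:
    def __init__(self, machine_id, max_wait_ms=1_000,
                 clock=system_clock_ms, sleep=time.sleep):
        if not 0 <= machine_id <= MAX_MACHINE:
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._max_wait_ms = max_wait_ms
        self._clock = clock    # injectable so tests can simulate skew
        self._sleep = sleep
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def _wait_until(self, target_ms):
        """Block until the clock reads at least target_ms; return that reading."""
        now = self._clock()
        while now < target_ms:
            self._sleep((target_ms - now) / 1000)
            now = self._clock()
        return now

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                skew = self._last_ms - now
                if skew > self._max_wait_ms:
                    # Large skew: refuse to generate rather than risk duplicates.
                    raise ClockMovedBackwardsError(f"clock moved back {skew} ms")
                now = self._wait_until(self._last_ms)  # small skew: wait it out
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: move to the next one.
                    now = self._wait_until(self._last_ms + 1)
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS  # assumes the clock is past the epoch
            return ((timestamp << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self._machine_id << SEQUENCE_BITS)
                    | self._sequence)
```

Holding the lock while waiting keeps the sketch simple; it also means every caller on that worker stalls during a backward step, which is the availability cost the section describes.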

Component Design

Algorithm Comparison

| Approach | Uniqueness Guarantee | Sortable | Coordination Required | Scalability | When NOT to Use |
| --- | --- | --- | --- | --- | --- |
| DB auto-increment | Within-DB unique (single writer) | Yes (monotonic) | Single DB write per ID | Bottleneck at >10K writes/sec | Distributed systems; high write throughput |
| UUID v4 | Probabilistic global unique (122 random bits; negligible collision probability) | No (random) | None | Unlimited (stateless) | When k-sortable IDs required; when 128-bit size is too large for storage or client |
| UUID v7 (RFC 9562, 2024) | Probabilistic global unique | Yes (millisecond timestamp prefix) | None | Unlimited (stateless) | When 64-bit IDs are required; UUID v7 is 128-bit |
| Snowflake (Twitter, 2010) | Guaranteed unique within datacenter/worker assignment | Yes (time-ordered to millisecond) | Worker ID assignment at startup via ZooKeeper/etcd | 4.096M IDs/sec per node | Clock skew >10 seconds (must halt); when 128-bit IDs are acceptable |
| Ticket server (Flickr-style) | Guaranteed unique (centralized counter) | Yes (monotonic) | Centralized ticket DB is always required | Bottleneck at ~10K IDs/sec; single point of failure | High availability requirements; high throughput needs |
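
To illustrate why UUID v7 is sortable while needing no coordination, the sketch below assembles one by hand following the RFC 9562 field layout: a 48-bit Unix millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The function name `uuid7_sketch` and its `now_ms` parameter are assumptions for illustration; it does not add the optional monotonic counter the RFC describes for IDs created within the same millisecond.

```python
import os
import time
import uuid


def uuid7_sketch(now_ms=None) -> uuid.UUID:
    """Build a version 7 UUID: timestamp prefix followed by random bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits; 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits
    rand_b = rand & ((1 << 62) - 1)               # 62 bits

    value = (now_ms & ((1 << 48) - 1)) << 80      # unix_ts_ms in the top 48 bits
    value |= 0x7 << 76                            # version = 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_767_225_600_000)
later = uuid7_sketch(1_767_225_600_001)
print(earlier.version)           # 7
print(earlier < later)           # True: ordering follows the timestamp prefix
print(str(earlier) < str(later)) # True: the hex string form sorts the same way
```

Two IDs created in the same millisecond are ordered only by their random bits, so ordering within a millisecond is arbitrary in this sketch.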

Datacenter Awareness

Snowflake's 10-bit machine ID is typically split:

10-bit machine ID = [5 bits: datacenter ID][5 bits: worker ID within datacenter]
                  = 2^5 datacenters × 2^5 workers = 32 × 32 = 1024 total workers
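
The 5/5 split can be expressed as two small helpers. The names `compose_machine_id` and `split_machine_id` are hypothetical; the sketch only shows the bit arithmetic implied by the layout above.

```python
DATACENTER_BITS = 5
WORKER_BITS = 5
MAX_DATACENTER = (1 << DATACENTER_BITS) - 1  # 31
MAX_WORKER = (1 << WORKER_BITS) - 1          # 31


def compose_machine_id(datacenter_id: int, worker_id: int) -> int:
    """Pack datacenter and worker IDs into the 10-bit machine ID field."""
    if not 0 <= datacenter_id <= MAX_DATACENTER:
        raise ValueError("datacenter_id out of range")
    if not 0 <= worker_id <= MAX_WORKER:
        raise ValueError("worker_id out of range")
    return (datacenter_id << WORKER_BITS) | worker_id


def split_machine_id(machine_id: int) -> tuple[int, int]:
    """Recover (datacenter_id, worker_id) from a 10-bit machine ID."""
    return machine_id >> WORKER_BITS, machine_id & MAX_WORKER


print(compose_machine_id(1, 2))    # 34
print(compose_machine_id(31, 31))  # 1023
print(split_machine_id(34))        # (1, 2)
```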

Each worker needs a unique worker ID before it starts generating IDs. Two common approaches:

  • ZooKeeper / etcd: worker registers at startup, claims a unique worker ID, renews lease; on failure, lease expires and ID can be reclaimed. Adds operational dependency.
  • Static configuration: hard-coded per deployment unit (Kubernetes pod annotation or environment variable). Simpler, but requires discipline in deployment tooling to prevent duplicate IDs when deploying new nodes.

No two workers in the same datacenter may share the same worker ID — duplicates cause silent ID collisions with no runtime error.
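
The lease-based approach can be sketched with an in-memory registry that stands in for ZooKeeper or etcd. Everything here is hypothetical (the `WorkerIdRegistry` class, its methods, and the explicit `now_ms` argument); a real coordination service would provide the atomicity and failure detection that this single-process sketch takes for granted.

```python
from typing import Dict, Tuple


class NoWorkerIdAvailable(RuntimeError):
    """Every worker ID in the datacenter is held by a live lease."""


class WorkerIdRegistry:
    def __init__(self, capacity: int = 32, lease_ms: int = 30_000):
        self._capacity = capacity
        self._lease_ms = lease_ms
        # worker_id -> (owner, lease expiry in ms)
        self._leases: Dict[int, Tuple[str, int]] = {}

    def claim(self, owner: str, now_ms: int) -> int:
        """Take the lowest worker ID that is unclaimed or whose lease expired."""
        for worker_id in range(self._capacity):
            lease = self._leases.get(worker_id)
            if lease is None or lease[1] <= now_ms:
                self._leases[worker_id] = (owner, now_ms + self._lease_ms)
                return worker_id
        raise NoWorkerIdAvailable("all worker IDs are leased")

    def renew(self, owner: str, worker_id: int, now_ms: int) -> bool:
        """Extend a lease; returns False if it was lost, so the worker must stop."""
        lease = self._leases.get(worker_id)
        if lease is None or lease[0] != owner or lease[1] <= now_ms:
            return False
        self._leases[worker_id] = (owner, now_ms + self._lease_ms)
        return True


registry = WorkerIdRegistry(capacity=2, lease_ms=100)
print(registry.claim("pod-a", now_ms=0))    # 0
print(registry.claim("pod-b", now_ms=0))    # 1
print(registry.renew("pod-b", 1, now_ms=90))  # True
print(registry.claim("pod-c", now_ms=150))  # 0: pod-a's lease expired at 100
```

A worker whose `renew` call returns False should stop generating immediately, since another worker may already hold its ID; that is the case the collision warning above is about.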

System Diagram

Unique-ID-Generator-Design-diagram.excalidraw

Alternatives Considered

| Decision | Alternative A | Alternative B | Why Chosen Approach Wins |
| --- | --- | --- | --- |
| Snowflake for high-throughput | DB auto-increment | UUID v4 | 4.096M IDs/sec per node; 64-bit; k-sortable for DB index efficiency; no runtime coordination |
| UUID v7 for no-coordination contexts | UUID v4 | Snowflake | RFC 9562 (2024); time-ordered = better DB index locality; no worker ID management overhead; choose when 128-bit size is acceptable |
| ZooKeeper for worker ID assignment | Static config | etcd | ZooKeeper provides lease expiry and automatic reclaim; static config is simpler but operationally fragile on node replacement |
| Ticket server (Flickr) | DB auto-increment | Snowflake | Ticket server trades coordination bottleneck for strict monotonicity; acceptable for low-throughput systems needing guaranteed order |

Recommended defaults:

  • High-throughput k-sortable (>10K IDs/sec): Snowflake
  • Low-coordination stateless (any throughput, 128-bit acceptable): UUID v7
  • Simple single-DB application (<10K writes/sec): DB auto-increment

Likely Follow-Up Questions

  1. What happens when NTP adjustment exceeds 10 seconds? — Production Snowflake implementations alert and halt ID generation, requiring manual clock correction or node restart. Alternatives: pre-generate a buffer of IDs ahead of time, or fall back to UUID v7 temporarily.
  2. How do you assign worker IDs without ZooKeeper? — Use static Kubernetes pod annotations or environment variables set at deploy time. Requires a registry (even a simple config map) to ensure no two workers share an ID across the entire fleet.
  3. Why not UUID v7 instead of Snowflake? — UUID v7 is 128-bit vs Snowflake's 64-bit. A bigint Snowflake ID fits in 8 bytes and indexes efficiently in B-tree structures. A UUID v7 requires 16 bytes. For databases with billions of records, the index size difference is significant.
  4. How do you handle multi-region deployments? — The 5-bit datacenter ID provides 32 distinct datacenter slots. Each region gets a datacenter ID range. Workers within the region use the worker ID bits. Cross-region uniqueness is guaranteed structurally, not via coordination.
  5. What if you need more than 4096 IDs per millisecond per node? — Deploy additional Snowflake workers (up to 1024 total). Alternatively, re-partition the bit layout: use fewer machine ID bits and more sequence bits for extreme throughput at the cost of fewer workers.
  6. How do you prevent ID enumeration for user-facing IDs? — Snowflake IDs are time-ordered and partially guessable. For user-facing IDs where enumeration is a security concern, use UUID v4 (fully random) or apply a reversible obfuscation (e.g., Hashids) to the Snowflake ID at the API boundary.
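
For question 6, one way to picture "reversible obfuscation at the API boundary" is a keyed bijection over the 63-bit ID space. The sketch below is not Hashids and is not cryptographically secure; the multiplier, XOR key, and function names are assumptions chosen for illustration. It relies on the fact that multiplication by an odd number is invertible modulo a power of two.

```python
MASK_63 = (1 << 63) - 1

# Hypothetical secret parameters. The multiplier must be odd so that it has
# a modular inverse modulo 2^63.
MULTIPLIER = 6364136223846793005
XOR_KEY = 0x1234_5678_9ABC_DEF0
INVERSE = pow(MULTIPLIER, -1, 1 << 63)  # modular inverse (Python 3.8+)


def obfuscate(snowflake_id: int) -> int:
    """Map an internal ID to a public value that hides time ordering."""
    if not 0 <= snowflake_id <= MASK_63:
        raise ValueError("expected a non-negative 63-bit ID")
    return ((snowflake_id * MULTIPLIER) & MASK_63) ^ XOR_KEY


def reveal(public_id: int) -> int:
    """Invert obfuscate() to recover the internal ID."""
    return ((public_id ^ XOR_KEY) * INVERSE) & MASK_63


internal = 4198401
public = obfuscate(internal)
print(reveal(public) == internal)  # True
```

The result still fits in a signed 64-bit column, so the database keeps the sortable internal ID while clients only ever see the scrambled form. Anyone who learns the parameters can reverse it, so resources that need real unpredictability are better served by UUID v4 as noted above.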

Existing Pattern Connections

| Design Decision | Existing Pattern | Relationship |
| --- | --- | --- |
| Worker ID coordination at startup | Bounded-Context | Each datacenter/service domain owns its worker ID range; the coordination boundary mirrors bounded context isolation, with no cross-context worker ID sharing |
| Ticket server as centralised counter | Singleton-Pattern | Ticket server is a globally unique instance; inherits Singleton's testing pitfall (hard to parallelize) and availability pitfall (single point of failure) |
| UUID v4/v7 as AP choice | CAP-Theorem | UUID generation requires no coordination = partition tolerant; trades globally ordered IDs (consistency) for availability, a deliberate AP tradeoff |
| Snowflake as CP choice | CAP-Theorem | Snowflake halts generation on clock skew rather than risk duplicate IDs: consistency over availability; the explicit CAP tradeoff for sortable guaranteed-unique IDs |