Operational API Patterns
Cross-protocol operational concerns — pagination, idempotency, rate limiting, and partial responses — appear in REST, GraphQL, and gRPC-transcoded APIs alike. These four patterns define the contract between a server and its clients for safe navigation, safe retries, quota enforcement, and bandwidth optimisation.
Intent
REST, GraphQL, and gRPC-transcoded APIs share a set of cross-cutting operational concerns that sit above the protocol layer: how to traverse large result sets without re-scanning data (pagination), how to make non-idempotent operations safe to retry (idempotency keys), how to signal quota exhaustion and guide client back-off (rate limiting), and how to reduce unnecessary payload transfer (partial responses). These four cross-protocol concerns are tightly related: idempotency keys interact with retry logic, rate limiting shapes retry cadence, and partial responses reduce bandwidth so clients stay within rate limits.
This note covers all four patterns in per-pattern H2 sections. Each section opens with "When NOT to Use" — the vault's suitability-first convention — before explaining how the pattern works and providing TypeScript (Express 5.x) and Java (Spring Boot 3.x) skeletal examples.
When NOT to Use (note-level)
Operational contracts add overhead without benefit in these contexts:
- Simple internal APIs with known clients and small datasets — pagination over 50-100 records adds cursor logic, store lookups, and pageInfo encoding for no user-visible benefit. Full-list response is simpler and acceptable.
- Read-only endpoints where idempotency is inherent — GET, HEAD, and OPTIONS are idempotent by HTTP spec (RFC 9110). Adding an idempotency key header on GET requests is meaningless overhead.
- Low-traffic internal services where rate limiting adds complexity without protecting shared resources: rate limiting is a quota protection mechanism. If the service has a single trusted consumer and no shared quota to protect, the Retry-After header machinery adds latency with no protective benefit.
Pagination (OPER-01)
Pagination controls how a server breaks large result sets into sequential pages and how clients navigate those pages. Two primary models exist: offset pagination (simple, page-jump capable, performance-limited) and cursor pagination (stable, O(log n), sequential-only).
When NOT to Use
- Small result sets (< 100 records) where a full-list response is acceptable: pagination adds cursor encoding, pageInfo construction, and limit-plus-one query tricks for no user-visible gain when the entire dataset fits in a single reasonably-sized payload.
- Genuinely random-access UIs where offset page jumping is required and the dataset is small: if a UI needs "go to page 7 of 12" and the total record count stays under 10,000, offset pagination's page-jump capability is worth its tradeoffs. Above 10,000 records, switch to cursor pagination regardless.
Offset Pagination
Mechanism: Client sends ?page=3&limit=20 or ?offset=40&limit=20. Server translates to SQL: LIMIT 20 OFFSET 40.
Pros: Simple to implement. Supports page jumping — clients can request any page by number. Easy to calculate total page count when combined with a COUNT(*) query.
Cons:
- Data stability: Breaks on live data. If a new row is inserted before page 2 is fetched, all rows shift by one position and the client either sees a duplicate or misses a row.
- Performance cliff: OFFSET 100000 instructs the database to scan and discard 100,000 rows before returning the next 20. Cost grows linearly, O(n), with the offset value. Default to cursor pagination for any endpoint expected to serve > 10,000 records. The offset pagination performance cliff is a common cause of list-endpoint timeouts in production.
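The data-stability problem can be reproduced with a small in-memory sketch. The `Row` type, the `offsetPage` helper, and the sample data are hypothetical stand-ins for a table listed newest-first; the array slice plays the role of `LIMIT`/`OFFSET`.

```typescript
// Hypothetical row shape; the id doubles as the sort key (newest first).
interface Row { id: number; }

// Emulates `ORDER BY id DESC LIMIT limit OFFSET (page - 1) * limit`.
function offsetPage(rows: Row[], page: number, limit: number): Row[] {
  const sorted = [...rows].sort((a, b) => b.id - a.id);
  const offset = (page - 1) * limit;
  return sorted.slice(offset, offset + limit);
}

const table: Row[] = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }];

const page1 = offsetPage(table, 1, 2).map(r => r.id); // [4, 3]
table.push({ id: 5 });                                 // a new row arrives between fetches
const page2 = offsetPage(table, 2, 2).map(r => r.id); // [3, 2]: id 3 is served twice

console.log(page1, page2);
```

The same shift in the other direction (a delete between fetches) makes the client skip a row instead of repeating one.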
Cursor Pagination
Mechanism: Client sends ?cursor=<opaque>&limit=20. Server decodes the cursor to recover the sort position and issues a keyset query:
WHERE (created_at, id) > ($last_created_at, $last_id)
ORDER BY created_at, id
LIMIT $limit
Pros:
- Stable under mutations — cursor points to a specific sort position, not a row count. Inserts and deletes before the cursor position do not affect subsequent pages.
- O(log n) index seek: the database uses a composite index on (created_at, id) to seek directly to the cursor position. No rows are scanned and discarded.
Cons:
- No page jumping — a cursor encodes a position; there is no way to jump to page 7 without traversing pages 1-6.
- Sequential traversal only: cursors are directional. Forward pagination (after) is universal; backward pagination (before) requires the sort order to be reversible.
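The row-value comparison in the keyset query is lexicographic: compare `created_at` first and fall back to `id` only on a tie. A minimal in-memory sketch of that comparison follows; `OrderRow`, `isAfter`, and `keysetPage` are illustrative names, and the array filter stands in for the index seek.

```typescript
// Hypothetical row shape; ISO-8601 UTC strings in one fixed format sort chronologically.
interface OrderRow { created_at: string; id: number; }

// True when `row` sorts strictly after `cursor` under ORDER BY created_at, id.
// Mirrors the SQL row-value comparison (created_at, id) > (c.created_at, c.id).
function isAfter(row: OrderRow, cursor: OrderRow): boolean {
  if (row.created_at !== cursor.created_at) return row.created_at > cursor.created_at;
  return row.id > cursor.id;
}

// Emulates the keyset query: keep rows past the cursor, sort, take `limit` rows.
function keysetPage(rows: OrderRow[], cursor: OrderRow | null, limit: number): OrderRow[] {
  return rows
    .filter(r => cursor === null || isAfter(r, cursor))
    .sort((a, b) => (isAfter(a, b) ? 1 : isAfter(b, a) ? -1 : 0))
    .slice(0, limit);
}

const orders: OrderRow[] = [
  { created_at: "2024-01-01T12:00:00Z", id: 1 },
  { created_at: "2024-01-01T12:00:00Z", id: 2 }, // same timestamp: id breaks the tie
  { created_at: "2024-01-02T09:00:00Z", id: 3 },
];

const first = keysetPage(orders, null, 2);
// A row inserted before the cursor position does not shift the next page.
orders.push({ created_at: "2023-12-31T23:00:00Z", id: 4 });
const second = keysetPage(orders, first[first.length - 1] ?? null, 2);

console.log(first.map(r => r.id), second.map(r => r.id));
```

Without the `id` tie-breaker, rows sharing a timestamp with the cursor would be either skipped or repeated, which is why the sort key must end in a unique column.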
Opaque cursor encoding: Clients must treat cursors as opaque strings — they must not parse, increment, or construct cursor values manually. Encode the sort key and unique ID as Base64 of JSON so clients cannot see a bare integer or timestamp and be tempted to construct values:
Buffer.from(JSON.stringify({created_at: "2024-01-01T12:00:00Z", id: 123})).toString('base64')
→ eyJjcmVhdGVkX2F0IjoiMjAyNC0wMS0wMVQxMjowMDowMFoiLCJpZCI6MTIzfQ==
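Opaque to the client does not mean trusted by the server: a cursor comes back as untrusted input. The sketch below pairs the encoding above with a defensive decoder. `decodeCursorSafe` and its validation rules are assumptions for illustration, not part of any specification.

```typescript
interface CursorPayload { created_at: string; id: number; }

function encodeCursor(payload: CursorPayload): string {
  return Buffer.from(JSON.stringify(payload)).toString("base64");
}

// Returns null for anything that does not decode to the expected shape.
function decodeCursorSafe(cursor: string): CursorPayload | null {
  try {
    const parsed: unknown = JSON.parse(Buffer.from(cursor, "base64").toString("utf-8"));
    if (typeof parsed !== "object" || parsed === null) return null;
    const { created_at, id } = parsed as Record<string, unknown>;
    if (typeof created_at !== "string" || Number.isNaN(Date.parse(created_at))) return null;
    if (typeof id !== "number" || !Number.isInteger(id)) return null;
    return { created_at, id };
  } catch {
    return null; // decoded bytes were not valid JSON
  }
}

const token = encodeCursor({ created_at: "2024-01-01T12:00:00Z", id: 123 });
console.log(token, decodeCursorSafe(token), decodeCursorSafe("not-a-cursor"));
```

A handler could map a `null` result to a 400 response rather than letting a malformed cursor surface as a server error.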
Keyset Pagination
Keyset pagination is a precise variant of cursor pagination that uses the actual column values (sort key + unique ID) directly as the cursor, without a wrapper encoding. The keyset query is identical. From the client's perspective, both cursor and keyset pagination expose an opaque cursor string — the distinction is implementation-level, not API-contract-level. The vault uses "cursor pagination" as the general term covering both.
Relay Connection Spec
The GraphQL Relay Connection Specification (relay.dev/graphql/connections.htm) formalises cursor pagination as a typed GraphQL schema. It defines:
- Connection type: has edges (list of Edge) and pageInfo (non-null PageInfo) fields
- Edge type: has node (the result item) and cursor (opaque string) fields
- PageInfo type: has hasPreviousPage, hasNextPage, startCursor, endCursor
- Pagination arguments: first/after for forward pagination, last/before for backward pagination
The Relay spec is GraphQL-specific, but the cursor-as-opaque-string principle applies equally to REST pagination. REST endpoints should follow the same contract: return an opaque nextCursor string, a hasNextPage boolean, and accept ?cursor=<opaque> as the pagination argument.
See GraphQL-API-Design for Relay Connection implementation in GraphQL resolvers (Phase 24 output).
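As a rough illustration, the Connection shape can be written as TypeScript types and filled from a limit-plus-one fetch. The field names come from the Relay spec; the `toConnection` helper, its `cursorOf` callback, the nullable cursors for an empty page, and the `hasPreviousPage` shortcut are assumptions of this sketch.

```typescript
interface PageInfo {
  hasPreviousPage: boolean;
  hasNextPage: boolean;
  startCursor: string | null; // null for an empty page (this sketch's choice)
  endCursor: string | null;
}
interface Edge<T> { node: T; cursor: string; }
interface Connection<T> { edges: Edge<T>[]; pageInfo: PageInfo; }

// Builds a forward-pagination (first/after) page from `first + 1` fetched rows.
function toConnection<T>(
  rows: T[],
  first: number,
  after: string | null,
  cursorOf: (row: T) => string,
): Connection<T> {
  const edges = rows.slice(0, first).map(node => ({ node, cursor: cursorOf(node) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: rows.length > first, // the extra row proves another page exists
      // Simplification: treat any `after` cursor as evidence of earlier rows.
      hasPreviousPage: after !== null,
      startCursor: edges[0]?.cursor ?? null,
      endCursor: edges[edges.length - 1]?.cursor ?? null,
    },
  };
}

const page = toConnection([{ id: 1 }, { id: 2 }, { id: 3 }], 2, null, r => `c${r.id}`);
console.log(page.pageInfo);
```

A REST handler can return the same information flattened: `endCursor` becomes `nextCursor` and `hasNextPage` is passed through unchanged.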
Sequence Diagram
Cursor pagination flow with opaque next-page token propagation -- keyset query avoids OFFSET performance cliff.
sequenceDiagram
participant C as Client
participant API as API Server
participant DB as Database
Note over C,DB: Page 1 -- no cursor (initial request)
C->>API: GET /orders?limit=20
API->>DB: SELECT * FROM orders<br/>ORDER BY created_at, id<br/>LIMIT 21
Note right of DB: limit + 1 trick:<br/>fetch 21 to detect hasNextPage
DB-->>API: 21 rows returned
API-->>C: 200 OK<br/>{ items: [...20],<br/> nextCursor: "eyJjcmVhdGVkX2F0Ijo...",<br/> hasNextPage: true }
Note over C: Cursor is opaque --<br/>client must not parse or construct
Note over C,DB: Page 2 -- cursor from previous response
C->>API: GET /orders?cursor=eyJjcmVhdGVkX2F0Ijo...&limit=20
API->>API: Decode cursor to<br/>{created_at, id}
API->>DB: SELECT * FROM orders<br/>WHERE (created_at, id) > ($1, $2)<br/>ORDER BY created_at, id<br/>LIMIT 21
Note right of DB: Keyset seek: O(log n)<br/>via composite index --<br/>no rows scanned and discarded
DB-->>API: 15 rows returned
API-->>C: 200 OK<br/>{ items: [...15],<br/> nextCursor: null,<br/> hasNextPage: false }
Note over C: hasNextPage: false --<br/>pagination complete
TypeScript Example — Cursor Pagination (Express)
// Cursor pagination: opaque Base64 cursor encoding
// Concept from relay.dev/graphql/connections.htm; applies equally to REST
// Assumes an Express `app` and a `db` client whose query() resolves to an array of rows
import type { Request, Response } from 'express';
interface CursorPayload { created_at: string; id: number; }
function encodeCursor(payload: CursorPayload): string {
  return Buffer.from(JSON.stringify(payload)).toString('base64');
}
function decodeCursor(cursor: string): CursorPayload {
  return JSON.parse(Buffer.from(cursor, 'base64').toString('utf-8'));
}
// GET /orders?cursor=<opaque>&limit=20
app.get('/orders', async (req: Request, res: Response) => {
  // Whole number clamped to 1..100; falls back to 20 when absent or non-numeric
  const requested = Math.trunc(Number(req.query.limit)) || 20;
  const limit = Math.min(Math.max(requested, 1), 100);
  const cursor = req.query.cursor as string | undefined;
  const decoded = cursor ? decodeCursor(cursor) : null;
  const rows = await db.query(
    `SELECT id, created_at, ... FROM orders
     WHERE ($1::timestamptz IS NULL OR (created_at, id) > ($1, $2))
     ORDER BY created_at, id LIMIT $3`,
    [decoded?.created_at ?? null, decoded?.id ?? null, limit + 1]
  );
  const hasNextPage = rows.length > limit;
  const items = rows.slice(0, limit);
  const nextCursor = hasNextPage
    ? encodeCursor({ created_at: items.at(-1)!.created_at, id: items.at(-1)!.id })
    : null;
  res.json({ items, nextCursor, hasNextPage });
});
The limit + 1 trick: query for one extra row to determine hasNextPage without a separate COUNT(*). Discard the extra row before returning items.
Java Example — Cursor Pagination (Spring Boot)
// Cursor pagination: Java (Spring Boot)
// Opaque cursor using Base64 encoding of sort key + ID
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;
record CursorPayload(Instant createdAt, Long id) {}
String encodeCursor(CursorPayload p) {
  String json = String.format("{\"created_at\":\"%s\",\"id\":%d}", p.createdAt(), p.id());
  // URL-safe alphabet keeps '+' and '/' out of the query string
  return Base64.getUrlEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
}
// Repository query using keyset condition
// WHERE (created_at, id) > (:createdAt, :id) ORDER BY created_at, id LIMIT :limit
// Spring Data JPA equivalent: use @Query with named parameters
Idempotency Keys (OPER-02)
An idempotency key is a client-generated UUID sent in the Idempotency-Key request header on non-idempotent HTTP requests (POST, PATCH). The server stores the key and its associated response. On retry with the same key, the server returns the stored response without re-executing the operation.
This pattern makes unsafe retries safe — network failures, client timeouts, and mobile-app restarts can all cause a POST to be sent multiple times. Without idempotency keys, a payment is charged twice, an order is created twice, or an email is sent twice.
When NOT to Use
- GET, PUT, DELETE requests: already idempotent by HTTP spec (RFC 9110). GET retrieves without side effects. PUT replaces with the same value. DELETE removes once; subsequent calls return 404 but have no additional effect. Adding Idempotency-Key to these methods is meaningless.
- Internal services where a message broker provides exactly-once delivery guarantees: if the calling system is a Kafka producer with EOS (Exactly-Once Semantics) enabled, the broker layer prevents duplicate delivery. Application-level idempotency keys add a second deduplication layer that may be redundant (though defence-in-depth is not harmful).
- Endpoints where duplicate execution is harmless — for example, a counter-increment endpoint where over-counting is acceptable, or a read-heavy endpoint with no write side effects.
How It Works
- Client generates key: crypto.randomUUID() in Node.js; UUID.randomUUID() in Java. UUID v4 satisfies the IETF draft requirement for a structured string with sufficient entropy.
- Client sends key: POST /payments with header Idempotency-Key: <uuid>.
- Server checks store: Look up the key in a persistent store (Redis with 24h TTL, or a DB table with unique constraint on (idempotency_key, user_id)).
- If key found, response replay:
  - Request still processing → return 409 Conflict with message "Request in progress"
  - Request completed → return stored response verbatim (same status code, same body)
  - Request failed → return original error response
- If key not found, process and store: Execute the operation, store {key → {status, body}}, return response.
Header name: Idempotency-Key — the standardised name per IETF draft-ietf-httpapi-idempotency-key-header-07 (October 2025). Stripe popularised the pattern with this exact header name. X-Idempotency-Key is non-standard — do not use it.
TTL: 24 hours is the Stripe convention. It balances storage cost against the window during which a client might retry (mobile clients may be offline for hours).
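The lookup, in-progress, and replay branches described above can be sketched as a small state machine. `IdempotencyStore`, `execute`, and the `charge` stub are hypothetical names; the Map is single-process only, so a multi-instance deployment would need an atomic claim (for example Redis SET NX) in place of the `set` call, and TTL expiry is left out.

```typescript
type StoredResponse = { status: number; body: unknown };
type Entry = { state: "processing" } | { state: "done"; response: StoredResponse };

class IdempotencyStore {
  private readonly entries = new Map<string, Entry>();

  // Runs `operation` at most once per key and replays the stored outcome on retries.
  async execute(key: string, operation: () => Promise<StoredResponse>): Promise<StoredResponse> {
    const existing = this.entries.get(key);
    if (existing) {
      if (existing.state === "processing") {
        return { status: 409, body: { message: "Request in progress" } };
      }
      return existing.response; // replay verbatim, success or failure
    }
    this.entries.set(key, { state: "processing" }); // claim the key before doing work
    let response: StoredResponse;
    try {
      response = await operation();
    } catch {
      // Assumption: a thrown error is stored as a 500 so retries see the same failure.
      response = { status: 500, body: { message: "Operation failed" } };
    }
    this.entries.set(key, { state: "done", response });
    return response;
  }
}

// Hypothetical side-effecting operation; the counter shows how often it really ran.
const store = new IdempotencyStore();
let charges = 0;
const charge = async (): Promise<StoredResponse> => {
  charges += 1;
  return { status: 201, body: { paymentId: `pay_${charges}` } };
};
```

Calling `store.execute("k1", charge)` twice in sequence runs `charge` once and returns the same 201 body both times; two overlapping calls with the same key yield one 201 and one 409.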
HTTP idempotency keys and the Idempotent Consumer pattern are distinct patterns operating at different layers. The surface similarity (both use the word "idempotency" and both involve a "have I seen this before" store) causes frequent confusion.
| Dimension | Idempotency Key (HTTP) | Idempotent Consumer (EIP) |
|---|---|---|
| Layer | HTTP request/response | Message queue consumer |
| Who generates the key | Client (UUID in request header) | Message broker (messageId on envelope) |
| Where key is stored | Server-side key-value store | Consumer-side persistent store (DB, Redis) |
| Protocol | HTTP POST/PATCH | AMQP, Kafka, SQS messages |
| Pattern source | Stripe API design, IETF draft | Hohpe & Woolf EIP (2003) |
| Vault note | This note | Idempotent-Consumer |
HTTP idempotency keys are client-driven: the HTTP client generates the UUID, sends it in a header, and the HTTP server stores the response. Idempotent Consumer is broker-driven: the message broker assigns a messageId to each envelope, and the message consumer checks its own persistent store before processing.
TypeScript Example — Idempotency Key (Express)
// Idempotency Key pattern: Stripe-style UUID header
// Source: stripe.com/blog/idempotency + draft-ietf-httpapi-idempotency-key-header-07
// Client generates key: const key = crypto.randomUUID();
// Client sends: POST /payments with header Idempotency-Key: <uuid>
const idempotencyStore = new Map<string, { status: number; body: unknown }>();
// PRODUCTION NOTE: Replace Map with Redis (SET NX, TTL 24h) or DB unique constraint
// on (idempotency_key, user_id). In-memory Map is lost on restart and provides false safety.
// This skeleton omits the "request in progress" (409) state, so two concurrent requests
// with the same key can both reach paymentService.charge().
app.post('/payments', async (req: Request, res: Response) => {
  const key = req.headers['idempotency-key'] as string | undefined;
  if (key) {
    const cached = idempotencyStore.get(key);
    if (cached) {
      // Response replay: return stored response verbatim
      return res.status(cached.status).json(cached.body);
    }
  }
  const result = await paymentService.charge(req.body);
  const responseBody = { paymentId: result.id, status: 'succeeded' };
  if (key) {
    idempotencyStore.set(key, { status: 201, body: responseBody });
    // PRODUCTION: set TTL of 24 hours on the Redis key
  }
  res.status(201).json(responseBody);
});
Java Example — Idempotency Key (Spring Boot)
// Idempotency Key pattern: Spring Boot
// Source: draft-ietf-httpapi-idempotency-key-header-07
// PRODUCTION NOTE: Replace ConcurrentHashMap with Redis (SET NX, TTL 24h) or
// DB unique constraint on (idempotency_key, user_id). In-memory store is lost on restart.
private final Map<String, ResponseEntity<PaymentResponse>> idempotencyStore = new ConcurrentHashMap<>();
@PostMapping("/payments")
ResponseEntity<PaymentResponse> createPayment(
    @RequestHeader(value = "Idempotency-Key", required = false) String idempotencyKey,
    @RequestBody PaymentRequest request) {
  if (idempotencyKey != null) {
    ResponseEntity<PaymentResponse> cached = idempotencyStore.get(idempotencyKey);
    if (cached != null) {
      // Response replay: return stored response verbatim
      return cached;
    }
  }
  PaymentResponse result = paymentService.charge(request);
  ResponseEntity<PaymentResponse> response = ResponseEntity.status(201).body(result);
  if (idempotencyKey != null) {
    idempotencyStore.put(idempotencyKey, response);
    // PRODUCTION: set TTL of 24 hours on the Redis key
  }
  return response;
}
Rate Limiting Response Contract (OPER-03)
The rate limiting response contract defines the server's obligation when a client exceeds its quota: return 429 Too Many Requests with headers that tell the client when its quota resets and how to retry safely. A 429 without Retry-After forces clients to guess retry intervals, which causes thundering-herd retries when thousands of clients retry simultaneously after the same guess.
When NOT to Use
- Internal services behind a service mesh where rate limiting is handled at the infrastructure layer — Envoy, Istio, and AWS API Gateway all support rate limiting at the mesh/gateway layer. Duplicating rate limiting in application code adds latency with no additional protection when the infrastructure layer already enforces quotas.
- Single-consumer APIs where the consumer is trusted and rate limiting adds latency without protecting shared resources: rate limiting is a shared-resource protection mechanism. If there is one trusted consumer and no shared quota to protect, the Retry-After header machinery is operational overhead without benefit.
Response Headers
Mandatory status code: 429 Too Many Requests (RFC 6585, 2012). Do not use 503 Service Unavailable for rate limiting: 503 signals that the server is temporarily unable to handle requests (for example, overload or maintenance), not that a particular client has exhausted its quota.
| Header | Standard | Semantics | Example Value |
|---|---|---|---|
| Retry-After | RFC 9110 | Seconds to wait before retrying (delta-seconds preferred over HTTP-date to avoid clock skew) | Retry-After: 60 |
| X-RateLimit-Limit | De-facto | Total requests allowed in the current window | X-RateLimit-Limit: 1000 |
| X-RateLimit-Remaining | De-facto | Requests remaining in the current window | X-RateLimit-Remaining: 0 |
| X-RateLimit-Reset | De-facto | Unix timestamp when the window resets | X-RateLimit-Reset: 1714000000 |
Retry-After is mandatory on every 429 response — it is the minimum information a client needs to retry correctly.
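On the client side, RFC 9110 allows Retry-After to carry either delay-seconds or an HTTP-date, so a robust client handles both forms. A minimal sketch follows; `parseRetryAfter` and its injectable `nowMs` parameter are illustrative names.

```typescript
// Returns the wait in whole seconds, or null when the header is absent or unparseable.
// `nowMs` is injectable so the HTTP-date branch can be exercised deterministically.
function parseRetryAfter(value: string | undefined, nowMs: number = Date.now()): number | null {
  if (value === undefined) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed); // delay-seconds form
  const dateMs = Date.parse(trimmed);                // HTTP-date form
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000)); // a past date means "retry now"
}

console.log(parseRetryAfter("60"));
```

The HTTP-date branch depends on the client clock agreeing with the server clock, which is the skew risk that makes delay-seconds the preferred form.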
On standardisation: draft-ietf-httpapi-ratelimit-headers-10 (Standards Track, September 2025) proposes two consolidated headers — RateLimit-Policy and RateLimit — to replace the three X-RateLimit-* headers. As of March 2026, this draft has not been published as an RFC and tooling support is limited. Use X-RateLimit-* for now; note that the IETF draft proposes RateLimit-Policy / RateLimit as the eventual standard. Verify current draft status at https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/.
Client obligation: Treat Retry-After as a floor, not a fixed interval. Multiple clients receiving the same Retry-After: 60 and retrying at exactly 60 seconds create a thundering-herd burst that immediately triggers another 429. Use exponential backoff with jitter: wait Retry-After + random(0, base_delay * 2^attempt) seconds.
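The back-off formula above translates directly into a small helper. This is a sketch under stated assumptions: `retryDelaySeconds` is a hypothetical name, `attempt` counts from 0, and the random source is injectable so the result can be checked deterministically.

```typescript
// Computes Retry-After + random(0, base_delay * 2^attempt), in seconds.
function retryDelaySeconds(
  retryAfterSeconds: number,
  attempt: number, // 0 for the first retry
  baseDelaySeconds: number = 1,
  random: () => number = Math.random, // returns a value in [0, 1)
): number {
  const jitterCeiling = baseDelaySeconds * 2 ** attempt;
  return retryAfterSeconds + random() * jitterCeiling;
}

// With Retry-After: 60 on the fourth retry, the wait falls between 60 and 68 seconds.
console.log(retryDelaySeconds(60, 3));
```

Because the jitter is added on top of Retry-After, the server's value remains a floor and clients spread out after it instead of retrying in the same instant.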
Cross-link: For server-side failure protection that complements rate limiting, see Circuit-Breaker-Pattern.
Partial Responses
Partial responses allow clients to request only a subset of response fields, reducing payload size and server serialisation cost. The mechanism is a ?fields=id,name,email query parameter on any resource endpoint.
No RFC standard defines the parameter name. fields is the de-facto convention used by Google APIs, LinkedIn, and GitHub. JSON:API uses fields[type]= as a named variant (sparse fieldsets) — the principle is the same. Do not add a full JSON:API explanation to REST endpoints unless you are implementing the full JSON:API specification.
Relationship to rate limiting: Partial responses reduce bandwidth and server CPU. Clients that request only the fields they render consume less quota per request and are more likely to stay within their rate limit window. This is the grouping rationale for partial responses as a subsection of the rate limiting section.
When to use: Large response objects where clients consistently need only a few fields — for example, a list endpoint returning 50 items where each item has 30 fields but the mobile client only renders 3. Not worth the implementation overhead for small payloads (< 5 fields total).
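One way to keep a `?fields=` parameter from exposing attributes the API never intended to return is to intersect the request with an allow-list. The sketch below is one possible approach; `parseFields`, `pick`, `USER_FIELDS`, and the sample user are hypothetical.

```typescript
// Parses "id,name" into the subset of `allowed` that was requested.
// Returns null when the parameter is absent, meaning "send the full object".
function parseFields<K extends string>(raw: string | undefined, allowed: readonly K[]): K[] | null {
  if (raw === undefined) return null;
  const requested = new Set(raw.split(",").map(f => f.trim()));
  return allowed.filter(field => requested.has(field));
}

// Copies only the selected keys into a new object.
function pick<T extends object, K extends keyof T>(obj: T, keys: readonly K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const key of keys) out[key] = obj[key];
  return out;
}

const USER_FIELDS = ["id", "name", "email"] as const;
const user = { id: 42, name: "Alice", email: "alice@example.com", passwordHash: "redacted-hash" };

// "passwordHash" is requested but is not on the allow-list, so it is dropped.
const fields = parseFields("id, name,passwordHash", USER_FIELDS);
const body = fields ? pick(user, fields) : user;
console.log(body);
```

Unknown field names are silently ignored here; rejecting them with a 400 is an equally reasonable contract, as long as it is documented.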
TypeScript Example — Rate Limiting + Partial Response (Express)
// Rate limiting response contract: 429 + mandatory headers
// Source: RFC 6585 (429 status), RFC 9110 (Retry-After)
app.use('/api', rateLimitMiddleware);
// resetTimestamp is the window reset time in epoch milliseconds
function sendRateLimitExceeded(res: Response, resetTimestamp: number): void {
  // Never emit a negative Retry-After if the window has already reset
  const secondsUntilReset = Math.max(0, Math.ceil((resetTimestamp - Date.now()) / 1000));
  res
    .status(429)
    .set('Retry-After', String(secondsUntilReset)) // RFC 9110, mandatory
    .set('X-RateLimit-Limit', '1000') // quota per window
    .set('X-RateLimit-Remaining', '0') // none left
    .set('X-RateLimit-Reset', String(Math.floor(resetTimestamp / 1000))) // Unix epoch seconds
    .json({
      type: 'https://api.example.com/errors/rate-limit-exceeded',
      title: 'Rate Limit Exceeded',
      status: 429,
      detail: `Retry after ${secondsUntilReset} seconds`,
    });
}
// Partial response: ?fields= query parameter convention
// Convention: Google APIs, LinkedIn, GitHub (no RFC standard)
function pickFields<T extends object>(obj: T, fields: string[]): Partial<T> {
  return fields.reduce((acc, key) => {
    // Own properties only, so inherited names such as "toString" are ignored
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      acc[key as keyof T] = obj[key as keyof T];
    }
    return acc;
  }, {} as Partial<T>);
}
app.get('/users/:id', async (req: Request, res: Response) => {
  const user = await userService.findById(req.params.id);
  const fields = typeof req.query.fields === 'string' ? req.query.fields.split(',') : null;
  const response = fields ? pickFields(user, fields) : user;
  res.json(response);
});
// GET /users/42?fields=id,name,email → { id: 42, name: "Alice", email: "alice@example.com" }
Java Example — Rate Limit Response (Spring Boot)
// Rate limit response: Java (Spring Boot)
// Source: RFC 6585 (429), RFC 9110 (Retry-After), Spring Framework 6 ProblemDetail
@ResponseStatus(HttpStatus.TOO_MANY_REQUESTS)
@ExceptionHandler(RateLimitExceededException.class)
ResponseEntity<ProblemDetail> handleRateLimit(RateLimitExceededException ex) {
  ProblemDetail problem = ProblemDetail
      .forStatusAndDetail(HttpStatus.TOO_MANY_REQUESTS, "Rate limit exceeded");
  problem.setType(URI.create("https://api.example.com/errors/rate-limit-exceeded"));
  return ResponseEntity.status(429)
      .header("Retry-After", String.valueOf(ex.getSecondsUntilReset()))
      .header("X-RateLimit-Limit", "1000")
      .header("X-RateLimit-Remaining", "0")
      .header("X-RateLimit-Reset", String.valueOf(ex.getResetTimestampEpoch()))
      .body(problem);
}
Lineage
Lineage Backward: Idempotent-Consumer — messaging-layer idempotency (consumer-side deduplication store keyed on broker messageId) is the conceptual predecessor to HTTP idempotency keys. Both solve the "process this exactly once" problem; they differ in layer (HTTP vs messaging), key source (client vs broker), and storage location (server vs consumer).
Lineage Forward: Compensating-Transactions — saga compensation steps require idempotent retries. When a saga orchestrator retries a compensation step after a failure, the compensating action must not execute twice. Idempotency keys are the HTTP-layer mechanism that makes compensation retries safe.
This note is the middle node of Lineage Chain 16 (formalised in Phase 27 API-Protocol-Selection-MOC):
[[Idempotent-Consumer]] → [[Operational-API-Patterns]] (idempotency keys) → [[Compensating-Transactions]]
Related Concepts
| Concept | Relationship |
|---|---|
| REST-API-Design | REST is the primary protocol where these operational patterns apply; RMM, RFC 9457 error contracts, and versioning strategies in that note complement this one |
| Idempotent-Consumer | Messaging-layer counterpart to HTTP idempotency keys — distinct pattern, different layer, different key source |
| Circuit-Breaker-Pattern | Server-side failure protection that complements rate limiting; do not re-explain circuit breaker states here |
| Compensating-Transactions | Saga compensation requires idempotent retry capability — the forward-lineage target of this note |
| GraphQL-API-Design | Relay Connection Spec formalises cursor pagination for GraphQL; this note establishes the opaque cursor principle |
Related System Design
- Rate-Limiter-Design — Rate-Limiter-Design owns the system design algorithms (token bucket, sliding window, distributed state); Operational-API-Patterns owns the HTTP contract layer (429 status, Retry-After, X-RateLimit headers) — explicit scope boundary
- Load-Balancer — L7 load balancers can enforce simple rate limits per source IP at the infrastructure layer; application-aware rate limiting (per-user, per-endpoint) belongs to Operational-API-Patterns, not LB infrastructure
Related Security Patterns
- Input-Validation — Operational-API-Patterns covers rate limiting and idempotency at the HTTP contract layer; Input-Validation covers the trust boundary validation (schema validation, parameterized queries, output encoding) that must happen before any business logic including rate limit checks
- API-Key-Authentication — API key scoping and rate limit tiers are operationally linked; API-Key-Authentication covers the key lifecycle (dual-active rotation, revocation) while this note covers the rate limit headers and 429 response contract
Sources
Primary (HIGH confidence)
- IETF RFC 6585 — Additional HTTP Status Codes (429 defined): https://datatracker.ietf.org/doc/html/rfc6585
- IETF RFC 9110 — HTTP Semantics (Retry-After): https://datatracker.ietf.org/doc/html/rfc9110
- Relay GraphQL Cursor Connections Specification: https://relay.dev/graphql/connections.htm
- IETF draft-ietf-httpapi-idempotency-key-header-07 (October 2025): https://datatracker.ietf.org/doc/draft-ietf-httpapi-idempotency-key-header/
- IETF draft-ietf-httpapi-ratelimit-headers-10 (September 2025): https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/
- Stripe Idempotency Blog — canonical Stripe-style UUID header design: https://stripe.com/blog/idempotency
- Stripe API Reference — Idempotent Requests: https://docs.stripe.com/api/idempotent_requests
Secondary (MEDIUM confidence)
- Speakeasy — Pagination Best Practices in REST API Design: https://www.speakeasy.com/api-design/pagination — verified against Relay spec
- Slack Engineering — Evolving API Pagination at Slack: https://slack.engineering/evolving-api-pagination-at-slack/ — real-world Base64 cursor encoding example
- httptoolkit.com — Working with the new Idempotency Keys RFC: https://httptoolkit.com/blog/idempotency-keys/
- brandur.org — Implementing Stripe-like Idempotency Keys in Postgres: https://brandur.org/idempotency-keys
Tertiary (LOW confidence)
- MDN Web Docs — 429 Too Many Requests: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/429 — informational, not normative