Request Aggregation
A BFF (Backend for Frontend) pattern where one inbound client request fans out to multiple downstream service calls in parallel, and the results are merged into a single, client-optimised response.
Core Idea
Request aggregation is the defining capability of a BFF: the client sends one request, the BFF calls N downstream services concurrently, and the BFF returns one combined response. Without aggregation, clients must make N separate requests, each incurring round-trip latency and exposing internal service boundaries.
In the reactive stack, Mono.zip() from Project-Reactor is the primary aggregation primitive. It subscribes to all upstream Monos simultaneously. Total latency equals the slowest upstream call, not the sum of all calls.
Key Principles
- Parallel, not sequential: all downstream calls must be initiated at the same time; sequential calls accumulate latency.
- Partial failure is acceptable: if Service B fails but Service A and C succeed, return a partial response rather than a 500. Use onErrorResume on each branch individually before zipping.
- Timeout every call independently: each downstream call must have its own .timeout(Duration) so a slow service cannot block the entire aggregation.
- Collect before zip: when a downstream service returns a list (Flux), use .collectList() to convert to Mono<List<T>> before combining with Mono.zip.
How It Works
Fixed N calls: Mono.zip

    Mono<UserProfile> userMono = userService.getProfile(userId);
    Mono<List<Order>> ordersMono = orderService.getOrders(userId);
    Mono<Notifications> notifMono = notifService.getSummary(userId);

    // Subscribes to all three Monos at once; emits once every branch has produced a value
    Mono<DashboardResponse> result = Mono.zip(userMono, ordersMono, notifMono)
        .map(t -> new DashboardResponse(t.getT1(), t.getT2(), t.getT3()));

Dynamic N calls: Flux.flatMap + collectList
When the number of IDs is not known at compile time:

    // Results are collected in completion order, not in productIds order
    Mono<List<ProductDetails>> result =
        Flux.fromIterable(productIds)
            .flatMap(id -> productService.getDetails(id)) // concurrent, default concurrency = 256
            .collectList();

To control concurrency (e.g., limit to 5 simultaneous calls):

    Flux.fromIterable(productIds)
        .flatMap(id -> productService.getDetails(id), 5) // second arg = maxConcurrency
        .collectList();

Partial Failure Handling
Apply onErrorResume before the zip, not after. This ensures one failed branch returns a fallback rather than failing the entire zip:

    Mono<UserProfile> userMono = userService.getProfile(userId)
        .timeout(Duration.ofMillis(500))
        .onErrorResume(ex -> Mono.just(UserProfile.empty(userId))); // degraded, not null

    Mono<List<Order>> ordersMono = orderService.getOrders(userId)
        .timeout(Duration.ofMillis(500))
        .onErrorResume(ex -> Mono.just(List.of())); // empty list, not error

    return Mono.zip(userMono, ordersMono)
        .map(t -> new DashboardResponse(t.getT1(), t.getT2()));

If onErrorResume is placed after the zip, the first branch error cancels the remaining branches, and the fallback replaces the whole aggregate, discarding the results of branches that succeeded.
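The "guard each branch, then combine" rule can also be tried without Reactor on the classpath. The sketch below is a plain-JDK analogue built on CompletableFuture, not the Reactor API: orTimeout and exceptionally stand in for .timeout and onErrorResume, and thenCombine stands in for Mono.zip. GuardedFanOut, the Dashboard record, and the fallback values are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Plain-JDK analogue of per-branch timeout + fallback before combining (assumed names throughout).
final class GuardedFanOut {

    record Dashboard(String user, List<String> orders) {}

    /** Applies a timeout and a fallback to ONE branch, before it is combined with others. */
    static <T> CompletableFuture<T> guard(CompletableFuture<T> call, long timeoutMillis, T fallback) {
        return call
            .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS) // fail this branch if it is too slow
            .exceptionally(ex -> fallback);                  // degrade this branch instead of erroring
    }

    /** Combines two already-guarded branches; a failure in either one cannot fail the aggregate. */
    static Dashboard aggregate(CompletableFuture<String> userCall,
                               CompletableFuture<List<String>> ordersCall) {
        CompletableFuture<String> user = guard(userCall, 500, "unknown-user");
        CompletableFuture<List<String>> orders = guard(ordersCall, 500, List.of());
        return user.thenCombine(orders, Dashboard::new).join();
    }

    public static void main(String[] args) {
        CompletableFuture<String> user = CompletableFuture.supplyAsync(() -> "alice");
        CompletableFuture<List<String>> orders =
            CompletableFuture.failedFuture(new IllegalStateException("order service down"));
        // The user branch is kept; the failed orders branch degrades to an empty list.
        System.out.println(aggregate(user, orders));
    }
}
```

Because each branch is guarded before thenCombine, the failed orders call contributes its fallback while the successful user result is preserved, which mirrors placing onErrorResume on each Mono before Mono.zip.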
Examples
- Dashboard endpoint: BFF calls UserService + OrderService + NotificationService in parallel; Angular receives one DashboardResponse JSON object.
- Product listing with enrichment: BFF calls ProductService for IDs, then fans out to EnrichmentService for each ID via Flux.flatMap.
- Checkout summary: BFF calls CartService + PricingService + InventoryService in parallel; partial failure from InventoryService shows items as "availability unknown".
See complete implementation in P3-BFF-Implementation-Patterns — Example 1.
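As one possible shape for the checkout summary case, the merge step can model "availability unknown" explicitly. This is a minimal sketch, not the implementation referenced above: CheckoutSummary, Line, Availability, and the Optional-wrapped stock map are assumed names, with an empty Optional standing for an InventoryService branch that fell back after a failure or timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical merge step for a checkout summary with a degraded inventory branch.
final class CheckoutSummary {

    enum Availability { IN_STOCK, OUT_OF_STOCK, UNKNOWN }

    record Line(String sku, Availability availability) {}

    /**
     * Builds one line per cart SKU.
     * stockBySku is empty when the inventory branch fell back; a SKU missing
     * from the map is also reported as UNKNOWN.
     */
    static List<Line> merge(List<String> cartSkus, Optional<Map<String, Boolean>> stockBySku) {
        List<Line> lines = new ArrayList<>();
        for (String sku : cartSkus) {
            Availability availability = stockBySku
                .map(stock -> stock.get(sku)) // a null lookup result becomes an empty Optional
                .map(inStock -> inStock ? Availability.IN_STOCK : Availability.OUT_OF_STOCK)
                .orElse(Availability.UNKNOWN);
            lines.add(new Line(sku, availability));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> cart = List.of("sku-1", "sku-2");
        System.out.println(merge(cart, Optional.of(Map.of("sku-1", true, "sku-2", false))));
        System.out.println(merge(cart, Optional.empty())); // inventory branch degraded
    }
}
```

Keeping UNKNOWN as a first-class value lets the client render a degraded line item instead of treating the whole checkout response as failed.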
Common Misconceptions
- "Mono.zip is sequential": It subscribes to all inputs simultaneously. Latency = max(all branches), not sum.
- "Error in one branch fails silently": It cancels the whole zip and propagates the error. Guard each branch with onErrorResume first.
- "flatMap always preserves order": flatMap does not preserve order; use concatMap if order matters (at the cost of sequential execution), or flatMapSequential to keep concurrent subscription while emitting in source order.
Why It Matters
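Aggregation eliminates the "chatty client" problem in microservice architectures. Without it, an Angular SPA making 4 separate API calls can incur:
- 4 client-to-server round trips instead of 1 (up to 4× the round-trip latency when the calls run one after another)
- Up to 4 CORS preflight requests, when the calls are cross-origin and not served from the preflight cache
- Waterfall dependencies if one call needs data returned by another
A BFF aggregating those 4 calls into 1 reduces client-perceived latency to roughly one client round trip plus max(service latencies) and hides internal service topology.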
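The latency claim can be checked with simple arithmetic. The sketch below assumes a fixed client round-trip time and hypothetical per-service latencies, and it treats the unaggregated client calls as fully sequential (the worst case); LatencyModel and its method names are illustrative only.

```java
import java.util.List;

// Toy latency model: client calling services one by one vs. a BFF fanning out in parallel.
final class LatencyModel {

    /** Client calls each service itself, one after another: every call pays the client round trip. */
    static long sequentialClientMillis(long clientRttMillis, List<Long> serviceMillis) {
        long total = 0;
        for (long service : serviceMillis) {
            total += clientRttMillis + service;
        }
        return total;
    }

    /** BFF fans out in parallel: one client round trip plus the slowest service call. */
    static long aggregatedMillis(long clientRttMillis, List<Long> serviceMillis) {
        long slowest = 0;
        for (long service : serviceMillis) {
            slowest = Math.max(slowest, service);
        }
        return clientRttMillis + slowest;
    }

    public static void main(String[] args) {
        long clientRtt = 60L;                                 // assumed browser-to-BFF round trip
        List<Long> services = List.of(120L, 80L, 200L, 50L);  // assumed service latencies
        System.out.println(sequentialClientMillis(clientRtt, services)); // 4*60 + 450 = 690
        System.out.println(aggregatedMillis(clientRtt, services));       // 60 + 200 = 260
    }
}
```

The model ignores the BFF-to-service network hop and any merge cost, so the aggregated figure is a lower bound rather than a measurement.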
Related Concepts
| Concept | Relationship |
|---|---|
| BFF-Pattern | Aggregation is the core BFF value proposition |
| Project-Reactor | Mono.zip, Flux.flatMap, collectList are the implementation primitives |
| Spring-Cloud-Gateway | Routes backed by aggregation controllers sit behind SCG |
| Response-Transformation | Aggregated responses are often reshaped before returning |
| Circuit-Breaker-Pattern | Protects individual branches of an aggregation fan-out |
Sources
- projectreactor.io/docs: Mono.zip, Flux.flatMap reference
- P3-BFF-Implementation-Patterns (Phase 3 research, IMPL-01)
- P2-Spring-Boot-BFF-Stack (reactive stack rationale)