Request Aggregation

A BFF pattern where one inbound client request fans out to multiple downstream service calls in parallel, and the results are merged into a single, client-optimised response.


Core Idea

Request aggregation is the defining capability of a BFF: the client sends one request, the BFF calls N downstream services concurrently, and the BFF returns one combined response. Without aggregation, clients must make N separate requests, each incurring round-trip latency and exposing internal service boundaries.

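In the reactive stack, Mono.zip() from Project Reactor is the primary aggregation primitive. It subscribes to all source Monos eagerly, so the calls run concurrently and total latency is approximately that of the slowest call, not the sum of all calls. The zip emits only when every source has emitted a value: if any source errors or completes empty, the remaining sources are cancelled and the result errors or completes empty, respectively.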
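To make the latency claim concrete, here is a minimal sketch that zips three simulated calls and measures the elapsed time. The class name, the fakeCall helper, and the delay values are all hypothetical; Mono.delay stands in for real downstream I/O, and reactor-core is assumed to be on the classpath.

```java
import java.time.Duration;
import reactor.core.publisher.Mono;

public class ZipLatencyDemo {

    // Hypothetical stand-in for a downstream call: emits `value` after `millis` ms.
    static Mono<String> fakeCall(String value, long millis) {
        return Mono.delay(Duration.ofMillis(millis)).thenReturn(value);
    }

    // Zips three simulated calls (300, 200 and 100 ms) and returns the elapsed wall-clock time in ms.
    static long timeZip() {
        long start = System.nanoTime();
        Mono.zip(fakeCall("user", 300), fakeCall("orders", 200), fakeCall("notifications", 100))
            .block(); // blocking is for the demo only; a BFF handler would return the Mono
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Expect a value close to the slowest branch (300 ms), not the 600 ms sum.
        System.out.println("zip took ~" + timeZip() + " ms");
    }
}
```

The first run may be somewhat slower than later runs because Reactor's scheduler threads are created lazily.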


Key Principles

  1. Parallel, not sequential — all downstream calls must be initiated at the same time; sequential calls accumulate latency.
  2. Partial failure is acceptable — if Service B fails but Service A and C succeed, return a partial response rather than a 500. Use onErrorResume on each branch individually before zipping.
  3. Timeout every call independently — each downstream call must have its own .timeout(Duration) so a slow service cannot block the entire aggregation.
  4. Collect before zip — when a downstream service returns a list (Flux), use .collectList() to convert to Mono<List<T>> before combining with Mono.zip.
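Principle 4 can be sketched as follows. This is an illustration under assumptions: the Dashboard record and the hard-coded sources are hypothetical, and records require Java 16 or later.

```java
import java.util.List;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class CollectBeforeZipDemo {

    // Hypothetical response type, for illustration only.
    record Dashboard(String user, List<String> orders) {}

    static Mono<Dashboard> aggregate() {
        Mono<String> userMono = Mono.just("alice");               // single-value source
        Flux<String> ordersFlux = Flux.just("o-1", "o-2", "o-3"); // multi-value source

        // collectList() turns the Flux into a Mono<List<String>> so it can take part in the zip.
        Mono<List<String>> ordersMono = ordersFlux.collectList();

        return Mono.zip(userMono, ordersMono)
                   .map(t -> new Dashboard(t.getT1(), t.getT2()));
    }

    public static void main(String[] args) {
        System.out.println(aggregate().block());
        // prints: Dashboard[user=alice, orders=[o-1, o-2, o-3]]
    }
}
```

A useful side effect is that collectList() emits an empty list when the Flux is empty, so an empty result from that branch does not cause the zip to complete without a value.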

How It Works

Fixed N calls: Mono.zip

Mono<UserProfile>    userMono    = userService.getProfile(userId);
Mono<List<Order>>    ordersMono  = orderService.getOrders(userId);
Mono<Notifications>  notifMono   = notifService.getSummary(userId);
 
Mono<DashboardResponse> result = Mono.zip(userMono, ordersMono, notifMono)
    .map(t -> new DashboardResponse(t.getT1(), t.getT2(), t.getT3()));

Dynamic N calls: Flux.flatMap + collectList

When the number of IDs is not known at compile time:

Mono<List<ProductDetails>> result =
    Flux.fromIterable(productIds)
        .flatMap(id -> productService.getDetails(id))  // concurrent, default concurrency = 256
        .collectList();

To control concurrency (e.g., limit to 5 simultaneous calls):

Mono<List<ProductDetails>> result =
    Flux.fromIterable(productIds)
        .flatMap(id -> productService.getDetails(id), 5)   // second arg = maxConcurrency
        .collectList();
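One way to check that the concurrency argument really bounds the fan-out is to count in-flight calls. The sketch below is hypothetical throughout: getDetails simulates a 50 ms downstream call and the counters exist only for the demonstration.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BoundedFanOutDemo {

    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();

    // Hypothetical downstream call that records how many calls are active at once.
    static Mono<String> getDetails(int id) {
        return Mono.defer(() -> {
            int now = inFlight.incrementAndGet();   // counted when the call is subscribed
            peak.accumulateAndGet(now, Math::max);
            return Mono.delay(Duration.ofMillis(50))
                       .map(tick -> {
                           inFlight.decrementAndGet(); // counted when the call produces its result
                           return "product-" + id;
                       });
        });
    }

    // Fans out over 20 ids with the given limit and returns the highest number of simultaneous calls seen.
    static int peakConcurrency(int maxConcurrency) {
        inFlight.set(0);
        peak.set(0);
        Flux.range(1, 20)
            .flatMap(BoundedFanOutDemo::getDetails, maxConcurrency)
            .collectList()
            .block(); // blocking is for the demo only
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak with limit 5: " + peakConcurrency(5));
    }
}
```

Choosing the limit is a trade-off: a lower value protects the downstream service, while a higher value shortens the total fan-out time.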

Partial Failure Handling

Apply onErrorResume before the zip, not after. This ensures one failed branch returns a fallback rather than failing the entire zip:

Mono<UserProfile> userMono = userService.getProfile(userId)
    .timeout(Duration.ofMillis(500))
    .onErrorResume(ex -> Mono.just(UserProfile.empty(userId)));  // degraded, not null
 
Mono<List<Order>> ordersMono = orderService.getOrders(userId)
    .timeout(Duration.ofMillis(500))
    .onErrorResume(ex -> Mono.just(List.of()));  // empty list, not error
 
return Mono.zip(userMono, ordersMono)
    .map(t -> new DashboardResponse(t.getT1(), t.getT2()));

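If onErrorResume is applied only after the zip, the first branch error cancels the remaining branches and the fallback replaces the entire response, so the results from the successful branches are lost.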
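The difference between the two placements can be seen in a small self-contained sketch. The Dashboard record, the simulated calls, and the fallback values are hypothetical and chosen only to make the outcome visible.

```java
import java.time.Duration;
import java.util.List;
import reactor.core.publisher.Mono;

public class PartialFailureDemo {

    // Hypothetical response type, for illustration only.
    record Dashboard(String user, List<String> orders) {}

    // Simulated healthy branch.
    static Mono<String> userCall() {
        return Mono.just("alice");
    }

    // Simulated failing branch.
    static Mono<List<String>> ordersCall() {
        return Mono.error(new IllegalStateException("order service down"));
    }

    // Each branch is guarded before the zip, so the failure degrades to an empty list.
    static Mono<Dashboard> guardedPerBranch() {
        Mono<String> user = userCall()
            .timeout(Duration.ofMillis(500))
            .onErrorResume(ex -> Mono.just("unknown"));
        Mono<List<String>> orders = ordersCall()
            .timeout(Duration.ofMillis(500))
            .onErrorResume(ex -> Mono.just(List.of()));
        return Mono.zip(user, orders).map(t -> new Dashboard(t.getT1(), t.getT2()));
    }

    // Fallback only after the zip: the user data is discarded along with the failed orders.
    static Mono<Dashboard> guardedAfterZip() {
        return Mono.zip(userCall(), ordersCall())
            .map(t -> new Dashboard(t.getT1(), t.getT2()))
            .onErrorResume(ex -> Mono.just(new Dashboard("unknown", List.of())));
    }

    public static void main(String[] args) {
        System.out.println(guardedPerBranch().block()); // Dashboard[user=alice, orders=[]]
        System.out.println(guardedAfterZip().block());  // Dashboard[user=unknown, orders=[]]
    }
}
```

In a real service it is usually worth logging the exception inside each fallback so that degraded responses remain observable.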


Examples

  • Dashboard endpoint — BFF calls UserService + OrderService + NotificationService in parallel; Angular receives one DashboardResponse JSON object.
  • Product listing with enrichment — BFF calls ProductService for IDs, then fans out to EnrichmentService for each ID via Flux.flatMap.
  • Checkout summary — BFF calls CartService + PricingService + InventoryService in parallel; partial failure from InventoryService shows items as "availability unknown".

See complete implementation in P3-BFF-Implementation-Patterns — Example 1.


Common Misconceptions

  • Mono.zip is sequential: It subscribes to all inputs simultaneously. Latency = max(all branches), not sum.
  • Error in one branch fails silently: It cancels the whole zip and propagates the error. Guard each branch with onErrorResume first.
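  • flatMap always preserves order: flatMap emits results in completion order, not source order. If order matters, use flatMapSequential (runs concurrently, emits in source order) or concatMap (preserves order at the cost of sequential execution).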
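The ordering behaviour of the three operators can be compared with a short sketch. The slowForSmallIds helper is hypothetical: it simulates a call whose latency shrinks as the id grows, so later ids finish first.

```java
import java.time.Duration;
import java.util.List;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class OrderingDemo {

    // Hypothetical call: id 1 takes 300 ms, id 2 takes 200 ms, id 3 takes 100 ms.
    static Mono<Integer> slowForSmallIds(int id) {
        return Mono.delay(Duration.ofMillis(400L - 100L * id)).thenReturn(id);
    }

    // Concurrent; results arrive in completion order.
    static List<Integer> viaFlatMap() {
        return Flux.just(1, 2, 3).flatMap(OrderingDemo::slowForSmallIds).collectList().block();
    }

    // Concurrent; results are buffered and emitted in source order.
    static List<Integer> viaFlatMapSequential() {
        return Flux.just(1, 2, 3).flatMapSequential(OrderingDemo::slowForSmallIds).collectList().block();
    }

    // One call at a time; source order is preserved but latencies add up.
    static List<Integer> viaConcatMap() {
        return Flux.just(1, 2, 3).concatMap(OrderingDemo::slowForSmallIds).collectList().block();
    }

    public static void main(String[] args) {
        System.out.println("flatMap:           " + viaFlatMap());           // typically [3, 2, 1]
        System.out.println("flatMapSequential: " + viaFlatMapSequential()); // [1, 2, 3]
        System.out.println("concatMap:         " + viaConcatMap());         // [1, 2, 3]
    }
}
```

The flatMap result depends on timing, which is exactly why it should not be relied on when the client expects a stable order.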

Why It Matters

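Aggregation eliminates the "chatty client" problem in microservice architectures. Without it, an Angular SPA making 4 separate API calls can incur:

  • Up to 4× round-trip latency to the server when the calls run one after another
  • Up to 4 CORS preflight requests when the calls are cross-origin and not "simple" requests (unless the preflight responses are cached)
  • Waterfall dependencies if one call needs data returned by another

A BFF aggregating those 4 calls into 1 reduces client-perceived latency to roughly one client round trip plus max(service latencies), and it hides the internal service topology.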


Concept                    Relationship
BFF-Pattern                Aggregation is the core BFF value proposition
Project-Reactor            Mono.zip, Flux.flatMap, collectList are the implementation primitives
Spring-Cloud-Gateway       Routes backed by aggregation controllers sit behind SCG
Response-Transformation    Aggregated responses are often reshaped before returning
Circuit-Breaker-Pattern    Protects individual branches of an aggregation fan-out
