P3 — Core BFF Implementation Patterns
Research Question
What are the concrete implementation patterns for a production-grade BFF built on Spring Boot 3.4.x + Spring Cloud Gateway 4.2.x, covering request aggregation, response transformation, protocol translation, error normalisation, timeouts, retries, and circuit breaking?
Why This Matters
Phase 2 established the technology stack and rationale. Phase 3 provides the runnable implementation: the code and configuration a team actually ships. Without these patterns, teams either reinvent the wheel (inconsistent error handling, ad-hoc timeout logic) or under-implement resilience (no circuit breakers, no fail-fast behaviour). These patterns are the difference between a BFF that works in a demo and one that survives production load.
Background
All implementations in this phase assume:
- Java 21 (record types, pattern matching, virtual threads available but not used here — reactive model is preferred)
- Spring Boot 3.4.3
- Spring Cloud 2024.0.1 (Moorgate)
- Spring Cloud Gateway 4.2.x
- Project Reactor (Reactor Core 3.7.x, the line managed by Spring Boot 3.4)
- Resilience4j via spring-cloud-starter-circuitbreaker-reactor-resilience4j
Cross-references: BFF-Pattern · Spring-Cloud-Gateway · Project-Reactor · Reactive-Programming
Findings
IMPL-01: Request Aggregation
See atomic topic note: Request-Aggregation
The BFF calls multiple downstream services in parallel and merges results into one client-optimised response. The key Reactor primitive is Mono.zip(), which subscribes to all upstream Monos simultaneously. Total latency = max(all branch latencies), not sum.
Critical design decisions:
| Decision | Correct Approach | Incorrect Approach |
|---|---|---|
| Parallelism | Mono.zip(a, b, c) — all subscribed at once | a.flatMap(_ -> b).flatMap(_ -> c) — sequential |
| Partial failure | onErrorResume on each branch before zip | Single onErrorResume after zip |
| Per-call timeout | .timeout(Duration) on each branch Mono | Single global timeout on the zip result |
| List upstream | flux.collectList() before zip | Passing a Flux to Mono.zip directly (zip combines Monos, so collect first) |
| Dynamic fan-out | Flux.fromIterable(ids).flatMap(id -> call(id), maxConcurrency) | Loop with block() inside |
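The first three rows of the table can be illustrated without any Reactor dependency. The sketch below uses the JDK's CompletableFuture as a stand-in for Mono.zip, so it shows the shape of the technique rather than the Reactor API itself. The class name, the Dashboard record, and the fallback values are hypothetical.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Illustrative only: CompletableFuture stands in for Mono so the sketch runs on a plain JDK.
final class ParallelAggregationSketch {

    record Dashboard(String user, int orderCount) {}

    // One branch: start the call immediately, bound it with its own timeout,
    // and degrade to a fallback value if it fails or times out.
    static <T> CompletableFuture<T> branch(Supplier<T> call, Duration timeout, T fallback) {
        return CompletableFuture.supplyAsync(call)
                .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback);
    }

    static Dashboard aggregate(Supplier<String> userCall, Supplier<Integer> orderCall) {
        // Both branches are already running before either result is awaited,
        // so the wait is bounded by the slower branch, not by the sum of both.
        CompletableFuture<String> user = branch(userCall, Duration.ofMillis(500), "unknown");
        CompletableFuture<Integer> orders = branch(orderCall, Duration.ofMillis(500), 0);
        return user.thenCombine(orders, Dashboard::new).join();
    }
}
```

Because the fallback is attached to each branch before the results are combined, a failing order call still yields a dashboard with the real user and a zero order count. That is the same reason onErrorResume belongs on each branch rather than after the zip.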
Fan-out with dynamic IDs:
// Controlled concurrency: max 5 simultaneous upstream calls
Mono<List<ProductDetails>> enriched =
Flux.fromIterable(productIds)
.flatMap(id -> productClient.getDetails(id)
.timeout(Duration.ofMillis(500))
.onErrorResume(ex -> Mono.just(ProductDetails.unavailable(id))),
5) // maxConcurrency = 5
.collectList();
IMPL-02: Response Transformation
See atomic topic note: Response-Transformation
BFF transforms downstream responses into client-centric shapes. Three levels of transformation:
- DTO projection: explicit Java record with a from(DownstreamModel) factory method. Cleanest for greenfield.
- Jackson @JsonView: per-client field visibility on a shared response class. Useful for retrofit scenarios.
- ModifyResponseBodyGatewayFilterFactory: applies JSON field transformations in the SCG filter chain, no controller code required.
Field filtering decision matrix:
| Scenario | Recommended approach |
|---|---|
| New BFF controller, custom aggregation | Explicit DTO projection (AngularProductView.from(product)) |
| Single controller serves multiple client types | Jackson @JsonView with view hierarchy |
| SCG-proxied route, no controller, strip fields | ModifyResponseBodyGatewayFilterFactory with custom filter |
| Date/time format normalisation globally | Jackson JavaTimeModule + WRITE_DATES_AS_TIMESTAMPS=false |
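For the field-stripping row, it can help to see the transformation on its own, separate from gateway plumbing. The sketch below is a minimal, dependency-free illustration that treats a parsed JSON document as nested Map and List values and removes the named fields at every depth. The class and method names are hypothetical, and a real filter would work on Jackson's tree model instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a parsed JSON tree is modelled as Map / List / scalar values.
final class FieldStripSketch {

    // Returns a copy of the tree with every key listed in 'fields' removed, at any depth.
    static Object strip(Object node, Set<String> fields) {
        if (node instanceof Map<?, ?> map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : map.entrySet()) {
                String key = String.valueOf(entry.getKey());
                if (!fields.contains(key)) {
                    out.put(key, strip(entry.getValue(), fields));
                }
            }
            return out;
        }
        if (node instanceof List<?> list) {
            List<Object> out = new ArrayList<>();
            for (Object item : list) {
                out.add(strip(item, fields));
            }
            return out;
        }
        return node; // scalars (and null) pass through unchanged
    }
}
```

Whether to strip only top-level fields or recurse into nested objects and arrays is a design choice. Recursing is safer for hiding internal fields, but it removes a same-named field everywhere, so field names need to be chosen with that in mind.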
IMPL-03: Protocol Translation
A BFF may receive HTTP REST from clients but need to call downstream services via gRPC or publish to Kafka.
REST → gRPC:
Using net.devh:grpc-spring-boot-starter with reactive gRPC via reactor-grpc:
@RestController
@RequestMapping("/api/inventory")
public class InventoryController {
// Reactive gRPC stub (generated from .proto with reactor-grpc plugin)
private final ReactorInventoryServiceGrpc.ReactorInventoryServiceStub grpcStub;
public InventoryController(ManagedChannel channel) {
this.grpcStub = ReactorInventoryServiceGrpc.newReactorStub(channel);
}
@GetMapping("/{sku}/stock")
public Mono<StockResponse> getStock(@PathVariable String sku,
@RequestHeader("Authorization") String auth) {
StockRequest request = StockRequest.newBuilder()
.setSku(sku)
.build();
return grpcStub.getStock(request)
.map(grpcResponse -> new StockResponse(
grpcResponse.getSku(),
grpcResponse.getQuantity(),
grpcResponse.getWarehouseId()
))
.timeout(Duration.ofSeconds(2))
.onErrorResume(StatusRuntimeException.class, ex ->
Mono.error(new ServiceUnavailableException("Inventory service unavailable")));
}
public record StockResponse(String sku, int quantity, String warehouseId) {}
}
REST → Kafka (fire-and-forget with correlation ID):
@RestController
@RequestMapping("/api/orders")
public class OrderCommandController {
private final ReactiveKafkaProducerTemplate<String, OrderCreatedEvent> kafkaTemplate;
public OrderCommandController(ReactiveKafkaProducerTemplate<String, OrderCreatedEvent> kafkaTemplate) {
this.kafkaTemplate = kafkaTemplate;
}
@PostMapping
public Mono<ResponseEntity<OrderAcceptedResponse>> placeOrder(
@RequestBody @Validated PlaceOrderRequest request,
@RequestHeader("Authorization") String auth) {
String correlationId = UUID.randomUUID().toString();
OrderCreatedEvent event = new OrderCreatedEvent(
correlationId,
request.userId(),
request.items(),
Instant.now()
);
return kafkaTemplate.send("orders.created", correlationId, event)
.map(result -> ResponseEntity
.accepted()
.body(new OrderAcceptedResponse(correlationId, "Order received")))
.onErrorResume(ex -> Mono.error(
new ServiceUnavailableException("Order queue temporarily unavailable")));
}
public record OrderAcceptedResponse(String correlationId, String message) {}
}
The client receives 202 Accepted with a correlationId it can use to poll for order status.
IMPL-04: Field Filtering / Response Shaping
Interface segregation for API responses prevents the "fat DTO" problem where all clients receive all fields regardless of need.
Projection pattern:
// Full domain model from downstream (20+ fields)
public record Product(
String id, String internalSku, String name, String description,
BigDecimal price, String currencyCode, String categoryId,
Instant createdAt, Instant updatedAt, String createdBy, String updatedBy,
boolean active, int stockCount, String supplierId, String warehouseRegion,
Map<String, String> attributes, List<String> tags, String imageBaseUrl,
String thumbnailUrl, String fullImageUrl
) {}
// Angular client view — 6 fields only
public record AngularProductView(
String id,
String name,
String description,
BigDecimal price,
String currencyCode,
String thumbnailUrl
) {
public static AngularProductView from(Product p) {
return new AngularProductView(
p.id(), p.name(), p.description(),
p.price(), p.currencyCode(), p.thumbnailUrl()
);
}
}
// Mobile client view — 4 fields, price as formatted string
public record MobileProductSummary(
String id,
String name,
String formattedPrice,
String thumbnailUrl
) {
public static MobileProductSummary from(Product p, String locale) {
NumberFormat fmt = NumberFormat.getCurrencyInstance(Locale.forLanguageTag(locale));
fmt.setCurrency(Currency.getInstance(p.currencyCode()));
return new MobileProductSummary(
p.id(), p.name(),
fmt.format(p.price()),
p.thumbnailUrl()
);
}
}
The controller returns the appropriate view based on the client type (determined by X-Client-Type header, route, or OAuth2 claim):
@GetMapping("/products/{id}")
public Mono<AngularProductView> getProduct(@PathVariable String id) {
return productService.findById(id)
.map(AngularProductView::from);
}
IMPL-05: Error Normalisation
Problem: Downstream services return inconsistent error formats:
- Service A: RFC 7807 Problem Details ({"type": "...", "title": "...", "status": 404})
- Service B: Custom JSON ({"error": "NOT_FOUND", "description": "..."})
- Service C: HTML 500 page (from a legacy service)
Solution: BFF normalises all errors into one envelope before sending to the client.
Normalised error envelope:
{
"code": "ORDER_NOT_FOUND",
"message": "Order with ID abc123 was not found",
"traceId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"timestamp": "2026-03-06T14:30:00Z",
"path": "/api/orders/abc123"
}
Implementation: see Example 3 below (GlobalExceptionHandler).
The onErrorMap operator in reactive chains handles normalisation at the call site by translating downstream exceptions into BFF exception types:
Mono<Order> orderMono = orderService.getOrder(orderId)
.onErrorMap(WebClientResponseException.NotFound.class,
ex -> new OrderNotFoundException(orderId))
.onErrorMap(WebClientResponseException.class,
ex -> new DownstreamServiceException("order-service", ex.getStatusCode(), ex))
.onErrorMap(TimeoutException.class,
ex -> new ServiceTimeoutException("order-service"));
The @RestControllerAdvice then catches the normalised exceptions and formats the envelope.
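The mapping from a normalised exception to the envelope can be sketched as a pure function, which also makes it easy to unit test without a web layer. This is a minimal sketch under stated assumptions: the NotFoundException type, the status field on the envelope, and the method names are hypothetical and are not part of any Spring API.

```java
import java.time.Instant;
import java.util.concurrent.TimeoutException;

// Illustrative only: exception-to-envelope mapping as a pure function.
final class ErrorEnvelopeSketch {

    record Envelope(int status, String code, String message,
                    String traceId, String timestamp, String path) {}

    // Hypothetical BFF exception carrying a stable, client-facing error code.
    static final class NotFoundException extends RuntimeException {
        final String code;

        NotFoundException(String code, String message) {
            super(message);
            this.code = code;
        }
    }

    static Envelope normalise(Throwable ex, String traceId, String path, Instant now) {
        if (ex instanceof NotFoundException nf) {
            return new Envelope(404, nf.code, nf.getMessage(), traceId, now.toString(), path);
        }
        if (ex instanceof TimeoutException) {
            return new Envelope(504, "UPSTREAM_TIMEOUT",
                    "The request timed out waiting for a downstream service.",
                    traceId, now.toString(), path);
        }
        // Catch-all: a generic message, so raw downstream details never reach the client.
        return new Envelope(500, "INTERNAL_ERROR", "An unexpected error occurred.",
                traceId, now.toString(), path);
    }
}
```

Passing the clock value in as a parameter keeps the function deterministic, so every field of the envelope can be asserted in a test.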
IMPL-06: Timeout and Retry Strategies
Timeout configuration philosophy:
- Per-route timeout via SCG route metadata (response-timeout, connect-timeout): coarse-grained, applies to the proxied call for that route
- Per-upstream-call timeout via .timeout(Duration) in Reactor: fine-grained, applies to one downstream call
- Use both: the SCG timeout as the outer boundary, and a Reactor timeout on each branch of a Mono.zip
Retry safety rules:
- Retry only safe methods by default: GET, HEAD, OPTIONS
- Do not retry POST or PATCH (may cause duplicate writes). PUT and DELETE are idempotent in HTTP semantics, but retry them only when the downstream implementation is known to honour that
- Exponential backoff: first retry at 10ms, doubling up to a 50ms cap, to reduce thundering-herd load on a recovering service
- Retry on: BAD_GATEWAY (502), SERVICE_UNAVAILABLE (503), GATEWAY_TIMEOUT (504), which indicate infrastructure failures
- Never retry on: BAD_REQUEST (400), UNAUTHORIZED (401), NOT_FOUND (404), which are application-level errors
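The retry rules above can be sketched as two small pure functions, plus a worst-case budget calculation that helps when choosing the outer timeout. This is an illustration rather than SCG's implementation. It assumes the delay before retry n is firstBackoff × factor^n with n starting at 0, capped at maxBackoff, and all names are hypothetical.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: retry eligibility and backoff arithmetic as pure functions.
final class RetryPolicySketch {

    private static final Set<String> RETRYABLE_METHODS = Set.of("GET", "HEAD", "OPTIONS");
    private static final Set<Integer> RETRYABLE_STATUSES = Set.of(502, 503, 504);

    static boolean isRetryable(String method, int status) {
        return RETRYABLE_METHODS.contains(method) && RETRYABLE_STATUSES.contains(status);
    }

    // Assumed schedule: delay before retry n is first * factor^n (n starts at 0), capped at max.
    static List<Duration> backoffSchedule(Duration first, Duration max, int factor, int retries) {
        List<Duration> delays = new ArrayList<>();
        long capMs = max.toMillis();
        long delayMs = Math.min(first.toMillis(), capMs);
        for (int n = 0; n < retries; n++) {
            delays.add(Duration.ofMillis(delayMs));
            delayMs = Math.min(delayMs * factor, capMs); // clamp each step so the value cannot overflow
        }
        return delays;
    }

    // Worst case: every attempt (initial call plus each retry) runs to its timeout,
    // and every backoff delay is waited in full.
    static Duration worstCase(Duration perAttemptTimeout, List<Duration> delays) {
        Duration total = perAttemptTimeout.multipliedBy(delays.size() + 1L);
        for (Duration delay : delays) {
            total = total.plus(delay);
        }
        return total;
    }
}
```

With a 500ms per-attempt timeout and three retries at 10ms, 20ms and 40ms, the worst case is 4 × 500 + 70 = 2070ms. An outer boundary shorter than that would cut the final retries off.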
Configuration: see Example 4 below (application.yml).
IMPL-07: Circuit Breaker with Resilience4j
See atomic topic notes: Circuit-Breaker-Pattern · Resilience4j
SCG circuit breaker filter + Resilience4j configuration flow:
Request
   │
   ▼
[SCG CircuitBreaker Filter] ──── circuit OPEN? ──yes──▶ [Fallback URI Controller]
   │                                                              ▲
   │ circuit CLOSED                                               │
   ▼                                                              │
[Downstream Service]                                              │
   │                                                              │
   │ failure?                                                     │
   ▼                                                              │
[Resilience4j counts failure]                                     │
   │                                                              │
   │ threshold exceeded?                                          │
   ▼                                                              │
[Circuit transitions to OPEN] ────────────────────────────────────┘
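The state transitions in the flow above can be made concrete with a deliberately small state machine. This is not Resilience4j's implementation: Resilience4j evaluates a failure rate over a sliding window, whereas this sketch counts consecutive failures to stay short. The class name, the thresholds, and the injected clock are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative only: a consecutive-failure circuit breaker with CLOSED / OPEN / HALF_OPEN states.
final class CircuitBreakerSketch {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so tests can control time
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    CircuitBreakerSketch(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // After the open interval has elapsed, the next caller is allowed through as a probe.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> downstream, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // fail fast: the downstream is not called at all
        }
        try {
            T result = downstream.get();
            onSuccess();
            return result;
        } catch (RuntimeException ex) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed probe in HALF_OPEN reopens immediately; otherwise open at the threshold.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

The fallback supplier plays the role of the fallback URI controller: once the circuit is open, callers get the degraded response immediately and the failing service gets time to recover.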
Fallback controller pattern:
@RestController
@RequestMapping("/fallback")
public class FallbackController {
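// @RequestMapping (not @GetMapping): the circuit breaker forwards the original HTTP method,
// and the order route also carries non-GET requests.
@RequestMapping("/orders")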
public ResponseEntity<ErrorResponse> ordersFallback(ServerWebExchange exchange) {
String traceId = (String) exchange.getAttributes()
.getOrDefault("X-Correlation-Id", UUID.randomUUID().toString());
return ResponseEntity
.status(HttpStatus.SERVICE_UNAVAILABLE)
.body(new ErrorResponse(
"ORDER_SERVICE_UNAVAILABLE",
"Order service is temporarily unavailable. Please try again shortly.",
traceId,
Instant.now().toString(),
exchange.getRequest().getPath().value()
));
}
}
Code Examples
Example 1: Request Aggregation Service
Complete service calling UserService + OrderService + NotificationService in parallel, with partial failure handling.
package com.example.bff.dashboard;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;
import org.springframework.web.reactive.function.client.WebClientResponseException;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
/**
* BFF aggregation service that fans out to three downstream services in parallel.
* Partial failure is handled by onErrorResume on each branch independently.
* Total latency = max(userLatency, ordersLatency, notifLatency).
*
* Dependencies: spring-boot-starter-webflux, spring-cloud-loadbalancer
*/
@Service
public class DashboardAggregationService {
private static final Logger log = LoggerFactory.getLogger(DashboardAggregationService.class);
private static final Duration BRANCH_TIMEOUT = Duration.ofMillis(500);
private final WebClient userClient;
private final WebClient orderClient;
private final WebClient notificationClient;
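public DashboardAggregationService(WebClient.Builder builder) {
// Assumes the injected builder is a @LoadBalanced WebClient.Builder bean, so logical
// service names are resolved via Spring Cloud LoadBalancer (the successor to Ribbon).
// WebClient takes http:// URLs; the lb:// scheme is specific to SCG route URIs.
// clone() keeps each client's base URL from leaking into the shared builder.
this.userClient = builder.clone().baseUrl("http://user-service").build();
this.orderClient = builder.clone().baseUrl("http://order-service").build();
this.notificationClient = builder.clone().baseUrl("http://notification-service").build();
}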
/**
* Aggregates user profile + recent orders + notification count into one dashboard.
*
* @param userId the authenticated user's ID
* @param bearerToken the downstream OAuth2 token (forwarded from client via TokenRelay)
* @return aggregated DashboardResponse, with empty/default values for failed branches
*/
public Mono<DashboardResponse> getDashboard(String userId, String bearerToken) {
// Branch 1: fetch user profile; degrade to empty profile on failure
Mono<UserProfile> userMono = fetchUserProfile(userId, bearerToken)
.timeout(BRANCH_TIMEOUT)
.onErrorResume(ex -> {
log.warn("User service failed for userId={}, using degraded fallback: {}", userId, ex.getMessage());
return Mono.just(UserProfile.empty(userId));
});
// Branch 2: fetch recent orders; degrade to empty list on failure
Mono<List<Order>> ordersMono = fetchRecentOrders(userId, bearerToken)
.timeout(BRANCH_TIMEOUT)
.onErrorResume(ex -> {
log.warn("Order service failed for userId={}, returning empty orders: {}", userId, ex.getMessage());
return Mono.just(List.of());
});
// Branch 3: fetch notification count; degrade to zero on failure
Mono<Integer> notifMono = fetchNotificationCount(userId, bearerToken)
.timeout(BRANCH_TIMEOUT)
.onErrorResume(ex -> {
log.warn("Notification service failed for userId={}, returning 0: {}", userId, ex.getMessage());
return Mono.just(0);
});
// All three subscribed simultaneously. Mono.zip fails fast if any branch errors
// BEFORE onErrorResume — that's why we apply onErrorResume on each branch above,
// not after the zip.
return Mono.zip(userMono, ordersMono, notifMono)
.map(tuple -> new DashboardResponse(
userId,
tuple.getT1(),
tuple.getT2(),
tuple.getT3(),
Instant.now()
));
}
/**
* Fan-out pattern for a dynamic list of product IDs.
* Uses Flux.flatMap with bounded concurrency (max 5 simultaneous calls).
*/
public Mono<List<ProductDetails>> enrichProducts(List<String> productIds, String bearerToken) {
return Flux.fromIterable(productIds)
.flatMap(id ->
fetchProductDetails(id, bearerToken)
.timeout(BRANCH_TIMEOUT)
.onErrorResume(ex -> {
log.warn("Product details fetch failed for id={}: {}", id, ex.getMessage());
return Mono.just(ProductDetails.unavailable(id));
}),
5 // maxConcurrency: max 5 simultaneous upstream calls
)
.collectList();
}
// --- Private fetch methods ---
private Mono<UserProfile> fetchUserProfile(String userId, String token) {
return userClient.get()
.uri("/users/{id}", userId)
.headers(h -> h.setBearerAuth(token))
// retrieve() already signals WebClientResponseException (carrying status, headers
// and body) for 4xx and 5xx responses, so no custom onStatus handler is needed.
.retrieve()
.bodyToMono(UserProfile.class);
}
private Mono<List<Order>> fetchRecentOrders(String userId, String token) {
return orderClient.get()
.uri(uri -> uri
.path("/orders")
.queryParam("userId", userId)
.queryParam("limit", 10)
.queryParam("sort", "createdAt,desc")
.build())
.headers(h -> h.setBearerAuth(token))
.retrieve()
.bodyToFlux(Order.class)
.collectList();
}
private Mono<Integer> fetchNotificationCount(String userId, String token) {
return notificationClient.get()
.uri("/notifications/count?userId={id}&unreadOnly=true", userId)
.headers(h -> h.setBearerAuth(token))
.retrieve()
.bodyToMono(NotificationCountResponse.class)
.map(NotificationCountResponse::count);
}
private Mono<ProductDetails> fetchProductDetails(String productId, String token) {
return userClient.get() // reuse any client; in production use a dedicated productClient
.uri("/products/{id}", productId)
.headers(h -> h.setBearerAuth(token))
.retrieve()
.bodyToMono(ProductDetails.class);
}
// --- DTOs (in production: separate files in dto/ package) ---
public record UserProfile(String id, String name, String email, String avatarUrl) {
public static UserProfile empty(String id) {
return new UserProfile(id, "Unknown", "", null);
}
}
public record Order(String orderId, String status, double totalAmount, String currencyCode) {}
public record ProductDetails(String id, String name, boolean available) {
public static ProductDetails unavailable(String id) {
return new ProductDetails(id, "Unknown", false);
}
}
private record NotificationCountResponse(int count) {}
public record DashboardResponse(
String userId,
UserProfile user,
List<Order> recentOrders,
int unreadNotifications,
Instant generatedAt
) {}
}
package com.example.bff.dashboard;
import org.springframework.http.ResponseEntity;
import org.springframework.security.core.annotation.AuthenticationPrincipal;
import org.springframework.security.oauth2.jwt.Jwt;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;
/**
* REST controller exposing the dashboard aggregation endpoint.
* Extracts userId and bearer token from the JWT principal (set by Spring Security).
*/
@RestController
@RequestMapping("/api/dashboard")
public class DashboardController {
private final DashboardAggregationService aggregationService;
public DashboardController(DashboardAggregationService aggregationService) {
this.aggregationService = aggregationService;
}
@GetMapping
public Mono<ResponseEntity<DashboardAggregationService.DashboardResponse>> getDashboard(
@AuthenticationPrincipal Jwt jwt) {
String userId = jwt.getSubject();
String token = jwt.getTokenValue();
return aggregationService.getDashboard(userId, token)
.map(ResponseEntity::ok);
}
}
Example 2: Response Transformation Filter
Custom GatewayFilterFactory that strips specified fields from a JSON response body. Registered as a Spring bean; used in route definitions by name.
package com.example.bff.filter;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.springframework.cloud.gateway.filter.GatewayFilter;
import org.springframework.cloud.gateway.filter.factory.AbstractGatewayFilterFactory;
import org.springframework.cloud.gateway.filter.factory.rewrite.ModifyResponseBodyGatewayFilterFactory;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.server.reactive.ServerHttpResponseDecorator;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
/**
* GatewayFilterFactory that removes specified JSON fields from the downstream response body.
*
* Usage in application.yml:
* filters:
* - name: StripResponseFields
* args:
* fields: internalSku,createdBy,updatedBy,auditMetadata
*
* Only applies to responses with Content-Type: application/json.
* Non-JSON responses are passed through unchanged.
*/
@Component
public class StripResponseFieldsGatewayFilterFactory
extends AbstractGatewayFilterFactory<StripResponseFieldsGatewayFilterFactory.Config> {
private final ObjectMapper objectMapper;
public StripResponseFieldsGatewayFilterFactory(ObjectMapper objectMapper) {
super(Config.class);
this.objectMapper = objectMapper;
}
@Override
public List<String> shortcutFieldOrder() {
// Allows short form in YAML: - StripResponseFields=field1,field2
return List.of("fields");
}
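@Override
public GatewayFilter apply(Config config) {
GatewayFilter delegate = (exchange, chain) -> {
ServerHttpResponseDecorator decorator =
new FieldStrippingResponseDecorator(exchange, config.getFieldList(), objectMapper);
return chain.filter(exchange.mutate().response(decorator).build());
};
// The decorator must be installed before NettyWriteResponseFilter (order -1) runs;
// otherwise the write filter holds the undecorated response and writeWith is never called.
return new org.springframework.cloud.gateway.filter.OrderedGatewayFilter(delegate,
org.springframework.cloud.gateway.filter.NettyWriteResponseFilter.WRITE_RESPONSE_FILTER_ORDER - 1);
}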
// --- Response decorator that buffers and rewrites the body ---
private static class FieldStrippingResponseDecorator extends ServerHttpResponseDecorator {
private final ServerWebExchange exchange;
private final List<String> fieldsToStrip;
private final ObjectMapper objectMapper;
FieldStrippingResponseDecorator(ServerWebExchange exchange,
List<String> fieldsToStrip,
ObjectMapper objectMapper) {
super(exchange.getResponse());
this.exchange = exchange;
this.fieldsToStrip = fieldsToStrip;
this.objectMapper = objectMapper;
}
@Override
public Mono<Void> writeWith(org.reactivestreams.Publisher<? extends DataBuffer> body) {
HttpHeaders headers = getDelegate().getHeaders();
MediaType contentType = headers.getContentType();
// Only transform JSON responses
if (contentType == null || !contentType.isCompatibleWith(MediaType.APPLICATION_JSON)) {
return super.writeWith(body);
}
Flux<? extends DataBuffer> flux = Flux.from(body);
Mono<Void> transformedBody = flux
.collectList()
.flatMap(dataBuffers -> {
// Collect all chunks into a single byte array
int totalSize = dataBuffers.stream()
.mapToInt(DataBuffer::readableByteCount).sum();
byte[] bytes = new byte[totalSize];
int offset = 0;
for (DataBuffer buffer : dataBuffers) {
int readable = buffer.readableByteCount();
buffer.read(bytes, offset, readable);
offset += readable;
// Release each buffer to prevent memory leak
org.springframework.core.io.buffer.DataBufferUtils.release(buffer);
}
// Parse → strip fields → re-serialise
byte[] modifiedBytes = stripFields(bytes, fieldsToStrip);
// Update Content-Length to match modified body
getDelegate().getHeaders().setContentLength(modifiedBytes.length);
DataBuffer outputBuffer = exchange.getResponse().bufferFactory()
.wrap(modifiedBytes);
return getDelegate().writeWith(Mono.just(outputBuffer));
});
return transformedBody;
}
private byte[] stripFields(byte[] original, List<String> fields) {
try {
JsonNode root = objectMapper.readTree(original);
if (root instanceof ObjectNode objectNode) {
fields.forEach(objectNode::remove);
}
return objectMapper.writeValueAsBytes(root);
} catch (Exception e) {
// If JSON parsing fails, return original unchanged
return original;
}
}
}
// --- Configuration class ---
public static class Config {
private String fields = ""; // comma-separated field names
public String getFields() { return fields; }
public void setFields(String fields) { this.fields = fields; }
public List<String> getFieldList() {
return Arrays.stream(fields.split(","))
.map(String::trim)
.filter(s -> !s.isEmpty())
.toList();
}
}
}
Usage in application.yml:
spring:
  cloud:
    gateway:
      routes:
        - id: product-service-angular
          uri: lb://product-service
          predicates:
            - Path=/api/products/**
            - Header=X-Client-Type, angular
          filters:
            - StripPrefix=1
            - name: StripResponseFields
              args:
                fields: internalSku,createdBy,updatedBy,auditMetadata,supplierId,warehouseRegion
Example 3: Error Normalisation
Global exception handler that normalises all BFF errors into one consistent envelope. Handles WebClientResponseException, TimeoutException, Resilience4j's CallNotPermittedException (thrown when a circuit is open), and custom BFF exceptions.
package com.example.bff.error;
import java.time.Instant;
/**
* Normalised error envelope returned by the BFF for all error conditions.
* Clients (Angular, mobile) parse only this structure — never raw downstream errors.
*/
public record ErrorResponse(
String code,
String message,
String traceId,
String timestamp,
String path
) {
public static ErrorResponse of(String code, String message, String traceId, String path) {
return new ErrorResponse(code, message, traceId, Instant.now().toString(), path);
}
}
package com.example.bff.error;
import io.github.resilience4j.circuitbreaker.CallNotPermittedException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.reactive.function.client.WebClientResponseException;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;
import java.util.UUID;
import java.util.concurrent.TimeoutException;
/**
* Centralized exception handler for the BFF.
* Translates all exception types into the normalised ErrorResponse envelope.
*
* Applied to all @RestController classes in the BFF application.
*/
@RestControllerAdvice
public class GlobalExceptionHandler {
private static final Logger log = LoggerFactory.getLogger(GlobalExceptionHandler.class);
// --- 404: Downstream resource not found ---
@ExceptionHandler(ResourceNotFoundException.class)
public Mono<ResponseEntity<ErrorResponse>> handleNotFound(
ResourceNotFoundException ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.warn("Resource not found: {} [traceId={}]", ex.getMessage(), traceId);
return Mono.just(ResponseEntity
.status(HttpStatus.NOT_FOUND)
.body(ErrorResponse.of(ex.getCode(), ex.getMessage(), traceId, path(exchange))));
}
// --- 503: Downstream service unavailable (WebClient received a 503) ---
@ExceptionHandler(WebClientResponseException.ServiceUnavailable.class)
public Mono<ResponseEntity<ErrorResponse>> handleServiceUnavailable(
WebClientResponseException.ServiceUnavailable ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.error("Downstream service unavailable [traceId={}]: {}", traceId, ex.getMessage());
return Mono.just(ResponseEntity
.status(HttpStatus.SERVICE_UNAVAILABLE)
.body(ErrorResponse.of(
"DOWNSTREAM_UNAVAILABLE",
"A required service is temporarily unavailable. Please retry in a moment.",
traceId,
path(exchange)
)));
}
// --- 502: Bad Gateway (downstream returned unexpected response) ---
@ExceptionHandler(WebClientResponseException.class)
public Mono<ResponseEntity<ErrorResponse>> handleWebClientError(
WebClientResponseException ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.error("Downstream HTTP error {} [traceId={}]: {}",
ex.getStatusCode(), traceId, ex.getResponseBodyAsString());
// 4xx from downstream = client sent bad data
if (ex.getStatusCode().is4xxClientError()) {
return Mono.just(ResponseEntity
.status(ex.getStatusCode())
.body(ErrorResponse.of(
"CLIENT_ERROR",
"Invalid request: " + ex.getMessage(),
traceId,
path(exchange)
)));
}
// 5xx from downstream = treat as BFF 502
return Mono.just(ResponseEntity
.status(HttpStatus.BAD_GATEWAY)
.body(ErrorResponse.of(
"DOWNSTREAM_ERROR",
"An upstream service returned an unexpected error.",
traceId,
path(exchange)
)));
}
// --- 504: Gateway Timeout ---
@ExceptionHandler(TimeoutException.class)
public Mono<ResponseEntity<ErrorResponse>> handleTimeout(
TimeoutException ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.error("Upstream timeout [traceId={}]: {}", traceId, ex.getMessage());
return Mono.just(ResponseEntity
.status(HttpStatus.GATEWAY_TIMEOUT)
.body(ErrorResponse.of(
"UPSTREAM_TIMEOUT",
"The request timed out waiting for a downstream service.",
traceId,
path(exchange)
)));
}
// --- 503: Circuit breaker open (Resilience4j) ---
@ExceptionHandler(CallNotPermittedException.class)
public Mono<ResponseEntity<ErrorResponse>> handleCircuitBreakerOpen(
CallNotPermittedException ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.warn("Circuit breaker open, call rejected [traceId={}]: {}", traceId, ex.getMessage());
return Mono.just(ResponseEntity
.status(HttpStatus.SERVICE_UNAVAILABLE)
.header("Retry-After", "30")
.body(ErrorResponse.of(
"CIRCUIT_OPEN",
"Service is temporarily unavailable due to repeated failures. Try again shortly.",
traceId,
path(exchange)
)));
}
// --- 400: Validation errors ---
@ExceptionHandler(org.springframework.web.bind.support.WebExchangeBindException.class)
public Mono<ResponseEntity<ErrorResponse>> handleValidation(
org.springframework.web.bind.support.WebExchangeBindException ex,
ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
String detail = ex.getBindingResult().getFieldErrors().stream()
.map(fe -> fe.getField() + ": " + fe.getDefaultMessage())
.findFirst()
.orElse("Validation failed");
return Mono.just(ResponseEntity
.status(HttpStatus.BAD_REQUEST)
.body(ErrorResponse.of("VALIDATION_ERROR", detail, traceId, path(exchange))));
}
// --- 500: Catch-all for unexpected errors ---
@ExceptionHandler(Exception.class)
public Mono<ResponseEntity<ErrorResponse>> handleUnexpected(
Exception ex, ServerWebExchange exchange) {
String traceId = extractTraceId(exchange);
log.error("Unhandled exception [traceId={}]", traceId, ex);
return Mono.just(ResponseEntity
.status(HttpStatus.INTERNAL_SERVER_ERROR)
.body(ErrorResponse.of(
"INTERNAL_ERROR",
"An unexpected error occurred. Please contact support with trace ID: " + traceId,
traceId,
path(exchange)
)));
}
// --- Helpers ---
private String extractTraceId(ServerWebExchange exchange) {
Object stored = exchange.getAttributes().get("X-Correlation-Id");
if (stored instanceof String s && !s.isBlank()) return s;
String fromHeader = exchange.getRequest().getHeaders().getFirst("X-Correlation-Id");
return (fromHeader != null && !fromHeader.isBlank())
? fromHeader
: UUID.randomUUID().toString();
}
private String path(ServerWebExchange exchange) {
return exchange.getRequest().getPath().value();
}
}
package com.example.bff.error;
/**
* BFF-level exception for resources not found in downstream services.
* Thrown in reactive chains via onErrorMap; caught by GlobalExceptionHandler.
*/
public class ResourceNotFoundException extends RuntimeException {
private final String code;
public ResourceNotFoundException(String code, String message) {
super(message);
this.code = code;
}
public String getCode() { return code; }
public static ResourceNotFoundException order(String orderId) {
return new ResourceNotFoundException("ORDER_NOT_FOUND",
"Order with ID " + orderId + " was not found");
}
public static ResourceNotFoundException user(String userId) {
return new ResourceNotFoundException("USER_NOT_FOUND",
"User with ID " + userId + " was not found");
}
}
Using onErrorMap in reactive chains to normalise at the call site:
public Mono<Order> getOrder(String orderId, String token) {
return orderClient.get()
.uri("/orders/{id}", orderId)
.headers(h -> h.setBearerAuth(token))
.retrieve()
.bodyToMono(Order.class)
// Translate downstream 404 → BFF ResourceNotFoundException
.onErrorMap(WebClientResponseException.NotFound.class,
ex -> ResourceNotFoundException.order(orderId))
// Translate timeout → propagate as-is (GlobalExceptionHandler catches TimeoutException)
.timeout(Duration.ofMillis(500));
// GlobalExceptionHandler catches TimeoutException, WebClientResponseException, etc.
}
Example 4: Complete application.yml
Full configuration for a BFF serving /api/users/**, /api/orders/**, /api/products/** with timeout, retry, and circuit breaker for each route.
spring:
  application:
    name: bff-gateway
  cloud:
    gateway:
      # Global CORS: apply to all routes
      globalcors:
        corsConfigurations:
          '[/**]':
            allowedOrigins:
              - "http://localhost:4200"    # Angular dev server
              - "https://app.example.com"  # production Angular SPA
            allowedMethods: [GET, POST, PUT, DELETE, PATCH, OPTIONS]
            allowedHeaders: ["*"]
            allowCredentials: true
            maxAge: 3600
      # Default filters applied to every route unless overridden
      default-filters:
        - DedupeResponseHeader=Access-Control-Allow-Credentials Access-Control-Allow-Origin
        - name: RequestSize
          args:
            maxSize: 5MB
      routes:
        # ── USER SERVICE ─────────────────────────────────────────────────────
        - id: user-service
          uri: lb://user-service
          predicates:
            - Path=/api/users/**
          # Timeout: SCG ships no RequestTimeout filter; per-route timeouts are set
          # as route metadata (values in milliseconds)
          metadata:
            response-timeout: 10000
          filters:
            - StripPrefix=1
            - TokenRelay=
            - name: RequestHeaderSize
              args:
                maxSize: 16KB
            # Retry: only on GET/HEAD, only on gateway-level errors
            - name: Retry
              args:
                retries: 3
                statuses: BAD_GATEWAY,SERVICE_UNAVAILABLE,GATEWAY_TIMEOUT
                methods: GET,HEAD
                backoff:
                  firstBackoff: 10ms
                  maxBackoff: 50ms
                  factor: 2
                  basedOnPreviousValue: false
            # Circuit breaker: thresholds come from the Resilience4j instance configuration
            - name: CircuitBreaker
              args:
                name: userServiceCB
                fallbackUri: forward:/fallback/users
        # ── ORDER SERVICE ─────────────────────────────────────────────────────
        # Write path only: routes are matched in declaration order, so the Method predicate
        # keeps GET/HEAD off this route and lets them reach order-service-reads
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/api/orders/**
            - Method=POST,PUT,PATCH,DELETE
          filters:
            - StripPrefix=1
            - TokenRelay=
            - name: RequestTimeout
              args:
                timeout: 8000
            # Retry is NOT applied here (POST creates orders and is not safe to retry).
            # Reads on /api/orders/** are retried via the separate order-service-reads route.
            - name: CircuitBreaker
              args:
                name: orderServiceCB
                fallbackUri: forward:/fallback/orders
        # Safe read-only order queries: retry allowed
        - id: order-service-reads
          uri: lb://order-service
          predicates:
            - Path=/api/orders/**
            - Method=GET,HEAD
          filters:
            - StripPrefix=1
            - TokenRelay=
            - name: RequestTimeout
              args:
                timeout: 5000
            - name: Retry
              args:
                retries: 2
                statuses: BAD_GATEWAY,SERVICE_UNAVAILABLE
                methods: GET,HEAD
                backoff:
                  firstBackoff: 20ms
                  maxBackoff: 100ms
                  factor: 2
            - name: CircuitBreaker
              args:
                name: orderServiceCB
                fallbackUri: forward:/fallback/orders
        # ── PRODUCT SERVICE ───────────────────────────────────────────────────
        - id: product-service
          uri: lb://product-service
          predicates:
            - Path=/api/products/**
          filters:
            - StripPrefix=1
            - TokenRelay=
            # Strip internal fields before returning to Angular
            - name: StripResponseFields
              args:
                fields: internalSku,createdBy,updatedBy,supplierId,warehouseRegion
            - name: RequestTimeout
              args:
                timeout: 5000
            - name: Retry
              args:
                retries: 3
                statuses: BAD_GATEWAY,SERVICE_UNAVAILABLE,GATEWAY_TIMEOUT
                methods: GET,HEAD
                backoff:
                  firstBackoff: 10ms
                  maxBackoff: 50ms
                  factor: 2
            - name: CircuitBreaker
              args:
                name: productServiceCB
                fallbackUri: forward:/fallback/products
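        # ── DASHBOARD AGGREGATION ─────────────────────────────────────────────
        # Handled by the local DashboardController, so no proxying is needed.
        # Intent: SCG still handles auth and CORS for this path. Caveat: WebFlux normally
        # resolves annotated controllers before gateway routes, so verify that this route
        # (and its TokenRelay filter) is actually applied, and that it cannot proxy to itself.
        - id: dashboard
          uri: http://localhost:${server.port}
          predicates:
            - Path=/api/dashboard/**
          filters:
            - TokenRelay=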
  # OAuth2 Resource Server: validate JWTs
  security:
    oauth2:
      resourceserver:
        jwt:
          issuer-uri: https://idp.example.com
          jwk-set-uri: https://idp.example.com/.well-known/jwks.json
# ── RESILIENCE4J CIRCUIT BREAKERS ─────────────────────────────────────────────
resilience4j:
  circuitbreaker:
    configs:
      # Shared default configuration for all circuit breakers
      default:
        registerHealthIndicator: true
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 10
        minimumNumberOfCalls: 5
        failureRateThreshold: 50
        slowCallRateThreshold: 80
        slowCallDurationThreshold: 3s
        waitDurationInOpenState: 30s
        permittedNumberOfCallsInHalfOpenState: 3
        automaticTransitionFromOpenToHalfOpenEnabled: true
    instances:
      userServiceCB:
        baseConfig: default
        # User service is critical: be more conservative
        failureRateThreshold: 40
        waitDurationInOpenState: 20s
      orderServiceCB:
        baseConfig: default
        # Orders tolerate slightly more failures before opening
        failureRateThreshold: 60
        waitDurationInOpenState: 45s
        permittedNumberOfCallsInHalfOpenState: 5
      productServiceCB:
        baseConfig: default
        # Products are read-heavy; open quickly on slow calls
        slowCallRateThreshold: 60
        slowCallDurationThreshold: 2s
  # Retry configuration (programmatic use in reactive chains)
  retry:
    instances:
      userServiceRetry:
        maxAttempts: 3
        waitDuration: 100ms
        enableExponentialBackoff: true
        exponentialBackoffMultiplier: 2
        retryExceptions:
          - java.util.concurrent.TimeoutException
          - org.springframework.web.reactive.function.client.WebClientResponseException$ServiceUnavailable
        ignoreExceptions:
          - org.springframework.web.reactive.function.client.WebClientResponseException$NotFound
          - org.springframework.web.reactive.function.client.WebClientResponseException$BadRequest
          - com.example.bff.error.ResourceNotFoundException
# ── SERVER ────────────────────────────────────────────────────────────────────
server:
  port: 8080
  # Netty server settings for inbound connections (these do not tune the outbound client pool)
  netty:
    connection-timeout: 2s
    idle-timeout: 30s
# ── WEBFLUX ───────────────────────────────────────────────────────────────────
spring.webflux.base-path: ""
# ── ACTUATOR ──────────────────────────────────────────────────────────────────
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus,gateway,circuitbreakers,metrics
  endpoint:
    health:
      show-details: always
      show-components: always
    gateway:
      enabled: true
  health:
    circuitbreakers:
      enabled: true
    ratelimiters:
      enabled: true
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${ENVIRONMENT:local}
# ── LOGGING ───────────────────────────────────────────────────────────────────
logging:
  level:
    root: INFO
    com.example.bff: DEBUG
    org.springframework.cloud.gateway: INFO # DEBUG for route tracing in dev
    io.github.resilience4j: INFO
    reactor.netty.http.client: INFO # DEBUG shows full request/response
  pattern:
    console: "%d{ISO8601} [%X{traceId:-no-trace}] %-5level %logger{36} - %msg%n"
Contradictions and Open Questions
- gateway-mvc vs gateway for aggregation: Spring Cloud 2024.0.x ships spring-cloud-starter-gateway-mvc (servlet-based). It supports virtual threads (Java 21) for blocking fan-out, which could flatten the reactive learning curve. However, its filter set is not identical to the reactive gateway's (for example, the Redis-backed RequestRateLimiter belongs to the reactive gateway), so filter parity must be checked for the target version. The decision depends on the team's reactive fluency.
- Distributed circuit breaker state: In a Kubernetes deployment with 3+ BFF pods, each pod has independent Resilience4j state. A brief cascade failure may open the circuit on pod 1 but not on pod 2. Whether this matters depends on traffic volume and failure pattern. Redis-backed state is possible but adds operational complexity.
- ModifyResponseBodyGatewayFilterFactory and large payloads: The built-in factory buffers the full response body in memory before transformation. For large product catalogue responses (>1MB), this is a memory concern. Alternatives: accept the memory cost for clarity, or stream through and transform in controller code.
- REST → gRPC protocol translation at scale: The gRPC channel is long-lived; connection management (keepalive, reconnect) must be configured explicitly on ManagedChannel. Phase 4 should cover production gRPC channel lifecycle.
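The per-pod state question above can be illustrated with a deliberately simplified model. The sketch below is not Resilience4j's implementation; PodLocalBreaker and its methods are hypothetical names, and it models only a count-based window with the slidingWindowSize, minimumNumberOfCalls, and failureRateThreshold settings (no half-open state, no slow-call tracking). It shows two breaker instances with identical configuration reaching different states because each sees only its own pod's traffic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified count-based breaker model (hypothetical, for illustration only). */
final class PodLocalBreaker {
    private final int windowSize;              // models slidingWindowSize
    private final int minimumCalls;            // models minimumNumberOfCalls
    private final double failureRateThreshold; // percent, models failureRateThreshold
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private boolean open;

    PodLocalBreaker(int windowSize, int minimumCalls, double failureRateThreshold) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.failureRateThreshold = failureRateThreshold;
    }

    /** Records one call outcome and opens the breaker once the threshold is reached. */
    void record(boolean failed) {
        if (window.size() == windowSize) {
            window.removeFirst(); // evict the oldest outcome
        }
        window.addLast(failed);
        if (window.size() >= minimumCalls && failureRate() >= failureRateThreshold) {
            open = true;
        }
    }

    /** Failure rate in percent over the calls currently in the window. */
    double failureRate() {
        if (window.isEmpty()) {
            return 0.0;
        }
        long failures = window.stream().filter(f -> f).count();
        return 100.0 * failures / window.size();
    }

    boolean isOpen() {
        return open;
    }
}
```

Feeding five failures to one instance and one failure plus four successes to another leaves the first open and the second closed, which is the divergence described for pod 1 and pod 2. Sharing state across pods would mean replacing the in-memory window with an external store, which is where the operational cost comes from.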
Synthesis
Phase 3 reveals that BFF implementation falls into seven distinct patterns, each with a clear Spring primitive:
| Pattern | Primary Primitive | Fallback Strategy |
|---|---|---|
| Request Aggregation | Mono.zip / Flux.flatMap | onErrorResume per branch → empty/default |
| Response Transformation | DTO projection record / @JsonView | Pass-through if transform fails |
| Protocol Translation | WebClient → gRPC stub / Kafka template | 202 Accepted + correlation ID |
| Field Filtering | Explicit DTO / StripResponseFields filter | All-fields pass-through |
| Error Normalisation | GlobalExceptionHandler + onErrorMap | ErrorResponse envelope |
| Timeout & Retry | SCG RequestTimeout/Retry filters + .timeout() | Fast-fail 504 / 503 |
| Circuit Breaker | Resilience4j CircuitBreaker filter + ReactiveCircuitBreakerFactory | Fallback controller |
The most critical implementation rule: apply onErrorResume on each branch of a Mono.zip before combining. An unguarded branch that errors will cancel all branches and return a 500, destroying the partial-failure resilience the BFF is designed to provide.
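A dependency-free way to see why the guard must sit on each branch is sketched below. It is a JDK-only analogue, not Reactor code: CompletableFuture.thenCombine stands in for Mono.zip, exceptionally for onErrorResume, and orTimeout for the per-branch timeout. BranchGuardSketch, Dashboard, guarded, and aggregate are hypothetical names, and CompletableFuture does not cancel sibling branches the way Mono.zip does, so only the fallback-before-combine ordering is modelled.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class BranchGuardSketch {
    /** Hypothetical aggregated view returned to the client. */
    record Dashboard(String user, List<String> orders) {}

    /** Guards one branch: bound its latency, then replace any failure with a default. */
    static <T> CompletableFuture<T> guarded(CompletableFuture<T> branch, long timeoutMs, T fallback) {
        return branch
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback);
    }

    /** Combines the branches only after each one has its own timeout and fallback. */
    static CompletableFuture<Dashboard> aggregate(CompletableFuture<String> user,
                                                  CompletableFuture<List<String>> orders) {
        return guarded(user, 500, "unknown")
                .thenCombine(guarded(orders, 500, List.<String>of()), Dashboard::new);
    }
}
```

With a healthy user branch and a failed orders branch, aggregate still completes with the user data and an empty order list. Combining the raw futures directly would complete exceptionally instead, which is the behaviour the rule warns against.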
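The second critical rule: never automatically retry non-idempotent methods such as POST (and, in most APIs, PATCH). PUT and DELETE are idempotent by HTTP definition, but retry them only if the downstream service actually implements them that way. The SCG Retry filter defaults to GET; when its methods argument is set explicitly, list only GET,HEAD unless a write is known to be safe, so that retries cannot duplicate writes in downstream services.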
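The retry decision can be sketched in plain Java to make the rule concrete. This is an illustration, not the SCG Retry filter's code: RetryPolicySketch, shouldRetry, and backoffMillis are hypothetical names. The backoff follows the documented firstBackoff * factor^n formula with a cap at maxBackoff, and treating n as zero-based is an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Set;

final class RetryPolicySketch {
    // Mirrors "methods: GET,HEAD" and the gateway-level statuses used in the routes
    private static final Set<String> RETRYABLE_METHODS = Set.of("GET", "HEAD");
    private static final Set<Integer> RETRYABLE_STATUSES = Set.of(502, 503, 504);

    /** True only for safe methods, gateway-level errors, and while retries remain. */
    static boolean shouldRetry(String method, int status, int retriesSoFar, int maxRetries) {
        return retriesSoFar < maxRetries
                && RETRYABLE_METHODS.contains(method.toUpperCase(Locale.ROOT))
                && RETRYABLE_STATUSES.contains(status);
    }

    /** Capped exponential backoff: first * factor^n, never above max (n assumed zero-based). */
    static long backoffMillis(long firstMs, long maxMs, int factor, int n) {
        double raw = firstMs * Math.pow(factor, n);
        return (long) Math.min(raw, maxMs);
    }
}
```

Under these assumptions, a 503 on GET is retried while the same status on POST is not, and a 10ms first backoff with factor 2 and a 50ms cap grows as 10, 20, 40 and then stays at 50.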
Action Items
- Create Token-Relay-Pattern topic note (Phase 4)
- Create Rate-Limiting-BFF topic note (Phase 4): Redis-backed RequestRateLimiter
- Write integration tests using WebTestClient + WireMock for aggregation service
- Benchmark Mono.zip fan-out vs sequential under 500 concurrent connections
- Evaluate gateway-mvc + virtual threads for teams with no reactive experience
Sources
| Source | Type | Quality | Notes |
|---|---|---|---|
| Spring Cloud Gateway reference docs (4.2.x) | Official | HIGH | Filter factory APIs, YAML schema |
| Resilience4j documentation (resilience4j.readme.io) | Official | HIGH | Circuit breaker, retry configuration |
| Spring Cloud CircuitBreaker reference | Official | HIGH | ReactiveCircuitBreakerFactory, factory auto-config |
| Project Reactor reference (projectreactor.io) | Official | HIGH | Mono.zip, Flux.flatMap, error operators |
| Spring Boot 3.4.x reference docs | Official | HIGH | @RestControllerAdvice, Jackson config |
| Baeldung: Spring Cloud Gateway Modify Response Body | Tutorial | MEDIUM | ModifyResponseBodyGatewayFilterFactory patterns |
| resilience4j/resilience4j GitHub issues | Community | MEDIUM | Edge cases: half-open state, distributed state |
| Martin Fowler: CircuitBreaker (martinfowler.com) | Conceptual | HIGH | Canonical pattern description |
Links
Related Topics: BFF-Pattern · Spring-Cloud-Gateway · Project-Reactor · Reactive-Programming · API-Gateway-Pattern
New Topic Notes: Request-Aggregation · Response-Transformation · Circuit-Breaker-Pattern · Resilience4j
Previous Phase: P2-Spring-Boot-BFF-Stack
MOC: BFF Architecture MOC