Compensating Transactions

"A compensating transaction is an operation that semantically undoes the effects of another operation." — paraphrased from Chris Richardson, Microservices Patterns, 2018

Intent

Compensating transactions are the failure-handling mechanism shared by both Choreography-Saga-Pattern and Orchestration-Saga-Pattern. When a distributed saga cannot complete all forward steps, compensating transactions restore the system to a semantically consistent state by reversing each completed step in reverse order.

Compensations are NOT rollbacks. A database rollback undoes physical changes atomically and invisibly. A compensating transaction is a new, visible operation that semantically reverses a prior operation — but it cannot undo physical side effects already delivered. A sent email cannot be unsent; a payment already transferred to an external processor cannot be silently reversed. These irreversible operations must be designed around, not compensated away.
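The difference from a rollback can be sketched with a small in-memory ledger. Everything here (the LedgerEntry shape, the function names, the account ID) is a hypothetical illustration, not taken from any of the cited sources. The point is only that the compensation appends a new, visible entry and leaves the original in place.

```typescript
// Hypothetical in-memory ledger; entry shape and function names are assumptions.
type LedgerEntry = { id: number; account: string; amountCents: number; reason: string };

const ledger: LedgerEntry[] = [];
let nextId = 1;

function append(account: string, amountCents: number, reason: string): LedgerEntry {
  const entry: LedgerEntry = { id: nextId++, account, amountCents, reason };
  ledger.push(entry);
  return entry;
}

// Forward step: debit the account (stored as a negative amount).
function debit(account: string, amountCents: number): LedgerEntry {
  return append(account, -amountCents, "debit");
}

// Compensation: a NEW offsetting entry. The original debit is never
// deleted or edited, so both operations remain visible in the history.
function compensateDebit(original: LedgerEntry): LedgerEntry {
  return append(original.account, -original.amountCents, `compensates entry ${original.id}`);
}

function balance(account: string): number {
  return ledger
    .filter((e) => e.account === account)
    .reduce((sum, e) => sum + e.amountCents, 0);
}

const charge = debit("acct-42", 1500);
compensateDebit(charge);
console.log(balance("acct-42")); // 0
console.log(ledger.length);      // 2
```

The balance nets to zero, but an observer who read the ledger between the two calls would have seen the debit. That visibility is what separates compensation from an atomic rollback.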

Compensation must execute in LIFO order (the reverse of the forward steps). If a saga has three steps and step 3 fails, compensation runs as: step 2 reversed, then step 1 reversed. Later steps usually build on earlier ones, so undoing step 1 first would leave step 2's effects standing on state that no longer exists. Each compensation action MUST be idempotent (see Idempotent-Consumer) because compensations may run more than once: Temporal retries failed activities automatically by default, and Axon command dispatch can be configured to retry.

When NOT to Use

  • Two-service flows where a simple retry or timeout suffices — two services talking directly is a distributed call, not a saga; the overhead of explicit compensation design exceeds the benefit
  • Operations that are inherently irreversible with no semantic undo (e.g., external notifications already delivered, emails sent, webhook callbacks fired) — design around these by making them the LAST step of the saga so they only execute after all reversible steps have succeeded
  • When all operations are in a single database — use a local ACID transaction instead; distributed compensation is unnecessary overhead when a single database transaction boundary is available
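The "irreversible steps go last" guideline from the list above can be checked mechanically when a saga is defined. The following is a minimal sketch; the StepPlan type, the reversible flag, and the step names are all assumptions made for illustration.

```typescript
// Hypothetical step descriptor: `reversible` marks whether a semantic undo exists.
type StepPlan = { name: string; reversible: boolean };

// Returns the names of irreversible steps that are followed by another step.
// A failure in any later step would leave these with no way to be undone.
function misplacedIrreversibleSteps(plan: StepPlan[]): string[] {
  return plan
    .filter((step, i) => !step.reversible && i < plan.length - 1)
    .map((step) => step.name);
}

const goodPlan: StepPlan[] = [
  { name: "reservePayment", reversible: true },
  { name: "reserveInventory", reversible: true },
  { name: "sendConfirmationEmail", reversible: false }, // last: safe
];

const badPlan: StepPlan[] = [
  { name: "sendConfirmationEmail", reversible: false }, // first: cannot be unsent
  { name: "reservePayment", reversible: true },
  { name: "reserveInventory", reversible: true },
];

console.log(misplacedIrreversibleSteps(goodPlan)); // []
console.log(misplacedIrreversibleSteps(badPlan));  // ["sendConfirmationEmail"]
```

A check like this could run in a unit test for each saga definition. It only catches ordering mistakes and says nothing about whether a step marked reversible really has a sound compensation.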

When to Use

  • Multi-service workflows (3+ services) where partial completion requires coordinated compensation across multiple services
  • Any saga pattern (choreography or orchestration) with forward steps that can fail after making external state changes
  • When forward steps have clear semantic reversals: reserve/release, create/cancel, debit/credit, enqueue/dequeue

How It Works

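The compensation stack pattern: each forward step registers its compensation function BEFORE executing (not after), so that a step which fails or times out after partially taking effect is still covered. If step N fails, every compensation registered so far runs in LIFO order: step N's own compensation first (if it registered one, in which case it must be a safe no-op when the step never took effect), then the compensations for steps N-1 through 1.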

Compensation registration order (using unshift to prepend, so the array is always in LIFO order):

  1. Prepend the compensation for step 1 to the array
  2. Execute step 1
  3. Prepend the compensation for step 2 to the array
  4. Execute step 2
  5. Execute step 3 (the final step, which registers no compensation in this example)
  6. If step 3 fails: iterate the compensation array from the front (already in LIFO order) and execute each
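The registration order above can be sketched as a small framework-free runner. This is an illustrative sketch only: the Step type, runSaga, and demo are hypothetical names, and a durable executor would add persistence and retries that this version does not have.

```typescript
type Compensation = () => Promise<void>;
type Step = { run: () => Promise<void>; compensate?: Compensation };

// Runs steps in order. On failure, runs every registered compensation
// newest-first, then rethrows the original error.
async function runSaga(steps: Step[]): Promise<void> {
  const compensations: Compensation[] = [];
  try {
    for (const step of steps) {
      if (step.compensate) {
        compensations.unshift(step.compensate); // register BEFORE executing
      }
      await step.run();
    }
  } catch (err) {
    for (const compensate of compensations) {
      await compensate(); // array is already in LIFO order
    }
    throw err;
  }
}

// Demo: steps 1 and 2 succeed, step 3 fails and has no compensation.
async function demo(): Promise<string[]> {
  const log: string[] = [];
  const record = (msg: string) => async (): Promise<void> => {
    log.push(msg);
  };
  const steps: Step[] = [
    { run: record("run step 1"), compensate: record("undo step 1") },
    { run: record("run step 2"), compensate: record("undo step 2") },
    { run: async () => { throw new Error("step 3 failed"); } },
  ];
  try {
    await runSaga(steps);
  } catch {
    // expected: the saga rethrows after compensating
  }
  return log;
}

demo().then((trace) => console.log(trace));
// ["run step 1", "run step 2", "undo step 2", "undo step 1"]
```

Prepending with unshift keeps the array in execution-ready order, so the failure path is a plain front-to-back loop. Pushing and iterating in reverse would be equivalent.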

Compensation actions must be idempotent because they may run more than once: Temporal retries failed compensation activities by default, and retries can be configured for Axon command dispatch. A compensation that executes twice must leave the same state as one that executes once. Semantic idempotency (a guarded state transition such as UPDATE ... SET status='available' WHERE status='reserved') is the preferred approach; explicit deduplication via a compensationId is an alternative.
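The compensationId alternative can be sketched as follows. The in-memory Set, the stock counter, and the ID format are assumptions for illustration; a real service would need to persist the processed IDs atomically with the state change.

```typescript
// Hypothetical in-memory store of compensation IDs that were already applied.
const processedCompensations = new Set<string>();
let availableStock = 9; // assume 10 units total, one currently reserved

// Releases one reserved unit at most once per compensationId.
// Returns true if the release was applied, false if it was a duplicate.
function releaseInventory(compensationId: string): boolean {
  if (processedCompensations.has(compensationId)) {
    return false; // retry or duplicate delivery: already applied, do nothing
  }
  processedCompensations.add(compensationId);
  availableStock += 1;
  return true;
}

releaseInventory("order-1001:release-inventory"); // applied
releaseInventory("order-1001:release-inventory"); // ignored retry
console.log(availableStock); // 10
```

Deriving the ID from the saga instance and the step (rather than generating a fresh one per attempt) is what makes retries collapse into a single effect.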

Sequence Diagram

sequenceDiagram
    participant W as Workflow / Orchestrator
    participant S1 as Step 1: Reserve Payment
    participant S2 as Step 2: Reserve Inventory
    participant S3 as Step 3: Ship Order
    participant C2 as Compensate: Release Inventory
    participant C1 as Compensate: Release Payment

    Note over W,C1: Forward execution
    W->>W: Register compensation(releasePayment)
    W->>S1: reservePayment(orderId)
    S1-->>W: SUCCESS

    W->>W: Register compensation(releaseInventory)
    W->>S2: reserveInventory(orderId)
    S2-->>W: SUCCESS

    W->>S3: shipOrder(orderId)
    S3-->>W: FAILURE

    Note over W,C1: Compensation in LIFO order
    rect rgb(255, 230, 230)
        W->>C2: releaseInventory(orderId) [idempotent]
        C2-->>W: reversed step 2
        W->>C1: releasePayment(orderId) [idempotent]
        C1-->>W: reversed step 1
    end

    Note over W: Step 3 registers no compensation:<br/>it is the final step and fails<br/>without taking effect.

Worked Failure Scenario

A 3-step order fulfilment saga: reserve payment, reserve inventory, ship order.

  1. Step 1: reservePayment(orderId) — SUCCESS. Compensation releasePayment registered before execution.
  2. Step 2: reserveInventory(orderId) — SUCCESS. Compensation releaseInventory registered before execution.
  3. Step 3: shipOrder(orderId) — FAILS. No compensation is registered for step 3 (it did not complete; there is nothing to reverse).
  4. Compensation executes in LIFO order:
    • releaseInventory(orderId) — reverses step 2 (the most recent successful step)
    • releasePayment(orderId) — reverses step 1
  5. Final state: inventory reservation released, payment reservation released, order marked as failed.

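Note: shipOrder registers no compensation here because it is the final step and, in this scenario, fails without taking effect. This is not a general rule about failed steps. Because compensations are registered before their step runs, a step that does register one and then fails will have that compensation executed, so it must be a safe no-op when there is nothing to reverse.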
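A related case is a failure in step 2 rather than step 3. With register-before-execute, releaseInventory is already on the stack when reserveInventory fails, so it runs even though nothing was reserved. The sketch below uses a hypothetical in-memory inventory record and simplified synchronous functions to show why the compensation needs a guard.

```typescript
type ReservationStatus = "available" | "reserved";

// Hypothetical inventory state for a single item.
const inventory: { status: ReservationStatus } = { status: "available" };
const trace: string[] = [];

// Forward step: simulated to fail before making any state change.
function reserveInventory(): void {
  throw new Error("inventory service unavailable");
}

// Guarded compensation: only acts if a reservation actually exists.
function releaseInventory(): void {
  if (inventory.status === "reserved") {
    inventory.status = "available";
    trace.push("inventory released");
  } else {
    trace.push("nothing to release");
  }
}

const compensations: Array<() => void> = [];
try {
  compensations.unshift(releaseInventory); // registered before the step runs
  reserveInventory();                      // fails without taking effect
} catch {
  for (const compensate of compensations) {
    compensate();
  }
}

console.log(trace); // ["nothing to release"]
```

The same guard that makes the compensation idempotent also makes it safe to run for a step that never completed.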

TypeScript Example

// Compensating Transactions: worked failure scenario (Temporal TypeScript)
// Scenario: step 1 (reservePayment) succeeds, step 2 (reserveInventory) succeeds,
//           step 3 (shipOrder) FAILS → releaseInventory → releasePayment (LIFO)
// Source: temporal.io/blog/compensating-actions-part-of-a-complete-breakfast-with-sagas
import { CancellationScope, proxyActivities } from '@temporalio/workflow';
import type * as activities from './activities';

const { reservePayment, reserveInventory, shipOrder,
        releasePayment, releaseInventory } = proxyActivities<typeof activities>({
  startToCloseTimeout: '1 minute',
});

type Compensation = () => Promise<void>;

export async function orderFulfillmentWorkflow(orderId: string): Promise<void> {
  const compensations: Compensation[] = [];
  try {
    compensations.unshift(() => releasePayment(orderId));     // register before executing
    await reservePayment(orderId);                            // step 1: SUCCESS

    compensations.unshift(() => releaseInventory(orderId));   // register before executing
    await reserveInventory(orderId);                          // step 2: SUCCESS

    await shipOrder(orderId);                                 // step 3: FAILS
  } catch (err) {
    // Compensate in LIFO order: releaseInventory first, then releasePayment.
    // nonCancellable keeps compensation running even if the workflow is cancelled externally.
    await CancellationScope.nonCancellable(async () => {
      for (const compensate of compensations) {
        await compensate();  // MUST be idempotent: Temporal may retry the activity
      }
    });
    throw err;  // surface the original failure after compensating
  }
}

Java Example

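// Compensating Transactions: Axon Saga failure/compensation path
// Scenario: payment reserved (step 1), inventory reservation FAILS (step 2)
//           → ReleasePaymentCommand sent (compensate step 1 in LIFO order)
// Source: docs.axoniq.io/axon-framework-reference/4.10/sagas/implementation/
// The event and command classes below are application-defined and not shown here.
import org.axonframework.commandhandling.gateway.CommandGateway;
import org.axonframework.modelling.saga.EndSaga;
import org.axonframework.modelling.saga.SagaEventHandler;
import org.axonframework.modelling.saga.SagaLifecycle;
import org.axonframework.modelling.saga.StartSaga;
import org.axonframework.spring.stereotype.Saga;
import org.springframework.beans.factory.annotation.Autowired;

@Saga
public class OrderFulfillmentSaga {
    // transient: the gateway is re-injected on load, not serialized with saga state
    @Autowired private transient CommandGateway commandGateway;

    @StartSaga
    @SagaEventHandler(associationProperty = "orderId")
    public void on(OrderCreatedEvent event) {
        commandGateway.send(new ReservePaymentCommand(event.getOrderId())); // step 1
    }

    @SagaEventHandler(associationProperty = "orderId")
    public void on(PaymentReservedEvent event) {
        commandGateway.send(new ReserveInventoryCommand(event.getOrderId())); // step 2
    }

    @SagaEventHandler(associationProperty = "orderId")
    public void on(InventoryReservationFailedEvent event) {
        // step 2 FAILED: compensate step 1 (LIFO) by releasing the payment reservation
        // IDEMPOTENT: ReleasePaymentCommand handler uses UPDATE...WHERE status='reserved'
        commandGateway.send(new ReleasePaymentCommand(event.getOrderId()));
        // The saga ends here without waiting for confirmation that the release succeeded.
        SagaLifecycle.end();
    }

    @EndSaga
    @SagaEventHandler(associationProperty = "orderId")
    public void on(OrderFulfilledEvent event) { /* success path, no compensation */ }
}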

Non-idempotent compensation anti-pattern: A non-idempotent compensation action corrupts data when it is retried. Examples of the wrong approach: an unconditional INSERT, which writes a second row on retry, or an UPDATE with no guard on the current state, such as an unguarded counter increment, which applies the release twice. The correct approach uses semantic idempotency: UPDATE ... SET status='available' WHERE status='reserved'. If the reservation was already released, this UPDATE matches zero rows and causes no harm. Where no such guard is available, use explicit deduplication via compensationId. See Idempotent-Consumer for deduplication strategies.
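The difference can be shown with an in-memory stand-in for the database row. The Item type and both function names are hypothetical. releaseGuarded mirrors the WHERE status='reserved' guard and returns the number of "rows" it affected.

```typescript
type Item = { status: "reserved" | "available"; availableCount: number };

// Anti-pattern: no guard on current state, so every retry adds another unit.
function releaseUnguarded(item: Item): void {
  item.status = "available";
  item.availableCount += 1;
}

// Guarded transition, mirroring UPDATE ... WHERE status='reserved'.
// Returns the number of rows affected (0 or 1).
function releaseGuarded(item: Item): number {
  if (item.status !== "reserved") {
    return 0; // already released: matches nothing, changes nothing
  }
  item.status = "available";
  item.availableCount += 1;
  return 1;
}

const corrupted: Item = { status: "reserved", availableCount: 4 };
releaseUnguarded(corrupted);
releaseUnguarded(corrupted); // retry
console.log(corrupted.availableCount); // 6: one phantom unit

const correct: Item = { status: "reserved", availableCount: 4 };
releaseGuarded(correct);
releaseGuarded(correct); // retry matches nothing
console.log(correct.availableCount); // 5
```

In a real database the guard and the update would need to be a single atomic statement. A read followed by a separate write reintroduces the race.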

Lineage Backward

  • Idempotent-Consumer — idempotency is a prerequisite for correct compensation; compensation actions are retried by durable executors
  • Choreography-Saga-Pattern — choreography sagas compensate using domain events (each service publishes a compensating event)
  • Orchestration-Saga-Pattern — orchestration sagas compensate using commands dispatched by the saga orchestrator (CommandGateway)

Lineage Forward

Pattern                       Relationship
Choreography-Saga-Pattern     Uses compensating events to reverse completed saga steps
Orchestration-Saga-Pattern    Uses compensating commands dispatched by the orchestrator
Idempotent-Consumer           Prerequisite: all compensation actions must be idempotent
Dead-Letter-Queue             Captures permanently failed compensation attempts for manual inspection
Domain-Events                 Compensating events in choreography sagas are domain events with failure intent
  • Operational-API-Patterns — HTTP idempotency keys are the lineage predecessor to saga compensating transactions; safe retry via idempotency keys is a prerequisite for correct compensation

Sources

  • microservices.io/patterns/data/saga.html
  • temporal.io/blog/compensating-actions-part-of-a-complete-breakfast-with-sagas
  • docs.axoniq.io/axon-framework-reference/4.10/sagas/implementation/
  • Chris Richardson, Microservices Patterns, 2018