Does Kafka Guarantee Message Ordering?

Yes — Kafka guarantees message ordering within a partition. Ordering across multiple partitions is not guaranteed by default.


1. The Problem: Why Message Ordering Matters in Distributed Systems

In a monolithic application, operations execute sequentially within a single JVM — order is implicit. In a distributed, event-driven architecture built on Apache Kafka, that implicit contract disappears. Messages are produced by independent services, transported across a network, and consumed by multiple concurrent threads. Without intentional design, the sequence in which events were produced is not necessarily the sequence in which they are consumed.

Consider a real-world e-commerce order lifecycle:

Event 1 → ORDER_PLACED
Event 2 → PAYMENT_CONFIRMED
Event 3 → ORDER_SHIPPED

If a downstream service consumes ORDER_SHIPPED before PAYMENT_CONFIRMED, it dispatches an order that has not been paid for. If ORDER_PLACED arrives last, the system may attempt to update a record that does not yet exist in its local database. These are not edge cases — they are predictable failure modes in any system that does not enforce ordering at the infrastructure level.

Ordering is critical whenever events have a causal dependency — that is, when the correctness of processing Event N depends on Event N−1 having already been processed. Common examples in production systems include financial transaction ledgers, inventory state machines, user session pipelines, and audit logs.


2. Kafka Core Concepts

Before examining ordering guarantees, it is essential to understand the internal architecture that makes them possible.

What is Apache Kafka?

Apache Kafka is a distributed, fault-tolerant, append-only log system designed for high-throughput, low-latency event streaming. Originally developed at LinkedIn, it is now an Apache Software Foundation project used widely as the backbone of event-driven architectures.

Unlike traditional message queues where messages are deleted upon consumption, Kafka retains messages for a configurable period regardless of whether they have been consumed. This enables multiple consumers to independently replay or process the same stream of events.

Architectural Components

Component            Role in the System
-------------------  ------------------------------------------------------------------------------------------------------
Producer             The application that publishes messages to a Kafka topic
Topic                A logical stream of related events, identified by name (e.g., order-events)
Partition            A subdivision of a topic — each partition is an independent, ordered, append-only log
Broker               A Kafka server responsible for storing partitions and serving producers and consumers
Consumer             The application that reads and processes messages from a topic
Consumer Group       A set of consumer instances that collectively read a topic, with each partition assigned to exactly one member
Message Key          An optional byte array attached to a message; used to deterministically assign messages to a partition
Offset               An immutable, monotonically increasing integer that uniquely identifies each message within a partition
Replication Factor   The number of brokers that store a copy of each partition for fault tolerance

What is a Partition, Really?

A partition is the fundamental unit of parallelism and ordering in Kafka. Internally, each partition is stored as a sequence of log segment files on disk. Every message appended to a partition receives the next available offset — this process is atomic and sequential.

Partition 0 (on disk — append-only log):
  [Offset 0] ORDER_PLACED      for user-42  ← written first
  [Offset 1] PAYMENT_CONFIRMED for user-42  ← written second
  [Offset 2] ORDER_SHIPPED     for user-42  ← written third

Because appends are sequential and offsets are immutable, a consumer that reads Partition 0 from Offset 0 will always receive messages in exactly the order they were written. This is Kafka’s core ordering guarantee — and it is absolute within a single partition.
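The append-only behaviour described above can be modelled in a few lines of plain Java. This is a toy sketch (class and method names are illustrative, not Kafka's storage code): each append atomically assigns the next offset, and a read from offset 0 replays records in exactly the order they were written.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single Kafka partition: an append-only, offset-indexed log. */
public class PartitionLog {

    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the offset it was assigned (sequential, immutable). */
    public synchronized long append(String record) {
        records.add(record);
        return records.size() - 1;   // next available offset
    }

    /** Reads all records from the given offset onwards, in append order. */
    public synchronized List<String> readFrom(long offset) {
        return new ArrayList<>(records.subList((int) offset, records.size()));
    }

    public static void main(String[] args) {
        PartitionLog partition0 = new PartitionLog();
        partition0.append("ORDER_PLACED");       // offset 0
        partition0.append("PAYMENT_CONFIRMED");  // offset 1
        partition0.append("ORDER_SHIPPED");      // offset 2

        // A consumer replaying from offset 0 always sees write order:
        System.out.println(partition0.readFrom(0));
    }
}
```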

Why Multiple Partitions Break Global Ordering

A topic with multiple partitions distributes its messages across those partitions, potentially across different brokers on different machines. Each partition operates independently — there is no global clock or coordination mechanism between them.

Topic: order-events (3 partitions across 3 brokers)

Partition 0 (Broker A): [Offset 0] ORDER_PLACED      for user-42
Partition 1 (Broker B): [Offset 0] PAYMENT_CONFIRMED for user-42
Partition 2 (Broker C): [Offset 0] ORDER_SHIPPED     for user-42

When a consumer reads all three partitions, it receives one message from each partition in a non-deterministic interleaving. Broker B may respond faster than Broker A due to lower disk I/O or network latency. The consumer has no way to reconstruct the original sequence without additional coordination.

The foundational rule: Kafka guarantees ordering within a partition. It makes no guarantees about ordering across partitions.


3. How Kafka Enforces Ordering — The Partition Key Mechanism

The Role of the Message Key

The message key is the primary mechanism for enforcing ordering across a distributed topic. When a producer sends a message with a key, Kafka applies the default Murmur2 hash function to determine the target partition:

Target Partition = toPositive(murmur2(keyBytes)) % numberOfPartitions

Because this is a pure, deterministic function, the same key will always hash to the same partition — regardless of which producer instance sends it, which broker is the leader, or how many consumer instances are running.

Hash("user-42") % 3 = Partition 0  ← always, deterministically
Hash("user-99") % 3 = Partition 2  ← always, deterministically

This means all events for user-42 — across their entire lifecycle — are written sequentially to Partition 0 and consumed in that exact order.
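The routing property can be demonstrated with a small sketch. Note the hedges: Kafka's real partitioner uses murmur2 with the sign bit masked off (toPositive), whereas this sketch substitutes String.hashCode() purely to show determinism; the class and method names are illustrative.

```java
/**
 * Sketch of deterministic key-based partition routing.
 * String.hashCode() stands in for Kafka's murmur2; the determinism
 * argument is identical for any fixed hash function.
 */
public class KeyRouting {

    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit (as Kafka's toPositive does) rather than using
        // Math.abs, which fails for Integer.MIN_VALUE.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always lands on the same partition, no matter
        // which producer instance computes it:
        System.out.println("user-42 -> partition " + partitionFor("user-42", 3));
        System.out.println("user-99 -> partition " + partitionFor("user-99", 3));
    }
}
```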

What Happens Internally When a Key is Present

Producer sends three events for user-42:

Step 1: Serialise message → compute murmur2("user-42") % 3 = 0
Step 2: Route to Partition 0 leader on Broker A
Step 3: Broker A appends to log at Offset 0, 1, 2 — sequentially
Step 4: Replication occurs to follower brokers
Step 5: Producer receives acknowledgment (based on acks config)
Step 6: Consumer reads Partition 0, receives Offset 0 → 1 → 2 in order

The ordering contract is maintained at every step through the combination of deterministic routing and append-only writes.

What Happens Internally Without a Key

When no key is provided, Kafka’s default partitioner distributes messages using a round-robin or sticky partitioning strategy (sticky partitioning was introduced in Kafka 2.4 to improve batching efficiency). In both strategies, consecutive messages for the same logical entity can land in different partitions:

Message 1 (ORDER_PLACED)      → Partition 0 (Broker A)
Message 2 (PAYMENT_CONFIRMED) → Partition 1 (Broker B)
Message 3 (ORDER_SHIPPED)     → Partition 2 (Broker C)

Since consumers read partitions independently and Broker B may respond faster, the consumer receives PAYMENT_CONFIRMED before ORDER_PLACED. The causal sequence is broken with no error, no exception, and no warning.
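A minimal sketch of what keyless sends degenerate to (a toy round-robin router, not Kafka's partitioner code): the key is ignored, so three consecutive events for one user are spread across three partitions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy round-robin partitioner: the effective behaviour of keyless sends. */
public class RoundRobinDemo {

    private int counter = 0;
    private final int numPartitions;

    RoundRobinDemo(int numPartitions) { this.numPartitions = numPartitions; }

    int nextPartition() {
        return counter++ % numPartitions;   // entity identity plays no role
    }

    public static void main(String[] args) {
        RoundRobinDemo router = new RoundRobinDemo(3);
        List<Integer> partitions = new ArrayList<>();
        for (String status : new String[]{"ORDER_PLACED", "PAYMENT_CONFIRMED", "ORDER_SHIPPED"}) {
            partitions.add(router.nextPartition());
        }
        // Three events for the same user, three different partitions:
        System.out.println(partitions);
    }
}
```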


4. The Spring Boot Implementation

Project Domain

We will build an order lifecycle event system. Two Spring Boot services communicate through a Kafka topic:

  • OrderProducerService — publishes OrderEvent messages representing state transitions in an order’s lifecycle
  • OrderConsumerService — reads and processes those events downstream

Domain Model

// OrderEvent.java
package com.example.kafka.model;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class OrderEvent {

    private String orderId;
    private String userId;
    private String status;      // ORDER_PLACED | PAYMENT_CONFIRMED | ORDER_SHIPPED
    private long   timestamp;

    @Override
    public String toString() {
        return String.format("[orderId=%-8s | userId=%-8s | status=%-20s | ts=%d]",
                orderId, userId, status, timestamp);
    }
}

userId is the field we will use as the Kafka partition key — ensuring all events for the same user are routed to the same partition and processed in sequence.


Maven Dependencies (pom.xml)

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
    </dependency>

    <!-- Spring Kafka: auto-configuration, KafkaTemplate, @KafkaListener -->
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka</artifactId>
    </dependency>

    <!-- Jackson: JSON serialisation for OrderEvent -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>

Application Configuration (application.yml)

Every configuration property below has a direct bearing on ordering, reliability, or both. Each is explained inline.

spring:
  kafka:
    bootstrap-servers: localhost:9092

    # ─── PRODUCER CONFIGURATION ─────────────────────────────────────────────────
    producer:
      key-serializer:   org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer

      properties:
        # IDEMPOTENCE: Kafka assigns each producer a unique Producer ID (PID).
        # Every message carries a per-partition sequence number.
        # If a retry delivers a duplicate, the broker detects it via PID + SeqNo
        # and discards it — preventing both duplicates and out-of-order writes.
        enable.idempotence: true

        # ACKNOWLEDGMENT LEVEL: "all" (equivalent to -1) requires the partition
        # leader AND all in-sync replicas (ISR) to persist the message before
        # returning acknowledgment. This prevents data loss on broker failure
        # and ensures replay-consistent ordering after a failover.
        acks: all

        # IN-FLIGHT REQUESTS: The number of unacknowledged requests the producer
        # can have open simultaneously per broker connection.
        # With idempotence enabled, Kafka safely supports up to 5 in-flight
        # requests while preserving ordering. Without idempotence, this must
        # be set to 1 to avoid reordering on retry.
        max.in.flight.requests.per.connection: 5

        # RETRY POLICY: Retry indefinitely for transient broker or network errors.
        # Combined with idempotence, retries are safe and order-preserving.
        retries: 2147483647
        retry.backoff.ms: 100
        delivery.timeout.ms: 120000   # Upper bound on total time to report success or failure for a send, including retries: 2 minutes

    # ─── CONSUMER CONFIGURATION ─────────────────────────────────────────────────
    consumer:
      group-id: order-processing-group
      auto-offset-reset: earliest     # On first start, read from the beginning of the topic
      enable-auto-commit: false        # Disable auto-commit; offsets committed manually after processing

      key-deserializer:   org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer

      properties:
        spring.json.trusted.packages: "com.example.kafka.model"
        # POLL BATCH SIZE: Limit records fetched per poll. A lower value reduces
        # the window within which out-of-order processing can occur in multi-threaded
        # consumers. Tune based on processing latency per record.
        max.poll.records: 50
        fetch.min.bytes: 1
        fetch.max.wait.ms: 500

    # ─── LISTENER CONFIGURATION ─────────────────────────────────────────────────
    listener:
      # CONCURRENCY: Number of concurrent consumer threads.
      # Each thread is exclusively assigned to one partition.
      # Setting concurrency = number of partitions ensures one thread per partition,
      # which is the optimal configuration for maintaining per-partition ordering.
      concurrency: 3

      # ACK MODE: MANUAL_IMMEDIATE commits the offset synchronously immediately
      # after acknowledgment.acknowledge() is called. This ensures the offset
      # is only advanced after the message has been successfully processed —
      # preventing silent message loss on processing failure.
      ack-mode: MANUAL_IMMEDIATE

# ─── APPLICATION-LEVEL TOPIC CONFIGURATION ──────────────────────────────────────
app:
  kafka:
    topic:
      order-events: order-events
      partitions: 3
      replication-factor: 1    # Set to min(3, broker count) in production

Topic Provisioning

// KafkaTopicConfig.java
package com.example.kafka.config;

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class KafkaTopicConfig {

    @Value("${app.kafka.topic.order-events}")
    private String topicName;

    @Value("${app.kafka.topic.partitions}")
    private int partitions;

    @Value("${app.kafka.topic.replication-factor}")
    private int replicationFactor;

    /**
     * Spring Boot's KafkaAdmin bean (auto-configured) detects NewTopic beans
     * and creates the topic on the broker at application startup if it does
     * not already exist. No manual kafka-topics.sh command is required.
     *
     * IMPORTANT: Never alter the partition count of this topic after deployment.
     * Doing so changes the hash-to-partition mapping for all existing keys,
     * breaking the ordering guarantee for in-flight and future messages.
     */
    @Bean
    public NewTopic orderEventsTopic() {
        return TopicBuilder
                .name(topicName)
                .partitions(partitions)
                .replicas(replicationFactor)
                .build();
    }
}

Programmatic Producer Configuration

The application.yml approach is sufficient for most applications. For teams that prefer to centralise all Kafka configuration in Java (e.g., for easier testing or environment-specific overrides), the equivalent programmatic configuration is:

// KafkaProducerConfig.java
package com.example.kafka.config;

import com.example.kafka.model.OrderEvent;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.support.serializer.JsonSerializer;

import java.util.HashMap;
import java.util.Map;

@Configuration
public class KafkaProducerConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Bean
    public ProducerFactory<String, OrderEvent> producerFactory() {
        Map<String, Object> config = new HashMap<>();

        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,      bootstrapServers);
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,   StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);

        // Idempotent delivery: guarantees exactly-once writes per partition,
        // preserving order even in the presence of producer retries.
        config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG,                      true);
        config.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION,          5);
        config.put(ProducerConfig.ACKS_CONFIG,                                    "all");
        config.put(ProducerConfig.RETRIES_CONFIG,                                 Integer.MAX_VALUE);
        config.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG,                        100);

        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, OrderEvent> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}

5. Scenario A — Producing Without a Key (Ordering Violated)

This scenario demonstrates the consequence of omitting the partition key. It is a common oversight in initial implementations and a frequent source of subtle, environment-dependent bugs in production.

// OrderProducerService.java  — illustrating the incorrect approach
package com.example.kafka.producer;

import com.example.kafka.model.OrderEvent;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.stereotype.Service;

import java.util.concurrent.CompletableFuture;

@Slf4j
@Service
@RequiredArgsConstructor
public class OrderProducerService {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    @Value("${app.kafka.topic.order-events}")
    private String topic;

    /**
     * INCORRECT: No partition key is specified.
     * Kafka applies its default partitioning strategy (round-robin or sticky),
     * distributing consecutive messages for the same user across different partitions.
     * Since partitions are consumed independently, the consumer receives these
     * messages in an undefined order — determined by broker latency, not
     * business logic.
     */
    public void sendWithoutKey(OrderEvent event) {
        CompletableFuture<SendResult<String, OrderEvent>> future =
                kafkaTemplate.send(topic, event); // no key — third argument omitted

        future.whenComplete((result, ex) -> {
            if (ex == null) {
                log.warn("[NO KEY] Partition: {} | Offset: {} | Status: {}",
                        result.getRecordMetadata().partition(),
                        result.getRecordMetadata().offset(),
                        event.getStatus());
            }
        });
    }
}

Observed behaviour in logs:

# Producer publishes three events for user-42, sequentially:
[NO KEY] Partition: 0 | Offset: 4 | Status: ORDER_PLACED
[NO KEY] Partition: 1 | Offset: 2 | Status: PAYMENT_CONFIRMED
[NO KEY] Partition: 2 | Offset: 6 | Status: ORDER_SHIPPED

# Broker B (Partition 1) responds with lower latency.
# Broker C (Partition 2) responds next.
# Broker A (Partition 0) responds last.

# Consumer receives:
[CONSUMER] Partition 2 | ORDER_SHIPPED     ← processed first  ❌
[CONSUMER] Partition 1 | PAYMENT_CONFIRMED ← processed second ❌
[CONSUMER] Partition 0 | ORDER_PLACED      ← processed last   ❌

The order in which the consumer processes these events is governed entirely by broker-level timing — not by the application’s intent. This failure mode is silent: no exception is thrown, no alert fires, and the system appears to function correctly until a downstream service acts on the out-of-order data.


6. Scenario B — Producing With a Key (Ordering Enforced)

// OrderProducerService.java  — the correct, production-grade approach
package com.example.kafka.producer;

import com.example.kafka.model.OrderEvent;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.stereotype.Service;

import java.util.concurrent.CompletableFuture;

@Slf4j
@Service
@RequiredArgsConstructor
public class OrderProducerService {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    @Value("${app.kafka.topic.order-events}")
    private String topic;

    /**
     * CORRECT: userId is used as the partition key.
     *
     * Kafka applies murmur2(userId) % numPartitions to determine the target
     * partition. Because this is a deterministic function, every event for
     * the same user will always be routed to the same partition, regardless
     * of which producer instance or broker is involved.
     *
     * The broker appends messages to that partition sequentially.
     * The consumer reads that partition sequentially.
     * Per-user ordering is therefore guaranteed end-to-end.
     */
    public void sendOrderEvent(OrderEvent event) {
        String partitionKey = event.getUserId();

        CompletableFuture<SendResult<String, OrderEvent>> future =
                kafkaTemplate.send(topic, partitionKey, event);

        future.whenComplete((result, ex) -> {
            if (ex == null) {
                log.info("[SENT] key={} | Partition: {} | Offset: {} | Status: {}",
                        partitionKey,
                        result.getRecordMetadata().partition(),
                        result.getRecordMetadata().offset(),
                        event.getStatus());
            } else {
                log.error("[FAILED] key={} | Status: {} | Reason: {}",
                        partitionKey, event.getStatus(), ex.getMessage());
            }
        });
    }
}

REST Controller to Simulate Concurrent Order Pipelines

// OrderController.java
package com.example.kafka.controller;

import com.example.kafka.model.OrderEvent;
import com.example.kafka.producer.OrderProducerService;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/orders")
@RequiredArgsConstructor
public class OrderController {

    private final OrderProducerService producerService;

    /**
     * POST /orders/simulate
     *
     * Publishes a complete, interleaved order lifecycle for two users.
     * The two pipelines (user-42 and user-99) run concurrently in production —
     * this simulation interleaves their events to demonstrate that Kafka routes
     * each user's events to a distinct partition, preserving per-user ordering
     * while processing both pipelines in parallel.
     */
    @PostMapping("/simulate")
    public String simulateOrders() {

        // User A — events published in causal order
        producerService.sendOrderEvent(
            new OrderEvent("ORD-001", "user-42", "ORDER_PLACED",      System.nanoTime()));
        producerService.sendOrderEvent(
            new OrderEvent("ORD-001", "user-42", "PAYMENT_CONFIRMED", System.nanoTime()));
        producerService.sendOrderEvent(
            new OrderEvent("ORD-001", "user-42", "ORDER_SHIPPED",     System.nanoTime()));

        // User B — events interleaved with User A
        producerService.sendOrderEvent(
            new OrderEvent("ORD-002", "user-99", "ORDER_PLACED",      System.nanoTime()));
        producerService.sendOrderEvent(
            new OrderEvent("ORD-002", "user-99", "PAYMENT_CONFIRMED", System.nanoTime()));
        producerService.sendOrderEvent(
            new OrderEvent("ORD-002", "user-99", "ORDER_SHIPPED",     System.nanoTime()));

        return "Order events published successfully.";
    }
}

Observed behaviour in logs:

# Kafka's deterministic routing:
# murmur2("user-42") % 3 = Partition 0
# murmur2("user-99") % 3 = Partition 2

[SENT] key=user-42 | Partition: 0 | Offset: 0 | Status: ORDER_PLACED
[SENT] key=user-99 | Partition: 2 | Offset: 0 | Status: ORDER_PLACED
[SENT] key=user-42 | Partition: 0 | Offset: 1 | Status: PAYMENT_CONFIRMED
[SENT] key=user-99 | Partition: 2 | Offset: 1 | Status: PAYMENT_CONFIRMED
[SENT] key=user-42 | Partition: 0 | Offset: 2 | Status: ORDER_SHIPPED
[SENT] key=user-99 | Partition: 2 | Offset: 2 | Status: ORDER_SHIPPED

# Consumer Thread 1 reads Partition 0 (user-42):
[CONSUMER P-0] Offset 0 → ORDER_PLACED       ✅
[CONSUMER P-0] Offset 1 → PAYMENT_CONFIRMED  ✅
[CONSUMER P-0] Offset 2 → ORDER_SHIPPED      ✅

# Consumer Thread 3 reads Partition 2 (user-99):
[CONSUMER P-2] Offset 0 → ORDER_PLACED       ✅
[CONSUMER P-2] Offset 1 → PAYMENT_CONFIRMED  ✅
[CONSUMER P-2] Offset 2 → ORDER_SHIPPED      ✅

Both users are processed in parallel, with strict per-user event ordering maintained throughout. ✅


7. Consumer Service — Reliable, Ordered Message Processing

// OrderConsumerService.java
package com.example.kafka.consumer;

import com.example.kafka.model.OrderEvent;
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class OrderConsumerService {

    /**
     * concurrency = "3" instructs Spring Kafka to create three consumer threads,
     * each exclusively assigned to one partition. This is the correct
     * configuration for a 3-partition topic — it ensures no two threads contend
     * for the same partition and that messages within each partition are processed
     * strictly in offset order.
     *
     * The ConsumerRecord wrapper gives access to both the partition key
     * (record.key()) and Kafka metadata (partition, offset) — essential for
     * structured logging and debugging in production environments.
     */
    @KafkaListener(
        topics      = "${app.kafka.topic.order-events}",
        groupId     = "order-processing-group",
        concurrency = "3"
    )
    public void consume(ConsumerRecord<String, OrderEvent> record,
                        Acknowledgment acknowledgment) {

        String     userId = record.key();
        OrderEvent event  = record.value();
        int        part   = record.partition();
        long       offset = record.offset();

        log.info("[CONSUMER] Partition: {} | Offset: {} | userId: {} | Status: {}",
                part, offset, userId, event.getStatus());

        processOrderEvent(event);

        /**
         * Manual offset commitment: the offset is advanced only after
         * processOrderEvent() returns successfully. If an exception is thrown,
         * this line is never reached — Kafka will re-deliver the message on
         * the next poll, ensuring no event is silently skipped due to a
         * processing failure.
         */
        acknowledgment.acknowledge();
    }

    private void processOrderEvent(OrderEvent event) {
        switch (event.getStatus()) {
            case "ORDER_PLACED"       -> log.info("Order initialised: {}", event.getOrderId());
            case "PAYMENT_CONFIRMED"  -> log.info("Payment verified for order: {}", event.getOrderId());
            case "ORDER_SHIPPED"      -> log.info("Fulfilment triggered for order: {}", event.getOrderId());
            default                   -> log.warn("Unrecognised status '{}' for order: {}",
                                                   event.getStatus(), event.getOrderId());
        }
    }
}

8. Idempotent Producer — Ordering Guarantees Under Failure Conditions

The Problem: Retries and Reordering

A producer retry is not inherently safe. Consider the following failure scenario without idempotence:

Step 1: Producer sends Batch A [PAYMENT_CONFIRMED] to broker.
Step 2: Broker writes Batch A. Network drops before acknowledgment.
Step 3: Producer does not receive ack. Timeout fires → producer retries Batch A.
Step 4: Meanwhile, producer sends Batch B [ORDER_SHIPPED].
Step 5: Batch B is acknowledged first.
Step 6: Retry of Batch A arrives and is written after Batch B.

Result: Partition log = [ORDER_SHIPPED (Offset 1), PAYMENT_CONFIRMED (Offset 2)]
Consumer reads: ORDER_SHIPPED before PAYMENT_CONFIRMED ❌

The Solution: Producer ID and Sequence Numbers

When enable.idempotence = true, the Kafka broker assigns each producer session a unique Producer ID (PID). The producer attaches a monotonically increasing sequence number to every message it sends, scoped to a specific (PID, Partition) pair.

Producer PID = 101

Message 1: PID=101, Partition=0, SeqNo=0, Status=ORDER_PLACED
Message 2: PID=101, Partition=0, SeqNo=1, Status=PAYMENT_CONFIRMED
Message 3: PID=101, Partition=0, SeqNo=2, Status=ORDER_SHIPPED

The broker maintains the last accepted sequence number per (PID, Partition). Upon receiving a message:

  • If SeqNo == lastAccepted + 1 → accept and write.
  • If SeqNo <= lastAccepted → duplicate detected, silently discard.
  • If SeqNo > lastAccepted + 1 → sequence gap detected, return an error (OutOfOrderSequenceException).

Retry scenario with idempotence:

Producer sends PAYMENT_CONFIRMED with PID=101, SeqNo=1.
Network drops. Producer retries: PAYMENT_CONFIRMED, PID=101, SeqNo=1.
Broker: "SeqNo 1 from PID 101 already written." → Discard. ✅
Ordering preserved. No duplicate in the log. ✅
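The broker-side check described above can be sketched as follows. This is an illustrative model, not Kafka's implementation: one lastAccepted sequence number is tracked per (PID, partition), and the three branches mirror the three rules listed earlier.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of broker-side idempotence: per-(PID, partition) sequence checking. */
public class SequenceCheck {

    enum Result { ACCEPT, DUPLICATE, OUT_OF_ORDER }

    private final Map<String, Integer> lastAccepted = new HashMap<>();

    Result onMessage(long pid, int partition, int seqNo) {
        String slot = pid + "-" + partition;
        int last = lastAccepted.getOrDefault(slot, -1);
        if (seqNo == last + 1) {            // the expected next message: accept and write
            lastAccepted.put(slot, seqNo);
            return Result.ACCEPT;
        }
        if (seqNo <= last) {                // retry of an already-written message: discard
            return Result.DUPLICATE;
        }
        return Result.OUT_OF_ORDER;         // sequence gap: broker returns an error
    }

    public static void main(String[] args) {
        SequenceCheck broker = new SequenceCheck();
        System.out.println(broker.onMessage(101, 0, 0)); // first write
        System.out.println(broker.onMessage(101, 0, 1)); // second write
        System.out.println(broker.onMessage(101, 0, 1)); // retry of SeqNo 1, discarded
        System.out.println(broker.onMessage(101, 0, 3)); // gap (SeqNo 2 missing)
    }
}
```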

This is already configured in our application.yml and KafkaProducerConfig.java. No application code changes are required.


9. Custom Partitioner — Domain-Driven Partition Assignment

The default key-hash strategy distributes messages uniformly across partitions. In some production scenarios, business requirements demand more targeted routing — for example, isolating high-value transactions to a dedicated partition for priority processing or SLA compliance.

// PriorityOrderPartitioner.java
package com.example.kafka.config;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

import java.util.Map;

/**
 * A domain-aware partitioner that routes messages based on business attributes
 * of the event payload, rather than solely on the message key.
 *
 * Routing strategy:
 *   - Orders with amount > 10,000 → Partition 0 (dedicated high-value lane)
 *   - All other orders            → Partitions 1..N-1 (distributed by key hash)
 *
 * This ensures high-value orders receive dedicated consumer resources and
 * are not delayed by the volume of standard-value transactions.
 */
public class PriorityOrderPartitioner implements Partitioner {

    private final ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {

        int totalPartitions = cluster.partitionCountForTopic(topic);

        try {
            Map<?, ?> payload    = objectMapper.readValue(valueBytes, Map.class);
            Object    amountObj  = payload.get("amount");

            if (amountObj != null) {
                double amount = Double.parseDouble(amountObj.toString());
                if (amount > 10_000) {
                    return 0;   // Priority lane: dedicated partition for high-value orders
                }
            }
        } catch (Exception e) {
            // Fall through to default routing on deserialisation failure
        }

        // Standard routing: distribute across non-priority partitions using the key hash.
        // The sign bit is masked rather than using Math.abs, which breaks on Integer.MIN_VALUE.
        if (keyBytes == null) return 1;
        return ((key.hashCode() & 0x7fffffff) % (totalPartitions - 1)) + 1;
    }

    @Override public void close()                           { }
    @Override public void configure(Map<String, ?> configs) { }
}

Register the custom partitioner in application.yml:

spring:
  kafka:
    producer:
      properties:
        partitioner.class: com.example.kafka.config.PriorityOrderPartitioner

10. Common Ordering Failures — Root Cause Analysis

Understanding why ordering breaks is as important as knowing how to enforce it. The following section covers the most frequently encountered failure patterns in production Kafka systems, along with their root causes and recommended remediation.


10.1 Absent Partition Key — Silent Round-Robin Distribution

Root cause: When no key is provided, Kafka’s default partitioner distributes messages across all available partitions using a round-robin or sticky strategy. Logically related messages for the same entity land in different partitions, and the consumer’s read order becomes a function of broker-level latency rather than the intended event sequence.

// Problematic pattern — no key provided:
kafkaTemplate.send(topic, event);           // key = null → non-deterministic routing

// Correct pattern — entity identifier used as key:
kafkaTemplate.send(topic, event.getUserId(), event);   // deterministic, ordered routing

10.2 Partition Count Modification on a Live Topic

Root cause: Kafka’s partition assignment formula is murmur2(key) % numPartitions. If numPartitions changes, the same key evaluates to a different partition. Messages written before the change reside in the old partition; messages written after reside in a new one. Consumers reading both partitions receive an interleaved, out-of-order view of the event history for that key.

# Current state: user-42 → murmur2("user-42") % 3 = Partition 0
kafka-topics.sh --alter --topic order-events --partitions 6

# Post-change: user-42 → murmur2("user-42") % 6 = Partition 3
# Historical events: Partition 0
# New events:        Partition 3
# Consumer sees two independent timelines for the same user ❌

Recommendation: Determine the partition count during initial system design based on projected throughput and consumer parallelism. Over-provisioning (e.g., starting at 12 partitions when 3 are sufficient today) is far less disruptive than modifying partition count after deployment.
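The remapping effect can be demonstrated with a small stand-alone sketch. Kafka's real partitioner uses murmur2; this demo (class name and helper are illustrative, not Kafka API) substitutes a plain non-negative hash, because any hash function exhibits the same behaviour once the final step is a modulo over the partition count:

```java
// Illustrative sketch — not Kafka's murmur2. Shows why changing the
// partition count remaps every key: the routing ends in `hash % numPartitions`.
public class PartitionRemapDemo {

    // Mirrors the shape of Kafka's routing: non-negative hash modulo partition count.
    static int partitionFor(int keyHash, int numPartitions) {
        return (keyHash & Integer.MAX_VALUE) % numPartitions;  // mask sign bit
    }

    public static void main(String[] args) {
        int keyHash = "user-42".hashCode();   // stable for a given key
        int before  = partitionFor(keyHash, 3);
        int after   = partitionFor(keyHash, 6);
        System.out.printf("3 partitions -> P%d, 6 partitions -> P%d%n", before, after);
        // The two results generally differ, splitting one key's history
        // across two partitions — exactly the failure described above.
    }
}
```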


10.3 Consumer Concurrency Exceeding Partition Count

Root cause: Each partition can be assigned to at most one consumer within a consumer group. If concurrency exceeds the number of partitions, the surplus consumer threads receive no partition assignment and remain idle for the lifetime of the group. The active threads may experience uneven load, and the idle threads represent wasted resources.

# Misconfiguration — surplus idle threads with no partition assignment:
listener:
  concurrency: 10     # 10 threads, but topic has only 3 partitions
                      # → 7 threads permanently idle

# Correct configuration — one thread per partition:
listener:
  concurrency: 3      # 3 threads, 3 partitions → balanced, no idle resources
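The idle-thread arithmetic can be made concrete with a toy assignment model (class and method names are illustrative, not Spring or Kafka API). Each partition is dealt to exactly one thread; once partitions run out, the remaining threads receive nothing:

```java
import java.util.*;

// Toy model of partition assignment within a consumer group:
// each partition goes to exactly one thread; surplus threads stay empty.
public class AssignmentDemo {

    static Map<Integer, List<Integer>> assign(int threads, int partitions) {
        Map<Integer, List<Integer>> byThread = new HashMap<>();
        for (int t = 0; t < threads; t++) byThread.put(t, new ArrayList<>());
        for (int p = 0; p < partitions; p++) byThread.get(p % threads).add(p);
        return byThread;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> a = assign(10, 3);   // the misconfiguration above
        long idle = a.values().stream().filter(List::isEmpty).count();
        System.out.println("Idle threads: " + idle);     // 7 of 10 threads do nothing
    }
}
```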

10.4 Offset Auto-Commit Causing Message Loss Under Failure

Root cause: When enable-auto-commit: true, Kafka commits the consumer offset on a periodic schedule (default: every 5 seconds), regardless of whether the application has successfully processed the fetched messages. If a processing failure occurs after an auto-commit but before the next poll, those messages are considered consumed and will not be redelivered. This breaks both reliability and, indirectly, ordering — because downstream state derived from later messages may be processed without the context of the lost ones.

// Risk-prone pattern — auto-commit does not guarantee processing completion:
@KafkaListener(topics = "order-events")
public void consume(OrderEvent event) {
    // If an exception is thrown here, the offset may already be committed.
    // The message is silently lost; downstream state becomes inconsistent.
    processOrderEvent(event);
}

// Reliable pattern — offset is committed only after verified processing:
@KafkaListener(topics = "${app.kafka.topic.order-events}")
public void consume(ConsumerRecord<String, OrderEvent> record,
                    Acknowledgment acknowledgment) {
    try {
        processOrderEvent(record.value());
        acknowledgment.acknowledge();   // Offset committed after successful processing
    } catch (Exception ex) {
        log.error("Processing failure for offset {}. Message will be re-delivered.",
                  record.offset(), ex);
        // acknowledgment is not called → Kafka re-delivers on next poll
    }
}

10.5 In-Flight Request Reordering Without Idempotence

Root cause: With max.in.flight.requests.per.connection > 1 and idempotence disabled, a producer may have multiple batches in transit simultaneously. If Batch 1 fails and is retried while Batch 2 has already been acknowledged, Batch 1’s retry is appended after Batch 2 in the partition log — reversing the intended write order.

# Produces ordering risk — multiple in-flight requests without idempotence:
producer:
  properties:
    enable.idempotence: false
    max.in.flight.requests.per.connection: 5   # ← unsafe combination

# Safe option 1 — idempotence ON (recommended; allows up to 5 in-flight):
producer:
  properties:
    enable.idempotence: true
    max.in.flight.requests.per.connection: 5   # ← safe with idempotence

# Safe option 2 — strictly one in-flight (maximum ordering safety, reduced throughput):
producer:
  properties:
    enable.idempotence: false
    max.in.flight.requests.per.connection: 1   # ← safe without idempotence
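Outside of Spring's YAML binding, the same safe combination can be expressed as plain producer properties. The property names below are the standard Kafka producer keys; the builder class is an illustrative sketch, and the resulting Properties would be handed to a KafkaProducer or producer factory:

```java
import java.util.Properties;

// The "safe option 1" settings from the YAML above, as plain Kafka
// producer properties (standard property names, illustrative wrapper class).
public class SafeProducerProps {

    public static Properties build() {
        Properties p = new Properties();
        p.setProperty("enable.idempotence", "true");
        p.setProperty("acks", "all");
        p.setProperty("max.in.flight.requests.per.connection", "5"); // safe with idempotence
        p.setProperty("retries", String.valueOf(Integer.MAX_VALUE)); // retry until timeout
        return p;
    }
}
```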

11. Consumer Group Mechanics and Ordered Parallel Processing

How Kafka Achieves Both Ordering and Parallelism

These two properties appear contradictory: parallelism suggests multiple workers processing simultaneously; ordering requires sequential execution. Kafka resolves this through partition-level isolation.

Each partition within a consumer group is assigned to exactly one consumer instance. That consumer reads the partition’s messages in strict offset order. Multiple consumers — each owning a different partition — execute in parallel without any shared state or coordination overhead.

Topic: order-events — 3 partitions

Consumer Group: order-processing-group
  ├─ Consumer Thread 1 ──► Partition 0 (users whose key hashes to 0: user-42, user-15 ...)
  ├─ Consumer Thread 2 ──► Partition 1 (users whose key hashes to 1: user-88, user-03 ...)
  └─ Consumer Thread 3 ──► Partition 2 (users whose key hashes to 2: user-99, user-21 ...)

Thread 1 processes user-42’s events in order. Thread 3 processes user-99’s events in order. Both run concurrently — there is no contention between them. This is the model that enables Kafka to scale horizontally while preserving per-entity event ordering.
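This model can be sketched in plain Java with one single-threaded executor per partition (class name is illustrative, not a Kafka API). Tasks submitted to the same lane run strictly in submission order; different lanes run concurrently with no shared state:

```java
import java.util.*;
import java.util.concurrent.*;

// Model of partition-level isolation: one single-threaded executor per
// partition. Same partition → sequential; different partitions → parallel.
public class PartitionedExecutorDemo {

    private final ExecutorService[] lanes;

    PartitionedExecutorDemo(int partitions) {
        lanes = new ExecutorService[partitions];
        for (int i = 0; i < partitions; i++) lanes[i] = Executors.newSingleThreadExecutor();
    }

    void submit(int partition, Runnable task) { lanes[partition].submit(task); }

    void shutdown() throws InterruptedException {
        for (ExecutorService e : lanes) { e.shutdown(); e.awaitTermination(5, TimeUnit.SECONDS); }
    }

    public static void main(String[] args) throws Exception {
        PartitionedExecutorDemo demo = new PartitionedExecutorDemo(3);
        List<Integer> partition0 = Collections.synchronizedList(new ArrayList<>());
        for (int seq = 0; seq < 5; seq++) {
            final int s = seq;
            demo.submit(0, () -> partition0.add(s));   // same partition → ordered
            demo.submit(s % 3, () -> { });             // other lanes run in parallel
        }
        demo.shutdown();
        System.out.println(partition0);   // [0, 1, 2, 3, 4] — submission order preserved
    }
}
```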

Partition Rebalancing on Consumer Failure

If a consumer instance fails or is restarted, Kafka triggers a rebalance. The group coordinator reassigns the orphaned partition to a surviving consumer. That consumer resumes from the last committed offset — the point at which the previous consumer last called acknowledgment.acknowledge(). No messages are lost or skipped, and the ordering guarantee is maintained across the transition.

This is precisely why manual offset management (MANUAL_IMMEDIATE ack-mode) is critical in ordered-processing scenarios. An auto-committed offset may represent a point beyond what was actually processed, causing gaps in the event history post-rebalance.


12. Ordering Guarantee Reference

Scenario | Ordering Guaranteed? | Notes
Single partition, any configuration | ✅ Global | Throughput limited to one broker’s write capacity
Multiple partitions + stable entity key | ✅ Per entity | Standard production pattern
Multiple partitions + no key | ❌ No guarantee | Default partitioner distributes non-deterministically
Idempotent producer + acks=all | ✅ Preserved on retry | Required for production reliability
concurrency equals partition count | ✅ Optimal | One thread per partition; no idle resources
Partition count altered post-deployment | ❌ Broken | Key-to-partition mapping changes; ordering violated
enable-auto-commit: true | ⚠️ Loss risk | Processing failure may silently advance offset
max.in.flight > 1 without idempotence | ❌ Retry reordering | Unsafe; set to 1 or enable idempotence

13. Configuration Reference

Property | Recommended Value | Impact on Ordering
enable.idempotence | true | Eliminates duplicate writes and retry-induced reordering
acks | all | Ensures all replicas confirm before success; prevents loss on failover
max.in.flight.requests.per.connection | 5 (with idempotence) / 1 (without) | Controls retry-reordering risk
retries | Integer.MAX_VALUE | Prevents silent message drop on transient failure
retry.backoff.ms | 100 | Avoids aggressive retry storms
delivery.timeout.ms | 120000 | Maximum time before a send is considered failed
enable-auto-commit | false | Prevents offset advancement before processing is complete
ack-mode | MANUAL_IMMEDIATE | Offset committed only after explicit acknowledgment
concurrency | Equal to partition count | One thread per partition; no idle threads
max.poll.records | 50 | Limits in-flight records per consumer thread

14. Questions and Answers

Q: Does Kafka guarantee message ordering?

Kafka guarantees message ordering within a partition. Messages assigned to the same partition are written and consumed in strict offset order. Across multiple partitions, there is no global ordering guarantee — consumers read partitions independently and the relative order is determined by broker-level timing.

Q: How would you implement ordered processing in a Spring Boot Kafka application?

Assign a stable entity identifier (such as userId or orderId) as the partition key in kafkaTemplate.send(topic, key, value). Configure the producer with enable.idempotence=true and acks=all. Set concurrency in the @KafkaListener equal to the number of partitions, and use MANUAL_IMMEDIATE ack-mode to ensure offsets are committed only after successful processing.

Q: What is the consequence of not providing a partition key?

Without a key, Kafka uses round-robin or sticky partitioning, distributing consecutive messages for the same entity across different partitions. Since partitions are consumed independently, the consumer receives those messages in an order determined by broker latency — not by the producer’s intent. This failure is silent: no exception is raised and no alert fires.

Q: What is an idempotent producer and how does it relate to ordering?

An idempotent producer is one configured with enable.idempotence=true. Kafka assigns the producer a unique Producer ID (PID) and tracks a sequence number per (PID, partition) pair. If a retry delivers a message with a sequence number the broker has already accepted, the duplicate is discarded without being written. This eliminates both duplicate records and the reordering that can result from retry races.
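The broker-side bookkeeping can be sketched in a few lines (class name illustrative; the real broker additionally validates that sequences arrive gap-free and rejects out-of-order gaps): track the highest accepted sequence per (PID, partition) and drop anything at or below it.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of broker-side idempotence: remember the highest
// sequence accepted per (producerId, partition); a retried batch whose
// sequence was already accepted is discarded, not re-appended.
public class DedupSketch {

    private final Map<String, Integer> lastSeq = new HashMap<>();

    /** Returns true if the write is appended, false if it is a duplicate retry. */
    boolean append(long producerId, int partition, int sequence) {
        String key = producerId + "-" + partition;
        int last = lastSeq.getOrDefault(key, -1);
        if (sequence <= last) return false;   // duplicate retry → dropped
        lastSeq.put(key, sequence);
        return true;
    }
}
```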

Q: Why is modifying the partition count of a live topic dangerous?

Partition assignment uses murmur2(key) % numPartitions. Changing numPartitions changes the target partition for every existing key. Historical messages for a key remain in the old partition while new messages go to a different one. Consumers reading both partitions receive an interleaved, causally inconsistent view of the event history.

Q: What is the relationship between concurrency and partition count?

concurrency in a @KafkaListener creates that many consumer threads. Kafka assigns each partition to exactly one thread, though a single thread may own several partitions. If concurrency < numPartitions, some threads read multiple partitions sequentially — reducing parallelism. If concurrency > numPartitions, surplus threads receive no assignment and remain idle. The optimal setting is concurrency = numPartitions.

Q: Why should enable-auto-commit be set to false in ordered-processing systems?

Auto-commit advances the offset on a periodic schedule, regardless of processing outcome. If a processing failure occurs after an offset is auto-committed, the message will not be redelivered — it is permanently lost from the consumer’s perspective. Manual acknowledgment decouples offset advancement from the polling cycle and ties it directly to verified processing success, ensuring that no message is silently skipped and that ordered processing can resume correctly after a failure.


15. Conclusion

Apache Kafka does not make ordering easy by accident — it makes it possible by design. The append-only partition log, deterministic key-based routing, and exclusive partition-to-consumer assignment are all deliberate architectural choices that, when used correctly, deliver strong per-entity ordering guarantees at scale.

In a Spring Boot application, enforcing those guarantees requires five intentional decisions:

  1. Select a meaningful partition key — use a stable entity identifier (userId, orderId) that represents the ordering boundary for your business domain.
  2. Enable idempotent delivery — configure enable.idempotence=true and acks=all to protect ordering and eliminate duplicates under failure conditions.
  3. Align concurrency with partition count — set concurrency equal to the number of partitions so each partition is served by a dedicated thread.
  4. Use manual offset management — configure enable-auto-commit=false and ack-mode=MANUAL_IMMEDIATE so offsets are only advanced after verified processing completion.
  5. Treat partition count as immutable post-deployment — plan partition capacity upfront and avoid alterations on live topics to preserve key-to-partition routing consistency.

Applied together, these decisions ensure that Kafka delivers on its ordering guarantees reliably, predictably, and at production scale.
