Designing a Kafka-like message queue — partitioning, consumer groups, offset management, at-least-once vs exactly-once delivery, and where each guarantee breaks down.
Understanding how Kafka works internally makes you a dramatically better user of it. Let me walk through building a simplified distributed message queue — the design decisions reveal exactly why Kafka makes the trade-offs it does.
1@Component
2@RequiredArgsConstructor
3public class OrderEventProducer {
4
5 private final KafkaTemplate<String, OrderEvent> kafkaTemplate;
6
7 public void publish(OrderEvent event) {
8 kafkaTemplate.send("orders", event.getOrderId(), event)
9 .whenComplete((result, ex) -> {
10 if (ex != null) {
11 log.error("Failed to publish order event: {}", event.getOrderId(), ex);
12 // push to retry queue or dead letter topic
13 } else {
14 log.info("Order event sent to partition {} offset {}",
15 result.getRecordMetadata().partition(),
16 result.getRecordMetadata().offset());
17 }
18 });
19 }
20}1@KafkaListener(topics = "orders", groupId = "order-processor",
2 containerFactory = "manualAckFactory")
3public void consume(ConsumerRecord<String, OrderEvent> record,
4 Acknowledgment ack) {
5 try {
6 orderService.process(record.value());
7 ack.acknowledge(); // commit only after successful processing
8 } catch (RetryableException e) {
9 // don't ack — message will be redelivered
10 log.warn("Retryable error, skipping ack for offset {}", record.offset());
11 } catch (Exception e) {
12 log.error("Fatal error, sending to DLT", e);
13 deadLetterTemplate.send("orders.DLT", record.value());
14 ack.acknowledge(); // ack to move past the bad message
15 }
16}At-least-once delivery means your consumers must be idempotent. Use a processed_events table with a UNIQUE constraint on event_id — duplicate processing becomes a no-op instead of a bug.
More in System Design