The Problem
There is a lot of enthusiasm about Kafka as a high-speed broker for your messages. However, Kafka does not support transactions that span Kafka and an external datastore. Despite what the documentation may suggest, this implies the following:
Messages sent to Kafka can be lost if there is a crash or a bug
Suppose you have your data stored in some regular DBMS (not Kafka). When you update the data, you want to publish an event to Kafka. You can do this as follows:
1. First update the DBMS (i.e., commit the changes)
2. Then publish the event to Kafka
However, if a crash happens between steps 1 and 2, you will have updated the DBMS, but the message will never reach Kafka. You may not want this.
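This failure window can be simulated with simple in-memory stand-ins for the DBMS and the broker. The `Database` and `Broker` classes below are purely illustrative, not real Kafka client APIs:

```python
# In-memory stand-ins for a DBMS and for Kafka (illustrative only).
class Database:
    def __init__(self):
        self.rows = {}

    def commit(self, key, value):
        self.rows[key] = value

class Broker:
    def __init__(self):
        self.messages = []

    def publish(self, msg):
        self.messages.append(msg)

def update_then_publish(db, broker, crash_in_between=False):
    db.commit("order-1", "shipped")            # step 1: DBMS commit succeeds
    if crash_in_between:
        raise RuntimeError("process crashed")  # crash between steps 1 and 2
    broker.publish({"order-1": "shipped"})     # step 2: never reached

db, broker = Database(), Broker()
try:
    update_then_publish(db, broker, crash_in_between=True)
except RuntimeError:
    pass

# The DBMS holds the new state, but the event was never published:
assert db.rows == {"order-1": "shipped"}
assert broker.messages == []
```

No retry logic helps here, because the crashed process has no record that a publish was ever pending.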
Alternatively, let's try the other way around:
1. Publish a message to Kafka
2. Then update (commit) the DBMS changes
This does not help: if a crash happens between steps 1 and 2, you will have published a message, but the associated data will never have been committed. Consumers now see an event describing a change that does not exist in the DBMS.
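The reversed ordering can be sketched the same way, again with illustrative in-memory stand-ins rather than real Kafka APIs:

```python
# In-memory stand-ins, with the publish happening before the DBMS commit.
class Database:
    def __init__(self):
        self.rows = {}

class Broker:
    def __init__(self):
        self.messages = []

def publish_then_update(db, broker, crash_in_between=False):
    broker.messages.append({"order-1": "shipped"})  # step 1: publish succeeds
    if crash_in_between:
        raise RuntimeError("process crashed")       # crash between steps 1 and 2
    db.rows["order-1"] = "shipped"                  # step 2: commit never happens

db, broker = Database(), Broker()
try:
    publish_then_update(db, broker, crash_in_between=True)
except RuntimeError:
    pass

# The event is out in the world, but the DBMS never recorded the change:
assert broker.messages == [{"order-1": "shipped"}]
assert "order-1" not in db.rows
```

Either ordering leaves a window in which one side of the dual write succeeds and the other is lost.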
Messages received from Kafka can be lost if there is a crash or a bug
On the other end, there is likely some consumer of the messages you publish. Assuming that the consumers use their own DBMS storage (not Kafka), how do you consume messages reliably? Let's try:
1. Take a message from Kafka
2. Update the DBMS
3. Save the offset of the Kafka messages read so far, so you don't get the same message again
If your developers don't think hard about failure scenarios (it's hard to think of everything when you are under pressure to deliver working code), they might get the order wrong, and the following can happen instead:
1. Take a message from Kafka
2. Save the offset of the Kafka messages read so far, so you don't get the same message again
3. Update the DBMS
In this case, a crash between steps 2 and 3 will again lose messages: the saved offset says the message was processed, so Kafka will not redeliver it, yet the DBMS update never happened.
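The consumer-side failure can be sketched the same way. The `OffsetStore` and `Database` classes below are invented stand-ins, not real Kafka consumer APIs:

```python
# Illustrative consumer-side stand-ins (not real Kafka consumer APIs).
kafka_messages = ["event-0", "event-1"]

class OffsetStore:
    def __init__(self):
        self.offset = 0

class Database:
    def __init__(self):
        self.applied = []

def consume_wrong_order(offsets, db, crash_after_offset=False):
    msg = kafka_messages[offsets.offset]       # step 1: take a message
    offsets.offset += 1                        # step 2: save the offset first
    if crash_after_offset:
        raise RuntimeError("process crashed")  # crash between steps 2 and 3
    db.applied.append(msg)                     # step 3: never reached

offsets, db = OffsetStore(), Database()
try:
    consume_wrong_order(offsets, db, crash_after_offset=True)
except RuntimeError:
    pass

# After a restart, consumption resumes from the saved offset, so "event-0"
# is never redelivered even though the DBMS never processed it:
assert offsets.offset == 1
assert db.applied == []
```

Note that even the correct ordering (DBMS first, offset second) is not free: a crash after the DBMS update but before the offset save causes the same message to be redelivered, so the consumer must tolerate duplicates.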
The Solution
So how do you prevent data inconsistency with Kafka? Is it at all possible?
Yes, it is possible! You need to store everything inside Kafka. Kafka is built on a log-structured architecture, so it can "log" everything, and its logs can be used to restore the latest state of your domain objects via "event replay". In that sense, the Kafka team's article on exactly-once in Kafka is correct. Be aware, however, that it is insufficient if you also use other storage technologies, as outlined above.
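The "event replay" idea mentioned above can be sketched in a few lines: the latest state of each domain object is reconstructed by folding over an append-only event log. The event shapes here are invented for illustration:

```python
# A minimal sketch of event replay: rebuild current state from a log.
event_log = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "shipped"),
]

def replay(events):
    state = {}
    for key, status in events:
        state[key] = status   # the last event per key determines the state
    return state

assert replay(event_log) == {"order-1": "shipped", "order-2": "created"}
```

Because the log itself is the source of truth, there is no second datastore to fall out of sync with, which is exactly why keeping everything in Kafka sidesteps the dual-write problem.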