Diskless Kafka - The Cloud-Native Revolution within Kafka

TL;DR: Diskless Kafka Is the Future—And the Future Is Open

Note: Huge Kudos to Aiven for sponsoring this blog and leading the diskless proposal for the benefit of the broader Kafka ecosystem rather than narrow profit motives. This blog will go deep into exploring Kafka's historical strengths, analyze its emerging vulnerabilities in modern environments, and reveal why Diskless Kafka represents not merely an evolution but the essential future of this transformative technology. Enjoy!

Kafka has been the backbone of streaming use cases. Messaging, real-time stream processing, log aggregation and processing, website analytics and clickstream tracking, and data pipelines and integration (change data capture) are the top use cases you might have seen Kafka used for, and I am sure there are many more.

BUT to be honest, here is an uncomfortable truth: Kafka has been steadily losing ground in cloud environments for two main reasons:

  • High costs
  • Operational pain

Before we dig into these problems and the proposed solutions, we need a quick refresher on the basics. If you already know how Kafka works, you can jump ahead, but a 2-minute refresher is probably still worth it.

How does Kafka work? A 2-minute refresher:

Key concepts:

  • Producer: A component responsible for producing messages.
  • Consumer: A component responsible for consuming messages and performing some action.
    • Offset: Consumers keep track of their position in a partition so they can resume from a specific point after a failure, ensuring there are no gaps in consumption. (read: no data loss for the consumer; a minimal consumer sketch follows this list)
  • Broker: A server that is responsible for persisting messages to Disk, ensuring durability. Messages are retained for a specific period (called retention period) allowing consumers to replay or catch up on missed data.
  • Kafka Cluster: A group of brokers (and some other components, not relevant for this discussion)
  • Topics and Partitions: Data is organised into Topics (ex. orders, payments etc) and each Topic is divided into several partitions.
    • Why partitions? Because you (almost always) shouldn't store all your data on a single physical machine; you need to split the data across multiple machines.
  • Leadership: Each partition has a leader broker, which stores the main copy for that partition. But there are multiple copies (aka replicas) stored on other broker machines.
    • ex. Partition-1 is the main copy and Replica-1 is the replicated copy on another machine.
  • Replication: Messages are replicated to more than one broker machine, to ensure fault tolerance. Ask: What happens when one broker dies? Replication will help avoid data loss.
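
To make offsets concrete, here is a minimal consumer sketch in Java. It is purely illustrative: the broker address, group id and topic name are assumptions, not anything from a specific setup.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrdersConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "orders-processor");        // assumed consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");          // we commit offsets explicitly below

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // topic name is illustrative
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // partition + offset uniquely identify this record's position in the log
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync(); // persist our position so a restart resumes from here
            }
        }
    }
}
```

Because the offset is only committed after the batch is processed, a crashed consumer resumes from its last committed position instead of skipping records.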

The Write Path:

As you can see in the sequence diagram below, the critical path is shown in green. Kafka writes a batch to memory (the OS page cache) and relies on replication for durability rather than an "fsync" to disk. However, there are configurations available for you to tune how frequently the flush to disk is done.

Kafka provides many other tuning options; check them out later. A few of the producer-side knobs are sketched below.
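
As a rough illustration, here is a minimal producer sketch in Java. The broker address, topic and payload are assumptions, and the flush frequency mentioned above is tuned broker-side (e.g. log.flush.interval.ms / log.flush.interval.messages), not from the producer.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrdersProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Durability comes from replication acknowledgement, not fsync:
        // acks=all waits until all in-sync replicas have received the batch.
        props.put("acks", "all");

        // Batching knobs: trade a few milliseconds of latency for better throughput.
        props.put("linger.ms", "5");
        props.put("batch.size", "65536");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-42", "{\"amount\": 99.0}"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            exception.printStackTrace();
                        } else {
                            System.out.printf("written to partition=%d at offset=%d%n",
                                    metadata.partition(), metadata.offset());
                        }
                    });
        } // close() flushes any batches still sitting in the buffer
    }
}
```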

Why was Kafka's Disk-based architecture powerful?

  • It provided amazing throughput and low latency without sacrificing reliability.
  • Replication and disk-based storage protect against data loss and enable reliable recovery from failures.
  • Producers and Consumers are decoupled.
  • Partitioning allows Kafka to distribute load and scale horizontally by simply adding more brokers and partitions.
  • Persistent logs let consumers replay events, which is invaluable for debugging, recovery, or building new applications that need historical data.

Why is Kafka's Disk-based architecture weak for cloud environments?

1. High storage costs due to replication:

Kafka’s typical three-replica setup means you pay for three times the raw storage, and when you factor in the free space required for safe operation, the effective cost per GB can be up to 9x higher than object storage solutions like S3. This makes long-term storage on local disks (especially SSDs or cloud block storage like EBS) extremely expensive for large Kafka clusters.

Below is a formula to understand this better. Apart from the cost_per_gb, there are other factors that matter but are easy to miss.
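
A rough way to write it down (the variable names are mine, not an official formula):

$$
\text{storage\_cost} \approx \frac{\text{raw\_data\_size} \times \text{replication\_factor}}{1 - \text{free\_space\_headroom}} \times \text{cost\_per\_gb}
$$

With a replication factor of 3 and roughly 33% free-space headroom, you end up provisioning about 4.5x the raw data size, and on premium block storage the cost_per_gb itself is already a multiple of object storage pricing.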

As a wise man said:

"Be a lover of peace and always run your databases and Kafka clusters with enough free space"

2. Operational complexity and Scaling challenges:

Scaling out is usually done to maintain performance when you need more processing power and I/O capacity to handle growing traffic. But each Kafka broker manages its own local storage, creating strong coupling between compute and storage. Scaling out (adding brokers or rebalancing partitions) is resource-intensive, slow, and risky, often requiring hours or days and consuming significant disk and network I/O, which impacts cluster availability and performance.

But why is it slow and resource-intensive? When in doubt, write a math equation.
The following is a simple equation for "Time to Rebalance".
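
A hedged reconstruction of that equation (the symbols are my own):

$$
T_{\text{rebalance}} \approx \frac{D_{\text{move}}}{\min\left(B_{\text{network}},\ B_{\text{disk}}\right)}
$$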

  • The numerator represents the total data that needs to be moved.
  • The denominator is the minimum of the cluster’s total network bandwidth and total disk I/O capacity, representing the bottleneck resource.

Why the min? Because rebalance time is governed by whichever resource (network or disk) is most constrained.
Apart from the time taken to rebalance, the process also has an I/O impact:

The I/O impact, as a percentage, is basically the total data you want to move divided by the cluster's total I/O capacity (N * I) multiplied by the time the rebalance lasts.
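
In formula form (again my own notation: N brokers, each with I/O throughput I, over a rebalance lasting T):

$$
\text{IO impact (\%)} \approx \frac{D_{\text{move}}}{N \times I \times T} \times 100
$$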

So basically, the overall rebalancing process is I/O-intensive, which may result in throttled production workloads, reduced throughput, and increased latency; none of this sounds good (especially when you are on-call).

So "Compute and Storage coupling" is a real problem, as it can affect overall cluster availability and performance.

3. Lack of elasticity and resource waste:

We discussed how scaling-out/rebalancing is slow, resource-intensive and risky, but more importantly it results in a lack of elasticity. But why?

Let's observe the graphs below, which show how a Black Friday traffic surge may spike incoming messages/sec, while the time taken for rebalancing impacts throughput and latency. In other words, "elasticity" (scaling quickly) isn't really feasible, so clusters are usually over-provisioned to avoid the performance impact, which results in resource waste.

4. Networking and storage costs compound the problem:

In cloud environments, not only are SSDs and block storage expensive, but cross-zone replication and network traffic incur significant additional costs.
In a typical multi-AZ cluster with a replication factor of 3, a single produced message may travel cross-zone over the network multiple times.
Ex. In the setup below we have 3 zones, 4 consumers and a replication factor of 3. It shows how a single produced message is sent over the network 3x and cross-zone 4x (for replication and message consumption).

The network cost of a single message isn't a problem, but when you send billions of messages a day it adds up quickly.
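
A back-of-the-envelope example (all numbers are illustrative assumptions, not real pricing):

$$
5 \times 10^{9}\ \text{msgs/day} \times 1\ \text{KB} \approx 5\ \text{TB/day}, \qquad 5\ \text{TB} \times 4\ \text{cross-zone hops} = 20\ \text{TB/day}
$$

$$
20{,}000\ \text{GB/day} \times \$0.02/\text{GB} \approx \$400/\text{day} \approx \$12{,}000/\text{month}
$$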

"Storage and compute" coupling seems like a real problem of this architecture and it very much is in Cloud environments. Not just Kafka, but databases also face similar problems and hence storage-compute separation is a key decision Database engineers take in modern databases.

But what about Tiered Storage?

Good question!

Let's first ask, What is Tiered storage?

If you think about it, a typical Kafka cluster serves three types of data.

Hot data: Freshly produced stream of messages that need to be processed in real time. (low latency). ex. real-time fraud detection.

Warm data: Data that is still very relevant for processing and debugging. ex. order delivery events.

Cold data: Data that is very rarely fetched and not really used for processing, but is used for purposes like auditing/compliance, failure recovery and analytics. ex. GDPR audit trails.

With this mental model, we can see that technically we don't need all the data on local SSDs; you only need to store hot data there (or, in some cases, warm data).

But where would the warm and cold data go? Ideally we need storage that is cheap, highly durable, and fast enough for this kind of access. As soon as we think about this, object storage comes to mind. And yes, it is the best possible option for cold data.

How does this change our Architecture?

Basically, we now have two tiers: local storage and remote object storage. Local storage uses SSD/NVMe-based disks for low-latency use cases (real-time processing), while remote storage lets you retain data for longer durations at lower cost.
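
For a sense of what the remote tier looks like in practice, here is a sketch using the Java AdminClient to create a topic with tiering enabled. It assumes a Kafka 3.6+ cluster that already has tiered storage (KIP-405) configured on the brokers; the topic name, partition count and retention values are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TieredOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (Admin admin = Admin.create(props)) {
            NewTopic orders = new NewTopic("orders", 6, (short) 3)
                    .configs(Map.of(
                            "remote.storage.enable", "true",  // tier closed segments to object storage
                            "local.retention.ms", "3600000",  // keep ~1 hour of hot data on local disk
                            "retention.ms", "2592000000"));   // keep 30 days overall (mostly remote)
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```

Note that only closed log segments are eligible for tiering; the active segment always stays on the broker's local disk, which is part of why local storage costs never disappear.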

Does it solve our problems?

YES, it solves one problem: now you can save on storage costs, because you can store and serve old data from the remote tier, i.e. object storage.

BUT, looking at the following problems/limitations it seems like the juice isn't worth the squeeze.

  • Increased Operational Complexity:

    Tiered storage adds significant complexity rather than simplifying Kafka operations. With object storage there are new failure modes and new operational tasks, and most importantly, reasoning about the system's performance becomes harder.

  • Limited Cost Savings and Persisting High Local Storage Costs:

    While tiered storage can reduce some storage costs by offloading cold data to object storage, users still need to provision significant local disk space for hot data and to handle peak loads. This means high cloud disk costs persist, and network traffic fees (especially across availability zones) remain a major expense.

  • Lack of True Elasticity and Difficult Scaling:

    Kafka’s architecture, even with tiered storage, remains tightly coupled between brokers and local disks. This makes rapid scaling (in or out) difficult and risky, as partition reassignment is still slow and resource-intensive. The lack of elasticity leads to resource over-provisioning and wasted costs.

  • Feature and Portability Limitations:

    Tiered storage does not support compacted topics, and once enabled, it cannot be disabled. Migrating services with tiered storage enabled across regions or clouds is restricted.

With the above discussion, I hope it's clear that Kafka needs innovation, a game-changing solution to this problem; otherwise Kafka will no longer be the backbone of streaming in cloud environments. This is unfortunate, but very true.

The (potential) Solution?

Welcome, Diskless Topics, proposed as part of KIP-1150.

Very recently, Aiven proposed a fantastic KIP that is revolutionary for the Kafka ecosystem because it precisely tackles the two main problems of traditional Kafka deployments in the cloud:

  • Heavy local disk state, making it operationally painful
  • Cost of inter-zone data transfer

This KIP proposes an opt-in feature that enables Kafka to write directly to object storage, bypassing the old write path of broker-to-broker replication and asynchronous flushing to local disks.

But wait, what's new about "write directly to object storage"?

This is a well-crafted idea and a fundamental rethink of how Kafka should be working in the Cloud environments. In my experience, solutions that address the root cause rather than merely adding workarounds consistently prove more effective and enduring.

Let's cut to the chase and understand the proposal. Only then will we clearly see why it is an innovation that can save Kafka for the greater good.

The following image shows how Diskless topics would work.

As might already be clear, diskless mode can be enabled at the topic level when creating a topic, which means you as a user can decide whether you want a diskless topic or a classic one.

kafka-topics.sh --create --topic orders --config topic.type=diskless

One important thing to notice is that classic topics use a leader-based setup with replication (as we discussed before), while with Diskless topics any broker can accept a write.

This is powerful, because you no longer need leader election or cross-zone writes from producers (see the orange lines for producers 2 and 3). No more leaders, no more hotspots, no more inter-AZ traffic.

Another significant advantage is that object storage delivers built-in durability, eliminating the need for data replication between brokers.

By removing replication and disk writes from the critical path, we achieve remarkable improvements: no more lengthy cluster rebalancing, consistent throughput, and elimination of latency spikes. Most compelling of all, your cross-zone network costs decrease while your margins improve.

Now think about scaling and elasticity. With less state (or no state if you only have diskless topics) rebalancing operations become substantially faster and less disruptive. This streamlined architecture transforms Kafka into a truly lightweight and elastic system, capable of adapting to changing workloads with great agility.

Why is this a game-changer?

Because if this proposal is accepted, it will be part of Kafka itself, not a fork or a vendor-locked feature. That means it will directly benefit hundreds of thousands of Kafka users without them having to move away from Kafka.

However, it's essential to acknowledge that this solution isn't universal—the slightly increased latency when writing to object storage versus local writes on the leader makes it unsuitable for certain scenarios. This reality underscores the brilliance of supporting both diskless-topics and classic-topics within the same framework. As a user, you gain the ability to fine-tune costs and operational overhead according to specific use case requirements—all seamlessly integrated within a single unified system.

Let's recap

Diskless topics aren't just a tweak, they are a fundamental rethink:

  • Leaderless writes, no broker-to-broker replication: Any broker can handle a write for any partition. No more leaders, no more hotspots, no more inter-AZ traffic jams.
  • Object storage as the new backbone: Data is chunked and written directly to your favorite object store. Brokers cache data briefly, then flush to storage based on size or time. No more disk bottlenecks, no more IOPS anxiety.
  • Instant Autoscaling: No persistent data on brokers means you can spin them up or down in seconds. Kafka finally scales like the cloud-native service it was always meant to be.
  • If this KIP is accepted, Kafka will leap into a new league of streaming architecture, offering a unified way of achieving super low latency with classic topics AND ultra-cheap, scalable storage with Diskless topics. This dual capability positions Kafka as not merely relevant, but the definitive backbone of modern streaming ecosystems.
  • Diskless topics will be part of Kafka itself, so once accepted they will directly benefit the entire Kafka ecosystem.
  • KIP-1150 is a hot topic, so get involved and shape the future of streaming.

Thank you for reading!

Cheers,
The GeekNarrator