In the tech world, when a 1.0 version gets released, it’s assumed that the software is stable, mature, and production-ready. But for Neha Narkhede, co-founder of Confluent and co-creator of Apache Kafka, the wait for Apache Kafka 1.0 “was less about stability and more about completeness of the vision” she and a team of engineers set out to build toward when they first started Kafka in 2009.
After all, Kafka had already been broadly adopted for several years by thousands of companies, including a third of the Fortune 500 enterprises, which continue to trust the platform for their mission-critical applications.
Every piece of software has a unique story to tell on its journey to 1.0. In the case of Kafka, named after the acclaimed German-language writer Franz Kafka (Jay Kreps spilled the beans in a 2014 Quora post), the story is one of transformation from a messaging system into a distributed streaming platform. “Back in 2009 when we first set out to build Kafka, we thought there should be an infrastructure platform for streams of data. We didn’t start with the idea of making our own software, but started by observing the gaps in the technologies available at the time and realized how they were insufficient to serve the needs of a data-driven organization,” says Neha.
This is interesting because the team was not imagining some hypothetical need, but addressing a real-world business need. They started not by building Kafka, but by asking: ‘Why did the stream processing startups of the 1990s and 2000s fail?’
“They failed because companies did not have the ability to collect these streams and have them laying around to process,” she adds. “The big question we asked ourselves was ‘why not both scale and real-time?’ And more broadly, why not build a true infrastructure platform that allows you to build all of your applications on top of it, and have those applications handle streaming data by default.”
And thus followed a multi-stage transformation: implementing a log-like abstraction for continuous streams, making Kafka fault-tolerant and building replication into it, building APIs that made it easy to get data in and out of Kafka and process it, and more recently adding transactions to enable exactly-once semantics for stream processing.
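The log abstraction at the heart of that transformation can be sketched minimally: an append-only sequence of records in which every record gets a monotonically increasing offset, and any consumer can replay the stream from any offset. The class and method names below are illustrative only, not Kafka’s actual API.

```python
class Log:
    """A minimal in-memory sketch of Kafka's core abstraction:
    an append-only, offset-addressed sequence of records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return the offset it was assigned."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Read up to max_records starting at a given offset.
        Reads do not remove records, so many independent consumers
        can replay the same log at their own pace."""
        return self._records[offset:offset + max_records]


log = Log()
for event in ["page_view", "click", "purchase"]:
    log.append(event)

print(log.read(0))   # a consumer starting from the beginning
print(log.read(2))   # another resuming from its last committed offset
```

Because reads are non-destructive and addressed by offset, the same log can feed many applications at once, which is what distinguishes this model from a classic message queue.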
Version 1.0.0 brings further performance improvements to exactly-once semantics, which prevent the same message from being processed multiple times when a connection error forces a retry. These exactly-once capabilities enable enterprise stream processing in a controlled manner, as they allow “closure-like functions” for stream processing. Fundamentally, delivering a message to an endpoint once, and no more than once, has been an ongoing challenge in distributed systems. And while guaranteed exactly-once delivery remains a debated topic, Kafka’s continued enhancements to its exactly-once semantics have won it wider acceptance.
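One way to see why retries break plain at-least-once delivery, and how the effect of exactly-once processing can be recovered, is with per-producer sequence numbers: the sender tags each message, and the receiver discards any sequence it has already applied. This is a simplified sketch of the idempotence idea, not Kafka’s actual wire protocol.

```python
class Broker:
    """Deduplicates by (producer_id, sequence) so a retried send is
    applied at most once -- the core of idempotent delivery."""

    def __init__(self):
        self.log = []
        self._last_seq = {}   # producer_id -> highest sequence applied

    def receive(self, producer_id, seq, message):
        # A retry carries the same sequence number, so it is dropped
        # instead of being appended a second time.
        if self._last_seq.get(producer_id, -1) >= seq:
            return "duplicate"
        self._last_seq[producer_id] = seq
        self.log.append(message)
        return "ok"


broker = Broker()
broker.receive("p1", 0, "order-created")
# The acknowledgement is lost, so the producer retries the same send:
broker.receive("p1", 0, "order-created")
broker.receive("p1", 1, "order-shipped")

print(broker.log)   # each message appears exactly once
```

The subtlety the article alludes to is that this guards the broker’s log, not arbitrary external side effects, which is why “exactly-once delivery” in the general case remains debated.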
Apache Kafka 1.0.0 also ships several important improvements: significantly faster TLS and CRC32C implementations (with Java 9 support), faster controlled shutdown, and better JBOD support, among other bug fixes. Other features got the nod as well: Kafka now tolerates disk failures more gracefully, diagnostics for Simple Authentication and Security Layer (SASL) authentication failures have improved, and the Streams API has gained functional enhancements.
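CRC32C is the Castagnoli variant of the familiar CRC-32 checksum, used in Kafka’s newer record format to detect corruption. A bit-at-a-time reference implementation is short enough to sketch; the speedup in 1.0.0 comes from Java 9’s `java.util.zip.CRC32C`, which can use hardware acceleration rather than a loop like this.

```python
def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C (Castagnoli), reflected form:
    init 0xFFFFFFFF, reversed polynomial 0x82F63B78, final XOR."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; XOR in the polynomial when a bit falls off.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard check value for the Castagnoli polynomial:
print(hex(crc32c(b"123456789")))   # 0xe3069283
```

Production implementations use lookup tables or the SSE4.2 `crc32` instruction instead of this bitwise loop, but the checksum they compute is the same.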
“The nice thing about all this is that while the current instantiation of Kafka’s Streams APIs are in the form of Java libraries, it isn’t limited to Java per se. Kafka’s support for stream processing is primarily a protocol-level capability that can be represented in any language. This is an important distinction. Stream processing isn’t one interface, so there is no restriction for it to be available as a Java library alone. There are many ways to express continual programs: SQL, function-as-a-service or collection-like DSLs in many programming languages. A foundational protocol is the right way to address this diversity in applications around an infrastructure platform,” said Neha.
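The “collection-like DSL” Neha mentions can be illustrated with a toy pipeline: the program is declared as a chain of transformations over a stream of records, here driven by a finite list for demonstration. The names are invented for illustration and are not any real Kafka client API.

```python
class Stream:
    """A toy collection-like streaming DSL: transformations are
    declared up front and applied lazily to each record."""

    def __init__(self, source):
        self._source = source

    def map(self, fn):
        # Lazily transform each record as it flows through.
        return Stream(fn(x) for x in self._source)

    def filter(self, pred):
        # Lazily drop records that fail the predicate.
        return Stream(x for x in self._source if pred(x))

    def to_list(self):
        # Stand-in for writing results to an output topic.
        return list(self._source)


events = ["view:home", "click:buy", "view:cart", "click:pay"]
clicks = (
    Stream(iter(events))
    .filter(lambda e: e.startswith("click:"))
    .map(lambda e: e.split(":", 1)[1])
    .to_list()
)
print(clicks)   # ['buy', 'pay']
```

The same continual program could just as naturally be written as SQL or as a function-as-a-service handler, which is the diversity the protocol-level argument is about.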
Maybe it is this continual improvement she was talking about as part of Apache Kafka’s decade-long completeness of vision, which has seen it trusted by companies like LinkedIn, Capital One, Goldman Sachs, Netflix, Pinterest, and The New York Times. “Kafka enabled us to process trillions of messages per day in a scalable way. This opened up a completely new frontier for us to efficiently process data in motion to help us better serve Netflix members around the world,” said Allen Wang, Senior Software Engineer at Netflix.
Apache Kafka 1.0 is more than just a release. As the company rightly puts it, 1.0.0 is not a ‘mere bump of the version number’ but a full-fledged streaming platform with the ability to read, write, move and process streams of data with transactional correctness at enterprise-wide scale. It will, in fact, play an even bigger role in the future if stream processing goes on to become the “central nervous system” of companies worldwide.