Data

Apache Kafka 2.3 is here!

3 min read

Two days ago, the Apache Kafka team released the latest version of their open source distributed data streaming software, Apache Kafka 2.3. This release has several improvements to the Kafka Core, Connect and Streams REST API. In this release, a new Maximum Log Compaction Lag has been added. It has also improved monitoring for partitions, and fairness in SocketServer processors and much more.

What’s new in Apache Kafka 2.3?

Kafka Core

Reduced the amount of time the broker spends scanning log files

JIRA optimizes a process such that Kafka has to check its log segments only. In the earlier versions, the time required for log recovery was not proportional to the number of logs. With Kafka 2.3, it has become proportional to the number of unflushed log segments and has made a 50% reduction in broker startup time.

Improved monitoring for partitions which have lost replicas

In this release, Kafka Core has added metrics showing partitions that have exactly the minimum number of in-sync replicas. By monitoring these metrics, users can see partitions that are on the verge of becoming under-replicated. Also, the –under-min-isr command line flag has been added to the kafka-topics command. This will allow users to easily see which topics have fewer than the minimum number of in-sync replicas.

Added a Maximum Log Compaction Lag

In the earlier versions, after the latest key is written, the previous key values in a first-order approximation would get compacted after some time. With this release, it will now be possible to set the maximum amount of time for an old value to stick around. The new parameter max.log.compation.time.ms will specify how long an old value may possibly live in a compacted topic. This will enable Apache Kafka to comply with data retention regulations such as the GDPR.

Improved fairness in SocketServer processors

Apache Kafka 2.3 will prioritize existing connections over new ones and will improve the broker’s resilience to connection storms. It also adds a max.connections per broker setting.

Core Kafka has also improved failure handling in the Replica Fetcher.

Incremental Cooperative Rebalancing in Kafka Connect

In Kafka Connect, worker tasks are distributed among the available worker nodes. When a connector is reconfigured or a new connector is deployed– as well as when a worker is added or removed– the tasks must be rebalanced across the Connect cluster. This helps ensure that all of the worker nodes are doing a fair share of the Connect work. With Kafka 2.3, it will be possible to make configuration changes easier. Kafka Connect has also added connector contexts to Connect worker logs.

Kafka Streams

Users are allowed to store record timestamps in RocksDB

Kafka Streams will have timestamps included in the state store. This will lay the groundwork to ensure future features like handling out-of-order messages in KTables and implementing TTLs for KTables.

Added in-memory window store and session Store

This release has an in-memory implementation for the Kafka Streams window store and session store. The in-memory implementations provide higher performance, in exchange for lack of persistence to disk.

Kafka Streams has also added KStream.flatTransform and KStream.flatTransformValues.

These are some of the select updates, head over to the Apache blog for more details.

Read Next

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is now generally available

Confluent, an Apache Kafka service provider adopts new license to fight against cloud service providers

Twitter adopts Apache Kafka as their Pub/Sub System

Vincy Davis

A born storyteller turned writer!

Share
Published by
Vincy Davis

Recent Posts

Top life hacks for prepping for your IT certification exam

I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…

3 years ago

Learn Transformers for Natural Language Processing with Denis Rothman

Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…

3 years ago

Learning Essential Linux Commands for Navigating the Shell Effectively

Once we learn how to deploy an Ubuntu server, how to manage users, and how…

3 years ago

Clean Coding in Python with Mariano Anaya

Key-takeaways:   Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…

3 years ago

Exploring Forms in Angular – types, benefits and differences   

While developing a web application, or setting dynamic pages and meta tags we need to deal with…

3 years ago

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

Software architecture is one of the most discussed topics in the software industry today, and…

3 years ago