Introducing Stateful Functions, an OS framework to easily build and orchestrate distributed stateful apps, by Apache Flink’s Ververica

Last week, Apache Flink’s stream processing company Ververica announced the launch of Stateful Functions. It is an open source framework developed to reduce the complexity of building and orchestrating distributed stateful applications. It is built with an aim to bring together the benefits of stream processing with Apache Flink and Function-as-a-Service (FaaS). Ververica will propose the project, licensed under Apache 2.0, to the Apache Flink community as an open source contribution.

Read More: Apache Flink 1.9.0 releases with Fine-grained batch recovery, State Processor API and more

The co-founder and CTO at Ververica, Stephan Ewen says, “Orchestration for stateless compute has come a long way, driven by technologies like Kubernetes and FaaS — but most offerings still fall short for stateful distributed applications.” He further adds, “Stateful Functions is a big step towards addressing those shortcomings, bringing the seamless state management and consistency from modern stream processing to space.”

Stateful Functions is designed as a simple and powerful abstraction based on functions that can interact with each other asynchronously. It is also composed of complex networks of functionality. This approach helps in eliminating the requirement of additional infrastructure for application state management, reduces operational overhead and also the overall system complexity.

The stateful functions are aimed to help users define independent functions with a small footprint, thus enabling the resources to interact reliably with each other. Each function has a persistent user-defined state in local variables that can be used to arbitrarily message other functions. The stateful function framework simplifies use cases such as:

Asynchronous application processes (checkout, payment, logistics)

Heterogeneous, load-varying event stream pipelines (IoT event rule pipelines)

Real-time context and statistics (ML feature assembly, recommenders)

The runtime of stateful functions API is based on the stream processing capability of Apache Flink. It also extends its powerful model for state management and fault tolerance. The major advantage of this framework is that the state and computation are co-located on the same side of the network. This means that “you don’t need the round-tripper record to fetch state from an external storage system nor a specific state management pattern for consistency.”

Though the stateful functions API is independent of Flink, its runtime is built on top of Flink’s DataStream API and uses a lightweight version of process functions. “The core advantage here, compared to vanilla Flink, is that functions can arbitrarily send events to all other functions, rather than only downstream in a DAG,” stated the official blog.

introducing-stateful-functions-an-os-framework-to-easily-build-and-orchestrate-distributed-stateful-apps-by-apache-flinks-ververica-img-0