
This article by Robbie Strickland, the author of Cassandra High Availability, describes the data replication architecture used in Cassandra.

Replication is perhaps the most critical feature of a distributed data store, as it would otherwise be impossible to make any sort of availability guarantee in the face of a node failure. As you already know, Cassandra employs a sophisticated replication system that allows fine-grained control over replica placement and consistency guarantees.

In this article, we’ll explore Cassandra’s replication mechanism in depth. Let’s start with the basics: how Cassandra determines the number of replicas to be created and where to locate them in the cluster. We’ll begin the discussion with a feature that you’ll encounter the very first time you create a keyspace: the replication factor.


The replication factor

On the surface, setting the replication factor seems to be a fundamentally straightforward idea. You configure Cassandra with the number of replicas you want to maintain (during keyspace creation), and the system dutifully performs the replication for you, thus protecting you when something goes wrong. So by defining a replication factor of three, you will end up with a total of three copies of the data. In reality, however, there are a number of variables in this equation. Let's start with the basic mechanics of setting the replication factor.
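
If you want to confirm how a keyspace is configured, you can read its replication settings back from the schema tables. The following is a minimal sketch, assuming Cassandra 3.0 or later, where schema metadata lives in the system_schema keyspace:

-- List the replication settings for every keyspace
-- (the system_schema tables are available in Cassandra 3.0+)
SELECT keyspace_name, replication
FROM system_schema.keyspaces;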

Replication strategies

One thing you'll quickly notice is that the semantics of setting the replication factor depend on the replication strategy you choose. The replication strategy tells Cassandra exactly how you want replicas to be placed in the cluster.

There are two strategies available:

  • SimpleStrategy: This strategy is used for single data center deployments. It is fine to use this for testing, development, or simple clusters, but discouraged if you ever intend to expand to multiple data centers (including virtual data centers such as those used to separate analysis workloads).
  • NetworkTopologyStrategy: This strategy is used when you have multiple data centers, or if you think you might have multiple data centers in the future. In other words, you should use this strategy for your production cluster.

SimpleStrategy

As a way of introducing this concept, we’ll start with an example using SimpleStrategy. The following Cassandra Query Language (CQL) block will allow us to create a keyspace called AddressBook with three replicas:

CREATE KEYSPACE AddressBook
WITH REPLICATION = {
   'class' : 'SimpleStrategy',
   'replication_factor' : 3
};
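
If your needs change later, the replication factor can be adjusted on an existing keyspace with ALTER KEYSPACE. Here's a minimal sketch, reducing our keyspace to two replicas:

-- Reduce the keyspace to two replicas; after decreasing the
-- replication factor, run nodetool cleanup to remove stale copies
-- (after an increase, run nodetool repair instead)
ALTER KEYSPACE AddressBook
WITH REPLICATION = {
   'class' : 'SimpleStrategy',
   'replication_factor' : 2
};

Note that unquoted identifiers in CQL are case-insensitive, so AddressBook is actually stored as addressbook.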

The data is assigned to a node via a hash algorithm, resulting in each node owning a range of data. Let’s take another look at the placement of our example data on the cluster. Remember the keys are first names, and we determined the hash using the Murmur3 hash algorithm.

[Figure: Placement of the example data on the cluster, with first-name keys hashed using Murmur3]

The primary replica for each key is assigned to a node based on its hashed value. Each node is responsible for the region of the ring between itself (inclusive) and its predecessor (exclusive).
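
You can inspect these hash values yourself using the token() function in CQL. A minimal sketch, assuming a hypothetical users table in our AddressBook keyspace with first_name as its partition key:

-- Show the Murmur3 token Cassandra computed for each partition key
-- (users and first_name are hypothetical names for illustration)
SELECT first_name, token(first_name)
FROM addressbook.users;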

While using SimpleStrategy, Cassandra will locate the first replica on the owner node (the one determined by the hash algorithm), then walk the ring in a clockwise direction to place each additional replica, as follows:

[Figure: Additional replicas are placed on adjacent nodes, moving clockwise from the primary, when using manually assigned tokens]

In the preceding diagram, the keys in bold represent the primary replicas (the ones placed on the owner nodes), with subsequent replicas placed in adjacent nodes, moving clockwise from the primary.

Although each node owns a set of keys based on its token range(s), there is no concept of a master replica. In Cassandra, unlike many other database designs, every replica is equal. This means reads and writes can be made to any node that holds a replica of the requested key.
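
In practice, this means any node holding a replica can coordinate a request, and you control how many replicas must respond using the consistency level. A quick sketch of a cqlsh session (CONSISTENCY is a cqlsh command, and the users table and its columns are hypothetical):

-- With a replication factor of 3, QUORUM requires 2 of 3 replicas
CONSISTENCY QUORUM;

INSERT INTO addressbook.users (first_name, phone)
VALUES ('Alice', '555-0100');

SELECT phone FROM addressbook.users
WHERE first_name = 'Alice';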

If you have a small cluster where all nodes reside in a single rack inside one data center, SimpleStrategy will do the job. This makes it the right choice for local installations, development clusters, and other similar environments where expansion is unlikely. An added benefit is that there is no need to configure a snitch (a concept we'll cover shortly).

For production clusters, however, it is highly recommended that you use NetworkTopologyStrategy instead. This strategy provides a number of important features for more complex installations where availability and performance are paramount.

NetworkTopologyStrategy

When it’s time to deploy your live cluster, NetworkTopologyStrategy offers two additional properties that make it more suitable for this purpose:

  • Rack awareness: Unlike SimpleStrategy, which places replicas naively, this feature attempts to ensure that replicas are placed in different racks, thus preventing service interruption or data loss due to failures of switches, power, cooling, and other similar events that tend to affect single racks of machines.
  • Configurable snitches: A snitch helps Cassandra to understand the topology of the cluster. There are a number of snitch options for any type of network configuration.

Here’s a basic example of a keyspace using NetworkTopologyStrategy:

CREATE KEYSPACE AddressBook
WITH REPLICATION = {
   'class' : 'NetworkTopologyStrategy',
   'dc1' : 3,
   'dc2' : 2
};

In this example, we're telling Cassandra to place three replicas in a data center called dc1 and two replicas in a second data center called dc2. Note that these names must match the data center names reported by your snitch; otherwise, Cassandra won't know where the replicas belong.
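
One advantage of this strategy is that expanding to another data center later is just a schema change. A minimal sketch, assuming a hypothetical third data center named dc3 has already joined the cluster; note that ALTER KEYSPACE replaces the full replication map, so the existing data centers must be restated, and a repair is needed afterwards to stream existing data to the new replicas:

-- Add three replicas in the (hypothetical) new data center dc3;
-- dc1 and dc2 must be restated to keep their existing settings
ALTER KEYSPACE AddressBook
WITH REPLICATION = {
   'class' : 'NetworkTopologyStrategy',
   'dc1' : 3,
   'dc2' : 2,
   'dc3' : 3
};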

Summary

In this article, we introduced the foundational concepts of replication in Cassandra. In our discussion, we outlined how the replication factor and your choice of replication strategy determine where replicas are placed, and their impact on performance, data consistency, and availability.

By now, you should be able to make sound decisions specific to your use cases. This article might serve as a handy reference in the future as it can be challenging to keep all these details in mind.
