Apache Cassandra: Working in Multiple Datacenter Environments


 

Cassandra High Performance Cookbook

Over 150 recipes to design and optimize large scale Apache Cassandra deployments

Changing debugging to determine where read operations are being routed

Cassandra replicates data to multiple nodes; because of this, a read operation can be served by multiple nodes. If a read is submitted at QUORUM or higher, or if a Read Repair is executed, the read operation involves more than a single server. In a simple flat network, which nodes are chosen for digest reads is of little consequence. However, in multiple datacenter or multiple switch environments, having a read cross a switch or a slower WAN link between datacenters can add milliseconds of latency. This recipe shows how to debug the read path to see if reads are being routed as expected.

How to do it…

  1. Edit <cassandra_home>/conf/log4j-server.properties and set the logger to debug, then restart the Cassandra process:

    log4j.rootLogger=DEBUG,stdout,R

  2. On one display, use tail -f <cassandra_log_dir>/system.log to follow the Cassandra log:

    DEBUG 06:07:35,060 insert writing local RowMutation(keyspace='ks1', key='65', modifications=[cf1])
    DEBUG 06:07:35,062 applying mutation of row 65

  3. In another display, open an instance of the Cassandra CLI and use it to insert data. Remember, when using RandomPartitioner, try different keys until log events display on the node you are monitoring:

    [default@ks1] set cf1['e']['mycolumn']='value';
    Value inserted.

  4. Fetch the column using the CLI:

    [default@ks1] get cf1['e']['mycolumn'];

    Debugging messages should be displayed in the log.

    DEBUG 06:08:35,917 weakread reading SliceByNamesReadCommand(table='ks1', key=65, columnParent='QueryPath(columnFamilyName='cf1', superColumnName='null', columnName='null')', columns=[6d79636f6c756d6e,]) locally
    ...
    DEBUG 06:08:35,919 weakreadlocal reading SliceByNamesReadCommand(table='ks1', key=65, columnParent='QueryPath(columnFamilyName='cf1', superColumnName='null', columnName='null')', columns=[6d79636f6c756d6e,])

How it works…

Changing the logging level to DEBUG causes Cassandra to print information as it handles reads internally. This is helpful when troubleshooting a snitch or when using consistency levels such as LOCAL_QUORUM or EACH_QUORUM, which route requests based on network topology.
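
Following the log by eye gets tedious on a busy node. The following is a minimal Python sketch, not part of the original recipe, that filters read-related DEBUG lines out of system.log output. The pattern is modeled only on the two sample lines shown in step 4; the function name summarize_reads and the is_local flag are illustrative, and other Cassandra versions may word these messages differently.

```python
import re

# Pattern modeled on the two sample DEBUG lines in this recipe (an assumption:
# other Cassandra versions may format these messages differently).
READ_LINE = re.compile(
    r"^DEBUG (?P<time>\d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<stage>weakread\w*) reading .*?key=(?P<key>\w+)"
)


def summarize_reads(lines):
    """Return (time, stage, key, is_local) tuples for recognized read lines."""
    results = []
    for line in lines:
        text = line.strip()
        match = READ_LINE.search(text)
        if match is None:
            continue  # not a read message in the form shown in the recipe
        stage = match.group("stage")
        # "Local" here only means the line carries one of the two local
        # markers seen in the sample output.
        is_local = stage.endswith("local") or text.endswith("locally")
        results.append((match.group("time"), stage, match.group("key"), is_local))
    return results


sample = [
    "DEBUG 06:07:35,062 applying mutation of row 65",
    "DEBUG 06:08:35,917 weakread reading SliceByNamesReadCommand(table='ks1', "
    "key=65, columnParent='QueryPath(columnFamilyName='cf1', "
    "superColumnName='null', columnName='null')', "
    "columns=[6d79636f6c756d6e,]) locally",
    "DEBUG 06:08:35,919 weakreadlocal reading SliceByNamesReadCommand(table='ks1', "
    "key=65, columnParent='QueryPath(columnFamilyName='cf1', "
    "superColumnName='null', columnName='null')', "
    "columns=[6d79636f6c756d6e,])",
]

for entry in summarize_reads(sample):
    print(entry)
```

A read line that is flagged as not local is only a hint that the request may have gone to another node; confirm against the full log before drawing conclusions about the snitch.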

Using IPTables to simulate complex network scenarios in a local environment

While it is possible to simulate network failures by shutting down Cassandra instances, another failure you may wish to simulate is a failure that partitions your network. A failure in which multiple systems are UP but cannot communicate with each other is commonly referred to as a split brain scenario. This state could happen if the uplink between switches fails or the connectivity between two datacenters is lost.

Getting ready

When editing any firewall, it is important to have a backup copy. Testing on a remote machine is risky as an incorrect configuration could render your system unreachable.

How to do it…

  1. Review your iptables configuration found in /etc/sysconfig/iptables. Typically, an IPTables configuration accepts loopback traffic:

    :RH-Firewall-1-INPUT - [0:0]
    -A INPUT -j RH-Firewall-1-INPUT
    -A FORWARD -j RH-Firewall-1-INPUT
    -A RH-Firewall-1-INPUT -i lo -j ACCEPT

  2. Remove the loopback rule (-A RH-Firewall-1-INPUT -i lo -j ACCEPT) and restart IPTables. This should prevent instances of Cassandra on your machine from communicating with each other:

    # /etc/init.d/iptables restart

  3. Add a rule to allow a Cassandra instance running on 10.0.1.1 to communicate with 10.0.1.2:

    -A RH-Firewall-1-INPUT -m state --state NEW -s 10.0.1.1 -d 10.0.1.2 -j ACCEPT

How it works…

IPTables is a complete firewall that is a standard part of current Linux kernels. It has extensible rules that can permit or deny traffic based on many attributes, including, but not limited to, source IP, destination IP, source port, and destination port. This recipe uses its traffic-blocking features to simulate network failures, which lets you test how Cassandra behaves when parts of the network cannot reach each other.
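
When simulating a partition between more than two instances, writing each ACCEPT rule by hand is error prone. The following Python sketch, which is an illustration and not part of the original recipe, generates rule lines in the same form as step 3 for every pair of hosts inside a group and deliberately emits none across groups. The function names and the extra addresses 10.0.1.3 and 10.0.1.4 are hypothetical, and whether unmatched traffic is actually dropped depends on the rest of your firewall configuration.

```python
from itertools import permutations


def allow_rule(src, dst, chain="RH-Firewall-1-INPUT"):
    """Format one ACCEPT rule in the same form as step 3 of this recipe."""
    return f"-A {chain} -m state --state NEW -s {src} -d {dst} -j ACCEPT"


def partition_rules(groups):
    """Build ACCEPT rules for every ordered pair of hosts within each group.

    No rule is generated between hosts in different groups, which is what
    models the partition.
    """
    rules = []
    for group in groups:
        for src, dst in permutations(group, 2):
            rules.append(allow_rule(src, dst))
    return rules


# Hypothetical layout: two simulated sides of a partition, two instances each.
side_a = ["10.0.1.1", "10.0.1.2"]
side_b = ["10.0.1.3", "10.0.1.4"]

for rule in partition_rules([side_a, side_b]):
    print(rule)
```

The printed lines can be reviewed and then pasted into the IPTables configuration by hand; the sketch does not modify the firewall itself.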

Choosing IP addresses to work with RackInferringSnitch

A snitch is Cassandra’s way of mapping a node to a physical location in the network. It helps determine the location of a node relative to another node in order to ensure efficient request routing. The RackInferringSnitch can only be used if your network IP allocation is divided along octets in your IP address.

Getting ready

The following network diagram demonstrates a network layout that would be ideal for RackInferringSnitch.

[Figure: network diagram of an IP layout suited to RackInferringSnitch]

How to do it…

  1. In the <cassandra_home>/conf/cassandra.yaml file:

    endpoint_snitch: org.apache.cassandra.locator.RackInferringSnitch

  2. Restart the Cassandra instance for this change to take effect.

How it works…

The RackInferringSnitch requires no extra configuration as long as your network adheres to a specific subnetting scheme. In this scheme, the first octet, Y.X.X.X, is the private network number 10. The second octet, X.Y.X.X, represents the datacenter. The third octet, X.X.Y.X, represents the rack. The final octet, X.X.X.Y, represents the host. Cassandra uses this information to determine which hosts are 'closest', on the assumption that 'closer' nodes have more bandwidth and less latency between them. It then sends Digest Reads to the closest nodes and routes requests efficiently.
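
The octet scheme can be illustrated with a short Python sketch. This is not Cassandra's implementation; it only mirrors the mapping described above (second octet as datacenter, third octet as rack). The function names and sample addresses are hypothetical.

```python
def infer_location(ip):
    """Split a dotted-quad IPv4 address into (datacenter, rack) octets."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"expected four octets, got {ip!r}")
    # Second octet identifies the datacenter, third octet the rack.
    return octets[1], octets[2]


def proximity(a, b):
    """Return 0 for same rack, 1 for same datacenter, 2 otherwise."""
    dc_a, rack_a = infer_location(a)
    dc_b, rack_b = infer_location(b)
    if dc_a != dc_b:
        return 2
    return 0 if rack_a == rack_b else 1


def sort_by_proximity(origin, replicas):
    """Order replica addresses from closest to farthest relative to origin."""
    return sorted(replicas, key=lambda ip: proximity(origin, ip))


# Hypothetical addresses following the 10.<datacenter>.<rack>.<host> layout.
origin = "10.1.1.5"
replicas = ["10.2.1.7", "10.1.2.9", "10.1.1.8"]

print(infer_location(origin))
print(sort_by_proximity(origin, replicas))
```

Here 10.1.1.8 shares both datacenter and rack with the origin, 10.1.2.9 shares only the datacenter, and 10.2.1.7 sits in a different datacenter, so they sort in that order.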

There’s more…

While it is ideal if the network conforms to what is required for RackInferringSnitch, it is not always practical or possible. It is also rigid in that if a single machine does not adhere to the convention, the snitch will fail to work properly.
