9 min read

When I first started evaluating Coherence, one of my biggest concerns was how easy it would be to set up and use, especially in a development environment. The whole idea of having to set up a cluster scared me quite a bit, as any other solution I had encountered up to that point that had the word “cluster” in it was extremely difficult and time consuming to configure.

My fear was completely unfounded—getting the Coherence cluster up and running is as easy as starting Tomcat. You can start multiple Coherence nodes on a single physical machine, and they will seamlessly form a cluster. Actually, it is easier than starting Tomcat.

Installing Coherence

In order to install Coherence you need to download the latest release from the Oracle Technology Network (OTN) website. The easiest way to do so is by following the link from the main Coherence page on OTN. At the time of this writing, this page was located at http://www.oracle.com/technology/products/coherence/index.html, but that might change. If it does, you can find its new location by searching for ‘Oracle Coherence’ using your favorite search engine.

In order to download Coherence for evaluation, you will need to have an Oracle Technology Network (OTN) account. If you don’t have one, registration is easy and completely free.

Once you are logged in, you will be able to access the Coherence download page, where you will find the download links for all available Coherence releases: one for Java, one for .NET, and one for each of the supported C++ platforms.

You can download any of the Coherence releases you are interested in while you are there, but for the remainder of this article you will only need the first one. The latter two (.NET and C++) are client libraries that allow .NET and C++ applications to access the Coherence data grid.

Coherence ships as a single ZIP archive. Once you unpack it you should see the README.txt file containing the full product name and version number, and a single directory named coherence. Copy the contents of the coherence directory to a location of your choice on your hard drive. The common location on Windows is c:coherence and on Unix/Linux /opt/coherence, but you are free to put it wherever you want.

The last thing you need to do is to configure the environment variable COHERENCE_HOME to point to the top-level Coherence directory created in the previous step, and you are done.

Coherence is a Java application, so you also need to ensure that you have the Java SDK 1.4.2 or later installed and that JAVA_HOME environment variable is properly set to point to the Java SDK installation directory.

If you are using a JVM other than Sun’s, you might need to edit the scripts used in the following section. For example, not all JVMs support the -server option that is used while starting the Coherence nodes, so you might need to remove it.

What’s in the box?

The first thing you should do after installing Coherence is become familiar with the structure of the Coherence installation directory.

There are four subdirectories within the Coherence home directory:

  • bin: This contains a number of useful batch files for Windows and shell scripts for Unix/Linux that can be used to start Coherence nodes or to perform various network tests
  • doc: This contains the Coherence API documentation, as well as links to online copies of Release Notes, User Guide, and Frequently Asked Questions documents
  • examples: This contains several basic examples of Coherence functionality
  • lib: This contains JAR files that implement Coherence functionality

Shell scripts on Unix
If you are on a Unix-based system, you will need to add execute permission to the shell scripts in the bin directory by executing the following command:

$ chmod u+x *.sh

Starting up the Coherence cluster

In order to get the Coherence cluster up and running, you need to start one or more Coherence nodes. The Coherence nodes can run on a single physical machine, or on many physical machines that are on the same network. The latter will definitely be the case for a production deployment, but for development purposes you will likely want to limit the cluster to a single desktop or laptop.

The easiest way to start a Coherence node is to run cache-server.cmd batch file on Windows or cache-server.sh shell script on Unix. The end result in either case should be similar to the following screenshot:

There is quite a bit of information on this screen, and over time you will become familiar with each section. For now, notice two things:

  • At the very top of the screen, you can see the information about the Coherence version that you are using, as well as the specific edition and the mode that the node is running in. Notice that by default you are using the most powerful, Grid Edition, in development mode.
  • The MasterMemberSet section towards the bottom lists all members of the cluster and provides some useful information about the current and the oldest member of the cluster.

Now that we have a single Coherence node running, let’s start another one by running the cache-server script in a different terminal window.

For the most part, the output should be very similar to the previous screen, but if everything has gone according to the plan, the MasterMemberSet section should reflect the fact that the second node has joined the cluster:

MasterMemberSet
(
ThisMember=Member(Id=2, ...)
OldestMember=Member(Id=1, ...)
ActualMemberSet=MemberSet(Size=2, BitSetCount=2
Member(Id=1, ...)
Member(Id=2, ...)
)
RecycleMillis=120000
RecycleSet=MemberSet(Size=0, BitSetCount=0)
)

You should also see several log messages on the first node’s console, letting you know that another node has joined the cluster and that some of the distributed cache partitions were transferred to it.

If you can see these log messages on the first node, as well as two members within the ActualMemberSet on the second node, congratulations—you have a working Coherence cluster.

Troubleshooting cluster start-up

In some cases, a Coherence node will not be able to start or to join the cluster. In general, the reason for this could be all kinds of networking-related issues, but in practice a few issues are responsible for the vast majority of problems.

Multicast issues

By far the most common issue is that multicast is disabled on the machine. By default, Coherence uses multicast for its cluster join protocol, and it will not be able to form the cluster unless it is enabled. You can easily check if multicast is enabled and working properly by running the multicast-test shell script within the bin directory.

If you are unable to start the cluster on a single machine, you can execute the following command from your Coherence home directory:

$ . bin/multicast-test.sh –ttl 0

This will limit time-to-live of multicast packets to the local machine and allow you to test multicast in isolation. If everything is working properly, you should see a result similar to the following:

Starting test on ip=Aleks-Mac-Pro.home/192.168.1.7,
group=/237.0.0.1:9000, ttl=0
Configuring multicast socket...
Starting listener...
Fri Aug 07 13:44:44 EDT 2009: Sent packet 1.
Fri Aug 07 13:44:44 EDT 2009: Received test packet 1 from self
Fri Aug 07 13:44:46 EDT 2009: Sent packet 2.
Fri Aug 07 13:44:46 EDT 2009: Received test packet 2 from self
Fri Aug 07 13:44:48 EDT 2009: Sent packet 3.
Fri Aug 07 13:44:48 EDT 2009: Received test packet 3 from self

If the output is different from the above, it is likely that multicast is not working properly or is disabled on your machine.

This is frequently the result of a firewall or VPN software running, so the first troubleshooting step would be to disable such software and retry. If you determine that was indeed the cause of the problem you have two options. The first, and obvious one, is to turn the offending software off while using Coherence.

However, for various reasons that might not be an acceptable solution, in which case you will need to change the default Coherence behavior, and tell it to use the Well-Known Addresses (WKA) feature instead of multicast for the cluster join protocol.

Doing so on a development machine is very simple—all you need to do is add the following argument to the JAVA_OPTS variable within the cache-server shell script:

-Dtangosol.coherence.wka=localhost

With that in place, you should be able to start Coherence nodes even if multicastis disabled.

Localhost and loopback address
On some systems, localhost maps to a loopback address, 127.0.0.1. If that’s the case, you will have to specify the actual IP address or host name for the tangosol.coherence.wka configuration parameter. The host name should be preferred, as the IP address can change as you move from network to network, or if your machine leases an IP address from a DHCP server.

As a side note, you can tell whether the WKA or multicast is being used for the cluster join protocol by looking at the section above the MasterMemberSet section when the Coherence node starts.

If multicast is used, you will see something similar to the following:

Group{Address=224.3.5.1, Port=35461, TTL=4}

The actual multicast group address and port depend on the Coherence version being used. As a matter of fact, you can even tell the exact version and the build number from the preceding information. In this particular case, I am using Coherence 3.5.1 release, build 461.

This is done in order to prevent accidental joins of cluster members into an existing cluster. For example, you wouldn’t want a node in the development environment using newer version of Coherence that you are evaluating to join the existing production cluster, which could easily happen if the multicast group address remained the same.

On the other hand, if you are using WKA, you should see output similar to the following instead:

WellKnownAddressList(Size=1,
WKA{Address=192.168.1.7, Port=8088}
)

Using the WKA feature completely disables multicast in a Coherence cluster, and is recommended for most production deployments, primarily due to the fact that many production environments prohibit multicast traffic altogether, and that some network switches do not route multicast traffic properly.

That said, configuring WKA for production clusters is out of the scope of this article, and you should refer to Coherence product manuals for details.

Binding issues

Another issue that sometimes comes up is that one of the ports that Coherence attempts to bind to is already in use and you see a bind exception when attempting to start the node.

By default, Coherence starts the first node on port 8088, and increments port number by one for each subsequent node on the same machine. If for some reason that doesn’t work for you, you need to identify a range of available ports for as many nodes as you are planning to start (both UDP and TCP ports with the same numbers must be available), and tell Coherence which port to use for the first node by specifying the tangosol.coherence.localport system property. For example, if you want Coherence to use port 9100 for the first node, you will need to add the following argument to the JAVA_OPTS variable in the cache-server shell script:

-Dtangosol.coherence.localport=9100

LEAVE A REPLY

Please enter your comment!
Please enter your name here