In this post we will look at how to set up a Cassandra cluster on a local development machine. We will cover two possible cluster configurations: all nodes on the same machine, and nodes on different machines.

This is primarily useful for developers who want a sandbox Cassandra cluster to experiment with.

This is not recommended for a production environment.

1.0 Pre-requisites

You need to have Docker already installed on your local machine.

Please refer to the following document for Docker installation: Install Docker Engine

2.0 Cassandra cluster on a single machine

  • Create a Docker network
  • Create two Cassandra containers
  • Provide configuration to the containers so that they know each other's IP addresses

Let's get started.

2.1 Create Network

First we will create a Docker network.
Both Cassandra containers will be attached to this network.

You can create a new Docker network with the following command:

docker network create <network-name>

Let us create a network named cassandra-network
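With that name filled in, the command becomes concrete. The sketch below wraps it in a small shell function; create_network is a helper name made up for this post, not a Docker command. It first checks whether the network already exists, so it should be safe to run more than once.

```shell
# Network name used in the rest of this post.
NETWORK_NAME="cassandra-network"

# create_network is an illustrative helper, not part of Docker itself.
create_network() {
  # "docker network inspect" fails when the network does not exist yet,
  # and only in that case do we create it.
  docker network inspect "$NETWORK_NAME" >/dev/null 2>&1 \
    || docker network create "$NETWORK_NAME"
}
```

Call create_network once, then run docker network ls to confirm that cassandra-network is listed.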

2.2 Create the first Cassandra node

Now we will run our first Cassandra container.

  • The image used is cassandra:3.11. You can choose another version if you prefer.

You can run the Cassandra container with the following command:

docker run \
   --network <network-name> \
   --name <container-name> \
   -d cassandra:3.11

e.g.
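With the cassandra-network network from section 2.1 and a container named cassandra-node1, the command can be written out as below. It is wrapped in a function (start_first_node, a helper name invented for this post) so that the names sit in one place.

```shell
# Names assumed in this post: the network from section 2.1 and a
# container name of our choosing.
NETWORK_NAME="cassandra-network"
FIRST_NODE="cassandra-node1"

# start_first_node is an illustrative helper around "docker run".
start_first_node() {
  docker run \
    --network "$NETWORK_NAME" \
    --name "$FIRST_NODE" \
    -d cassandra:3.11
}
```

Running start_first_node starts the container in the background. docker ps should then list cassandra-node1.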

In this example the Cassandra container is named cassandra-node1 and Docker assigned it the IP address 172.21.0.2 (the address on your machine may differ).

You can check the logs of this container with the following command:
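If you want to look up the address yourself, docker inspect can print it. The helper below (container_ip is a name made up for this post) is a minimal sketch that prints the IP address for every network the container is attached to.

```shell
# container_ip prints the IP address(es) Docker assigned to a container.
# The Go template walks every network the container is attached to.
container_ip() {
  docker inspect \
    -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
    "$1"
}
```

For example, container_ip cassandra-node1 prints the address of the first node.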

docker logs -f <container-name>

e.g.

docker logs -f cassandra-node1

2.3 Create the second Cassandra node

When we run the second Cassandra container, we need it to form a cluster with the first one.

For this we pass an additional environment variable, CASSANDRA_SEEDS, set to either the container name or the IP address of the first Cassandra container.

docker run \
   --network <network-name> \
   --name <container-name> \
   -e CASSANDRA_SEEDS=<IP/Hostname of first Cassandra node> \
   -d cassandra:3.11

e.g.
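With the names used so far, the second node can be started as sketched below. The helper name start_second_node is invented for this post. The sketch uses the first container's name as the seed, on the assumption that both containers are on the same user-defined network, where Docker resolves container names.

```shell
NETWORK_NAME="cassandra-network"
FIRST_NODE="cassandra-node1"
SECOND_NODE="cassandra-node2"

# start_second_node is an illustrative helper around "docker run".
start_second_node() {
  # The first node's container name is passed as the seed; its IP
  # address would work as well.
  docker run \
    --network "$NETWORK_NAME" \
    --name "$SECOND_NODE" \
    -e CASSANDRA_SEEDS="$FIRST_NODE" \
    -d cassandra:3.11
}
```

Run start_second_node only after the first node has finished starting up, then follow its progress with docker logs -f cassandra-node2.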

In this example the second Cassandra container is named cassandra-node2 and its IP address is 172.21.0.3.

2.4 Cluster formation

We now have both Cassandra nodes running as Docker containers. Because we provided the CASSANDRA_SEEDS environment variable, the second node initiates communication with the first node and the two form a Cassandra cluster.

You can check the logs of the first container to verify this.

In those logs, a handshake is initiated between the two nodes, and the second node (IP address 172.21.0.3) joins the cluster with the first node.
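Besides reading the logs, you can ask Cassandra itself which nodes it sees. The sketch below runs nodetool status inside a container and counts the lines that start with UN (Up and Normal). The function name count_up_nodes is made up for this post.

```shell
# count_up_nodes prints how many nodes the given container reports as
# "UN" (Up and Normal) in the output of "nodetool status".
count_up_nodes() {
  docker exec "$1" nodetool status | grep -c '^UN'
}
```

Once the second node has finished joining, count_up_nodes cassandra-node1 should print 2. Running docker exec cassandra-node1 nodetool status directly shows the full table.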

2.5 Verify by creating tables on one of the Cassandra nodes

Let's verify the cluster by creating some tables on the first node.

We will start a temporary container that runs cqlsh against the first node, and then run some commands to create tables.

docker run -it \
   --network <network-name> \
   --rm cassandra:3.11 \
   cqlsh <cassandra-running-container-name>

In this example we connected to the first container (cassandra-node1) and created two tables: user and book_reviews.
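The statements themselves are not reproduced above, so here is a minimal sketch of what they could look like. The keyspace name demo and all column names are assumptions made for illustration. Only the table names user and book_reviews come from this post. The helper run_cql passes a single statement to cqlsh through its -e option.

```shell
NETWORK_NAME="cassandra-network"
FIRST_NODE="cassandra-node1"

# run_cql runs one CQL statement against the first node from a
# throwaway cqlsh container (illustrative helper).
run_cql() {
  docker run --rm \
    --network "$NETWORK_NAME" \
    cassandra:3.11 \
    cqlsh "$FIRST_NODE" -e "$1"
}

# create_demo_schema uses a hypothetical keyspace and hypothetical columns.
create_demo_schema() {
  run_cql "CREATE KEYSPACE IF NOT EXISTS demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};" &&
  run_cql "CREATE TABLE IF NOT EXISTS demo.user (id uuid PRIMARY KEY, name text);" &&
  run_cql "CREATE TABLE IF NOT EXISTS demo.book_reviews (book_id uuid, reviewer text, review text, PRIMARY KEY (book_id, reviewer));"
}
```

Calling create_demo_schema creates the keyspace and both tables. A replication factor of 2 is chosen here so that each of the two nodes holds a copy of the data.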

If you check the logs of the first container (cassandra-node1), you can see the table creation being executed on the first Cassandra node.

Now let's check the logs of the second container (cassandra-node2). We will find that the new tables have already been propagated to this node via schema migration.
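You can also confirm this from cqlsh rather than from the logs. The sketch below connects to whichever node you name and lists the tables it knows about. list_tables is a helper name invented for this post, and the network name is the one from section 2.1.

```shell
# list_tables asks the given node which tables it knows about, using a
# throwaway cqlsh container on the same Docker network.
list_tables() {
  docker run --rm \
    --network cassandra-network \
    cassandra:3.11 \
    cqlsh "$1" -e "DESCRIBE TABLES"
}
```

list_tables cassandra-node2 should include user and book_reviews, even though they were created through the first node.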


3.0 Cassandra cluster spanning different machines

What if we want to create a Cassandra cluster spanning different machines?

The process is almost the same, with some minor differences:

  • We need to run a Cassandra container on each of the two machines
  • Provide the IP address of the first node's host machine in the CASSANDRA_SEEDS environment variable
  • In addition, we also need to do the following
    • Publish port 7000, which Cassandra uses for communication between nodes
    • Set the environment variable CASSANDRA_BROADCAST_ADDRESS to the IP address of the current host
    • Optionally, provide a cluster name via the environment variable CASSANDRA_CLUSTER_NAME

This is how we can do it.

Run this on the first host (suppose its IP address is 192.168.1.32):

docker run --name cassandra-1 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.1.32 \
  -e CASSANDRA_CLUSTER_NAME="demo cluster" \
  -p 7000:7000 \
  -d cassandra:3.11

Run this on the second host (suppose its IP address is 192.168.1.11):

docker run --name cassandra-2 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.1.11 \
  -e CASSANDRA_SEEDS=192.168.1.32 \
  -e CASSANDRA_CLUSTER_NAME="demo cluster" \
  -p 7000:7000 \
  -d cassandra:3.11


If you check the logs of the first Cassandra container, you can see that as the second node goes down or comes back up, the first node detects the change and marks that node as down or up in the cluster.
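To check this without reading the logs, you can poll nodetool status on the first host until both nodes show up as UN (Up and Normal). The sketch below assumes the container name cassandra-1 from the command above. The function name wait_for_nodes and the POLL_SECONDS variable are made up for this post.

```shell
# wait_for_nodes CONTAINER EXPECTED [TRIES]
# Polls "nodetool status" inside CONTAINER until at least EXPECTED nodes
# are reported as UN, or gives up after TRIES attempts (default 30).
wait_for_nodes() {
  container="$1"
  expected="$2"
  tries="${3:-30}"
  while [ "$tries" -gt 0 ]; do
    # Count the status lines that start with UN (Up and Normal).
    up=$(docker exec "$container" nodetool status 2>/dev/null | grep -c '^UN')
    if [ "$up" -ge "$expected" ]; then
      return 0
    fi
    tries=$((tries - 1))
    # Wait between attempts; POLL_SECONDS is a hypothetical override.
    sleep "${POLL_SECONDS:-5}"
  done
  return 1
}
```

On the first host, wait_for_nodes cassandra-1 2 && echo "cluster is up" reports success once both nodes have joined. If you stop the container on the second host, the same call should eventually fail.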


With this we come to the end of this post.

In future posts we will cover more Cassandra topics such as partitions, clustering keys, and data modeling.