My Adventures in Coding

July 11, 2019

Cassandra – Restarting a Node in a Cluster

Filed under: Cassandra — Brian @ 9:25 pm

Overview

If you need to restart a node in a Cassandra cluster, there are a few steps that are important to follow.

It is important to note two things:

  • When you write to a node in a Cassandra cluster, it writes to an in-memory table (memtable) as well as to the commit log, and it periodically flushes the memtables to disk.
  • Data in a Cassandra cluster is replicated to multiple nodes (how many depends on your replication factor), so when you write a new value to one node, Cassandra replicates that update to other nodes.

For these two reasons it is important to drain the node when restarting Cassandra. Draining does two things:

  • It will force the node to flush all in-memory tables to disk
  • It will stop the node from receiving updates from all other nodes in the cluster

NOTE: If you restart Cassandra without shutting it down cleanly, it will recover. It will use the commit logs to rebuild the missing data and sync with the other nodes, but this will also slow down the startup process.
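To make the difference between a clean and an unclean shutdown concrete, here is a toy Java model of the write path described above. It is only a sketch and not how Cassandra is implemented: the class ToyNode and all of its methods are made up for illustration. It shows that after a drain there is nothing left in the commit log to replay, while after an unclean stop every unflushed write has to be replayed on startup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one node's write path. Hypothetical, NOT Cassandra's real code. */
class ToyNode {
    private final List<String[]> commitLog = new ArrayList<>();           // survives a crash
    private final Map<String, String> memtable = new TreeMap<>();         // lost on a crash
    private final List<Map<String, String>> sstables = new ArrayList<>(); // flushed to "disk"
    private boolean acceptingWrites = true;

    /** A write goes to the commit log and to the in-memory table. */
    void write(String key, String value) {
        if (!acceptingWrites) {
            throw new IllegalStateException("node is drained");
        }
        commitLog.add(new String[] {key, value});
        memtable.put(key, value);
    }

    /** Flush the memtable to "disk"; flushed writes no longer need the commit log. */
    void flush() {
        if (!memtable.isEmpty()) {
            sstables.add(new TreeMap<>(memtable));
            memtable.clear();
        }
        commitLog.clear();
    }

    /** Drain in this model: stop accepting writes, then flush everything. */
    void drain() {
        acceptingWrites = false;
        flush();
    }

    /** Simulate a restart: memory is lost and the commit log is replayed. */
    int restartAndReplay() {
        memtable.clear();
        for (String[] entry : commitLog) {
            memtable.put(entry[0], entry[1]);
        }
        acceptingWrites = true;
        return commitLog.size(); // number of writes that had to be replayed
    }

    int commitLogSize() {
        return commitLog.size();
    }

    public static void main(String[] args) {
        ToyNode clean = new ToyNode();
        clean.write("a", "1");
        clean.drain();
        System.out.println("Replayed after drain: " + clean.restartAndReplay());

        ToyNode unclean = new ToyNode();
        unclean.write("a", "1");
        unclean.write("b", "2");
        System.out.println("Replayed without drain: " + unclean.restartAndReplay());
    }
}
```

The replay step is the part that slows down startup after an unclean shutdown. In this model it is a simple loop, but the idea is the same.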

Stop Cassandra

Drain the node. This will flush all in-memory tables to disk and stop the node from receiving traffic from other nodes.

nodetool drain

Check the status to see if the drain operation has finished.

systemctl status cassandra

Now it is safe to stop Cassandra.

systemctl stop cassandra

Start Cassandra

systemctl start cassandra

Check the status of the Cassandra service.

systemctl status cassandra

Monitor the startup state using journalctl.

journalctl -f -u cassandra

Cassandra has started successfully when you see the following message:

Note: Watch for any errors where Cassandra is having trouble syncing with other nodes.
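If you restart nodes often, the drain, stop, and start steps above could be wrapped in a small helper. The following Java sketch is one way to do it, not a tested operations tool. The class NodeRestart and its methods are hypothetical names, and the only commands it knows are the three from this post. It stops at the first command that fails, so Cassandra is never stopped if the drain did not succeed. By default it only prints the commands.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical helper that runs the restart steps from this post in order. */
class NodeRestart {
    static final List<List<String>> STEPS = Arrays.asList(
            Arrays.asList("nodetool", "drain"),
            Arrays.asList("systemctl", "stop", "cassandra"),
            Arrays.asList("systemctl", "start", "cassandra"));

    /** Runs each step with the given runner and stops at the first non-zero exit code. */
    static List<List<String>> run(ToIntFunction<List<String>> runner) {
        List<List<String>> completed = new ArrayList<>();
        for (List<String> step : STEPS) {
            if (runner.applyAsInt(step) != 0) {
                break;
            }
            completed.add(step);
        }
        return completed;
    }

    /** Runner that really executes a command on this machine and returns its exit code. */
    static int execute(List<String> command) {
        try {
            return new ProcessBuilder(command).inheritIO().start().waitFor();
        } catch (IOException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        // Dry run: print the commands instead of executing them.
        run(step -> {
            System.out.println(String.join(" ", step));
            return 0;
        });
        // To execute for real (needs the right privileges): run(NodeRestart::execute);
    }
}
```

Passing the runner in as a parameter keeps the ordering logic separate from actually running commands, which makes a dry run possible.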

I hope that helps!

April 3, 2019

Cassandra – Fix Schema Disagreement

Filed under: Cassandra — Brian @ 8:39 am

Recently we had an issue when adding a new table to a Cassandra cluster (version 3.11.2). We added a new table create statement to our Java application and deployed the application to our Cassandra cluster in each of our environments. The table was created, we could read and write data, and everything was fine.

However, when we deployed to our production Cassandra cluster, the new table was created, but we were unable to query the table from any node in the cluster. When our Java application tried to do a select from the table it would get the error:

Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)

We tried connecting to each node in the cluster using CQLSH, but we still had the same issue. On every node Cassandra knew about the table and we could see the schema definition for the table, but when we tried to query it we would get the following error:

ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' response] 
message="Operation timed out - received only 0 responses." info={'received_response': 0, 'required_response': 1, 'consistency': 'ONE'}

We decided to try describe cluster to see if we could get any useful info:

nodetool describecluster

There was our problem! We had a schema disagreement! Three nodes of our six node cluster were on a different schema:

Cluster Information:
        Name: OurCassandraCluster
        Snitch: org.apache.cassandra.locator.SimpleSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                819a3ce1-a42c-3ba9-bd39-7c015749f11a: [10.111.22.103, 10.111.22.105, 10.111.22.104]

                134b246c-8d42-31a7-afd1-71e8a2d3d8a3: [10.111.22.102, 10.111.22.101, 10.111.22.106]
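If you want to check for this condition from code rather than by eye, the "Schema versions" section of the output is easy to parse. The Java sketch below is a minimal example that assumes the output looks like the listing above, with one "version: [nodes]" line per schema version. The class SchemaVersions is a hypothetical name, and lines that do not start with a 36 character version id are simply ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "Schema versions" lines of nodetool describecluster. */
class SchemaVersions {
    // Matches lines such as "819a3ce1-...-7c015749f11a: [10.111.22.103, 10.111.22.105]"
    private static final Pattern VERSION_LINE =
            Pattern.compile("^\\s*([0-9a-fA-F-]{36}):\\s*\\[(.*)\\]\\s*$");

    /** Returns schema version -> nodes, in the order the versions appear in the output. */
    static Map<String, List<String>> parse(String describeClusterOutput) {
        Map<String, List<String>> versions = new LinkedHashMap<>();
        for (String line : describeClusterOutput.split("\\R")) {
            Matcher m = VERSION_LINE.matcher(line);
            if (!m.matches()) {
                continue;
            }
            List<String> nodes = new ArrayList<>();
            for (String node : m.group(2).split(",")) {
                if (!node.trim().isEmpty()) {
                    nodes.add(node.trim());
                }
            }
            versions.put(m.group(1), nodes);
        }
        return versions;
    }

    /** More than one schema version means the nodes disagree. */
    static boolean hasDisagreement(String describeClusterOutput) {
        return parse(describeClusterOutput).size() > 1;
    }

    public static void main(String[] args) {
        String output = String.join("\n",
                "Cluster Information:",
                "        Name: OurCassandraCluster",
                "        Schema versions:",
                "                819a3ce1-a42c-3ba9-bd39-7c015749f11a: [10.111.22.103, 10.111.22.105, 10.111.22.104]",
                "",
                "                134b246c-8d42-31a7-afd1-71e8a2d3d8a3: [10.111.22.102, 10.111.22.101, 10.111.22.106]");
        System.out.println("Schema disagreement: " + hasDisagreement(output));
        parse(output).forEach((version, nodes) -> System.out.println(version + " -> " + nodes));
    }
}
```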

We checked DataStax, which had the article Handling Schema Disagreements. However, the official documentation was sparse and assumed a node was unreachable.

In our case all the nodes were reachable, the cluster was functioning fine, and all previously added tables were receiving traffic. Only the new table we had just added was having a problem.

We found a Stack Overflow post suggesting that a fix for the schema disagreement issue was to cycle the nodes, one at a time. We tried that and it worked. The following are the steps that worked for us.

Steps to Fix Schema Disagreement

If there are more nodes in one schema than in the other, you can start by trying to restart a Cassandra node in the smaller list and see if it joins the other schema list.

In our case we had exactly three nodes on each schema. In this case it is more likely the nodes in the first schema are the ones that Cassandra will pick during a schema negotiation, so try the following instructions on one of the nodes in the second schema list.
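The rule of thumb in the two paragraphs above (start with the smaller list, and on a tie start with the second list) can be written down as a few lines of Java. This is only a sketch of that rule as we applied it, not something Cassandra provides. The class RestartCandidates is a hypothetical name, and it expects the schema versions in the same order that nodetool describecluster listed them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that picks which schema list to start restarting. */
class RestartCandidates {
    /**
     * Returns the nodes of the smallest schema list, or of the later-listed
     * list when sizes are tied. Returns an empty list if there is no disagreement.
     * The map must iterate in the order the versions were listed (e.g. LinkedHashMap).
     */
    static List<String> pick(Map<String, List<String>> schemaVersions) {
        if (schemaVersions.size() < 2) {
            return Collections.emptyList();
        }
        List<String> candidates = Collections.emptyList();
        int smallest = Integer.MAX_VALUE;
        for (List<String> nodes : schemaVersions.values()) {
            if (nodes.size() <= smallest) { // "<=" sends a tie to the later list
                smallest = nodes.size();
                candidates = nodes;
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Map<String, List<String>> versions = new LinkedHashMap<>();
        versions.put("819a3ce1-a42c-3ba9-bd39-7c015749f11a",
                Arrays.asList("10.111.22.103", "10.111.22.105", "10.111.22.104"));
        versions.put("134b246c-8d42-31a7-afd1-71e8a2d3d8a3",
                Arrays.asList("10.111.22.102", "10.111.22.101", "10.111.22.106"));
        System.out.println("Restart one of these first: " + pick(versions));
    }
}
```

With the three versus three split from our cluster this picks the second list, which is the one we started with.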

Connect to a node

Connect to one of the nodes in the second schema list. For this example let's pick node “10.111.22.102”.

Restart Cassandra

First, drain the node. This will flush all memtables to disk and stop the node from receiving traffic from other nodes.

nodetool drain

Now, check the status to see if the drain operation has finished.

systemctl status cassandra

You should see in the output that the drain operation was completed successfully.
[Screenshot: drained node confirmation]

Stop Cassandra

systemctl stop cassandra

Start Cassandra

systemctl start cassandra

Verify Cassandra is up

Let's check the journal to ensure Cassandra has restarted successfully.

journalctl -f -u cassandra

When you see the following message, it means Cassandra has finished restarting and is ready for clients to connect.

[Screenshot: Cassandra startup completed]

Verify Schema Issue Fixed For Node

Now that Cassandra is back up, run the describe cluster command again to see if the node has switched to the other schema:

nodetool describecluster

If all has gone well, you should see that node “10.111.22.102” has moved to the other schema list (Note: The node list is not sorted by IP):

Cluster Information:
        Name: OurCassandraCluster
        Snitch: org.apache.cassandra.locator.SimpleSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                819a3ce1-a42c-3ba9-bd39-7c015749f11a: [10.111.22.103, 10.111.22.102, 10.111.22.105, 10.111.22.104]

                134b246c-8d42-31a7-afd1-71e8a2d3d8a3: [10.111.22.101, 10.111.22.106]

If Node Schema Did Not Change

If this did not work, it means the other schema is the one Cassandra has decided is the authority, so repeat these steps on the nodes in the first schema list instead.

Fixed Cluster Schema

Once you have completed the above steps on each node, all nodes should now be on a single schema:

Cluster Information:
        Name: OurCassandraCluster
        Snitch: org.apache.cassandra.locator.SimpleSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                819a3ce1-a42c-3ba9-bd39-7c015749f11a: [10.111.22.103, 10.111.22.102, 10.111.22.101, 10.111.22.106, 10.111.22.105, 10.111.22.104]

I hope that helps!

February 22, 2019

Cassandra – FSReadError on Startup

Filed under: Cassandra — Brian @ 5:43 pm

Recently we encountered an error with one node in a Cassandra cluster where the Cassandra service said it was running but we would get a failure when we tried to connect:

# cqlsh
Connection error: ('Unable to connect to any servers', 
{'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. 
Last error: Connection refused")})

So we decided to tail the journal to see if we could find a useful error message:

journalctl -f -u cassandra

While monitoring the journal output we saw the following exception recurring roughly every minute:

Nov 19 04:17:35 cassdb188 cassandra[17259]: Exception (org.apache.cassandra.io.FSReadError) encountered during startup: java.io.EOFException
Nov 19 04:17:35 cassdb188 cassandra[17259]: FSReadError in /var/lib/cassandra/hints/b22dfb1b-6a6e-44ce-9c7c-fda1e75293af-1542627895660-1.hints
Nov 19 04:17:35 cassdb188 cassandra[17259]: at org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:235)

The file that Cassandra was having problems with was a zero-byte hints file.

The following Stack Overflow post suggested that to resolve the problem you just needed to remove this file. We tried this solution and it worked.
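If more than one hints file could be affected, a quick scan of the hints folder will list every empty one. The Java sketch below is a minimal example under a few assumptions: the hints folder is the /var/lib/cassandra/hints path seen in the error above (pass a different path as the first argument if yours differs), and the class name EmptyHintFiles is made up. It only lists files and does not move or delete anything.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper that lists zero-byte *.hints files in a folder. */
class EmptyHintFiles {
    static List<Path> find(Path hintsDir) throws IOException {
        try (Stream<Path> files = Files.list(hintsDir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".hints"))
                    .filter(p -> p.toFile().length() == 0) // zero bytes on disk
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the error message in this post.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/var/lib/cassandra/hints");
        if (!Files.isDirectory(dir)) {
            System.out.println("Not a directory: " + dir);
            return;
        }
        find(dir).forEach(System.out::println);
    }
}
```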

Steps to Fix FSReadError Startup Problem

Stop Cassandra

systemctl stop cassandra

Move the suspect hint file into a temporary folder (just to be safe)

mv /var/lib/cassandra/hints/b22dfb1b-6a6e-44ce-9c7c-fda1e75293af-1542627895660-1.hints /tmp

Start Cassandra

systemctl start cassandra

Verify the error has stopped

journalctl -f -u cassandra

Now verify you can connect using CQLSH

# cqlsh
Connected to PointServiceClusterV3 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>

Note: In our case this happened on a Cassandra instance in a test environment that had not been shut down cleanly, so there was no worry about data integrity. However, if this had happened on a node in a production environment I would recommend running nodetool repair on the node.

nodetool repair

I hope that helps!

January 29, 2019

Cassandra – Switching from SimpleStrategy to NetworkTopologyStrategy

When we started using Cassandra I set up clusters for our test, staging, and production environments. Then we created the initial keyspace for our application, added tables, started using them, and everything worked fine.

Later we decided that for our up-time requirements we wanted to have a second cluster in another data center to act as a hot fail-over on production. No problem, Cassandra has us covered. However, when we had originally created our application’s keyspace, it was created with the default replication strategy SimpleStrategy. For a fail-over cluster we need Cassandra to be configured with the NetworkTopologyStrategy. No big deal right, should be an easy fix?

After reading through the documentation on Changing keyspace replication strategy, I was left with one question:

“What do I use for the data center name?”.

With SimpleStrategy you specify the number of nodes to which you want each item replicated with the parameter “replication_factor”, for example (‘replication_factor’ : 3). With NetworkTopologyStrategy, however, you use the data center name as the key to specify how many nodes in that data center should hold copies of the data, for example (‘mydatacentername’ : 3). I was worried that if I altered the strategy on one node and the cluster then thought the node was not part of the same data center, this would cause some serious problems.

Fortunately, it turns out that Cassandra has a default data center name, “datacenter1”, which you can use when making this switch. Kudos to the person who replied to this Stack Overflow post.

Of course I was not going to try this switch out on any of our clusters until I was confident it would work. I set up a test cluster using SimpleStrategy with the replication factor set to 3, added data to the cluster, and ran a nodetool repair. Then I altered the strategy for the keyspace and verified that, as expected, nothing had changed. Finally I ran nodetool repair again and once again verified all my data was intact. So it worked as promised.

Switch Replication Strategy

Note: In this example, the keyspace we are switching the replication strategy on is called “examplekeyspace”.

Open a cqlsh prompt on any node in the cluster

Check the current replication strategy

SELECT * FROM system_schema.keyspaces;

[Screenshot: keyspaces before the change]

Verify the default data center name

SELECT data_center FROM system.local;

[Screenshot: data center name]

Alter the existing Keyspace

Alter the keyspace using the data center name (make sure you copy it exactly!) as the name of the replication option, and set the number of nodes for replication to be the same as before.

ALTER KEYSPACE ExampleKeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};
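Since our application is written in Java, the same statement could also be built in code, which makes the difference between the two strategies easy to see: SimpleStrategy takes a single ‘replication_factor’ entry, while NetworkTopologyStrategy takes one entry per data center name. The helper below is a hypothetical sketch (the class ReplicationCql is not part of any driver) and it does no escaping, so it assumes plain keyspace and data center names. It writes the replica counts as unquoted numbers, which CQL also accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds ALTER KEYSPACE statements for both strategies. */
class ReplicationCql {
    /** SimpleStrategy: one replication factor for the whole cluster. */
    static String simpleStrategy(String keyspace, int replicationFactor) {
        return String.format(
                "ALTER KEYSPACE %s WITH replication = "
                        + "{'class': 'SimpleStrategy', 'replication_factor': %d};",
                keyspace, replicationFactor);
    }

    /** NetworkTopologyStrategy: one replica count per data center name. */
    static String networkTopologyStrategy(String keyspace, Map<String, Integer> replicasPerDataCenter) {
        StringBuilder options = new StringBuilder("'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : replicasPerDataCenter.entrySet()) {
            options.append(String.format(", '%s': %d", dc.getKey(), dc.getValue()));
        }
        return String.format("ALTER KEYSPACE %s WITH replication = {%s};", keyspace, options);
    }

    public static void main(String[] args) {
        Map<String, Integer> replicas = new LinkedHashMap<>();
        replicas.put("datacenter1", 3); // name must match SELECT data_center FROM system.local
        System.out.println(simpleStrategy("examplekeyspace", 3));
        System.out.println(networkTopologyStrategy("examplekeyspace", replicas));
    }
}
```

The resulting string can be passed to session.execute() in the same way as the statements in the Java getting started post.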

Now if you check the keyspace on each node in the cluster you will see that the replication strategy is now NetworkTopologyStrategy.

SELECT * FROM system_schema.keyspaces;

[Screenshot: keyspace altered to NetworkTopologyStrategy]

Nodetool Repair

Switching the replication strategy does not cause any data to be moved between nodes; you would need to run nodetool repair to do that. However, if all you are doing is switching an existing cluster with a single rack and data center from SimpleStrategy to NetworkTopologyStrategy, it should not require any data to be moved. But if you would like to be thorough, it does not hurt to run a nodetool repair.

Run nodetool repair

nodetool repair -pr examplekeyspace

The option “-pr” (primary range only) means repair only covers the token ranges for which the node where it is run is the primary replica, syncing that data with the other nodes that hold replicas of it. Because each node only repairs its own primary ranges, make sure to run repair on every node, but only do ONE node at a time.

Conclusion

When I started using Cassandra I did not realize how much of a limitation SimpleStrategy imposes on data replication. If all you want is a single rack in a single data center, SimpleStrategy works. However, if there is even the slightest possibility you might one day add a failover cluster in another data center, or nodes in one data center but on multiple racks, use NetworkTopologyStrategy. Personally, for anything other than a local test cluster, I would always go with NetworkTopologyStrategy.

That is all!

January 27, 2018

Cassandra – Getting Started with Java

Filed under: Cassandra — Brian @ 9:15 pm

Cassandra is a great tool for storing time series data and I happen to be using it on my current project for that exact purpose.

There are several ways to use Cassandra from Java and many ways to improve performance, but here I just want to provide a simple “Getting Started” example. So here it is!

First, download the current version of Cassandra V3 from here.

Extract the tar.gz file:

 tar -zxvf apache-cassandra-3.11.1-bin.tar.gz
 

Change directory into the bin folder:

 cd apache-cassandra-3.11.1/bin
 

Start Cassandra:

 ./cassandra -f
 

Create a Java project. If you are using Maven, you can add the following dependency to your pom.xml file:

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.3.0</version>
</dependency>

Here is a simple Java example showing how to connect to Cassandra, create a keyspace, create a table, insert a row, and select a row:

import com.datastax.driver.core.*;

import java.time.Instant;
import java.time.ZoneId;
import java.util.Date;
import java.util.UUID;

public class CassandraV3Tutorial {

    private final static String KEYSPACE_NAME = "example_keyspace";
    private final static String REPLICATION_STRATEGY = "SimpleStrategy";
    private final static int REPLICATION_FACTOR = 1;
    private final static String TABLE_NAME = "example_table";

    public static void main(String[] args) {

        // Setup a cluster to your local instance of Cassandra
        Cluster cluster = Cluster.builder()
                .addContactPoint("localhost")
                .withPort(9042)
                .build();

        // Create a session to communicate with Cassandra
        Session session = cluster.connect();

        // Create a new Keyspace (database) in Cassandra
        String createKeyspace = String.format(
                "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = " +
                        "{'class':'%s','replication_factor':%s};",
                KEYSPACE_NAME,
                REPLICATION_STRATEGY,
                REPLICATION_FACTOR
        );
        session.execute(createKeyspace);

        // Create a new table in our Keyspace
        String createTable = String.format(
                "CREATE TABLE IF NOT EXISTS %s.%s " + "" +
                        "(id uuid, timestamp timestamp, value double, " +
                        "PRIMARY KEY (id, timestamp)) " +
                        "WITH CLUSTERING ORDER BY (timestamp ASC);",
                KEYSPACE_NAME,
                TABLE_NAME
        );
        session.execute(createTable);

        // Create an insert statement to add a new item to our table
        PreparedStatement insertPrepared = session.prepare(String.format(
                "INSERT INTO %s.%s (id, timestamp, value) values (?, ?, ?)",
                KEYSPACE_NAME,
                TABLE_NAME
        ));

        // Some example data to insert
        UUID id = UUID.fromString("1e4d26ed-922a-4bd2-85cb-6357b202eda8");
        Date timestamp = Date.from(Instant.parse("2018-01-01T01:01:01.000Z"));
        double value = 123.45;

        // Bind the data to the insert statement and execute it
        BoundStatement insertBound = insertPrepared.bind(id, timestamp, value);
        session.execute(insertBound);

        // Create a select statement to retrieve the item we just inserted
        PreparedStatement selectPrepared = session.prepare(String.format(
                "SELECT id, timestamp, value FROM %s.%s WHERE id = ?",
                KEYSPACE_NAME,
                TABLE_NAME));

        // Bind the id to the select statement and execute it
        BoundStatement selectBound = selectPrepared.bind(id);
        ResultSet resultSet = session.execute(selectBound);

        // Print the retrieved data
        resultSet.forEach(row -> System.out.println(
                String.format("Id: %s, Timestamp: %s, Value: %s",
                row.getUUID("id"),
                row.getTimestamp("timestamp").toInstant().atZone(ZoneId.of("UTC")),
                row.getDouble("value"))));

        // Close session and disconnect from cluster
        session.close();
        cluster.close();
    }
}

If you would like to look at the data in your local Cassandra database, you can use the CQLSH command line tool.

So from the bin folder type:

 ./cqlsh

This will take you to a “cqlsh>” prompt.

To view all available Keyspaces:

 DESCRIBE KEYSPACES;

You will now see our “example_keyspace” database:

 system_schema system system_traces
 system_auth system_distributed example_keyspace

To switch to that Keyspace:

 USE example_keyspace;

To show all tables in the keyspace:

 DESCRIBE TABLES;

You will be shown all tables, including “example_table”.

Now from the command line you can view the data in the table by using a select statement:

 select * from example_table;

Which will show the following information:

 id                                   | timestamp                       | value
--------------------------------------+---------------------------------+--------
 1e4d26ed-922a-4bd2-85cb-6357b202eda8 | 2018-01-01 01:01:01.000000+0000 | 123.45

I hope that helps!

Note: The documentation on the DataStax website is very good.
