Migrating a Cassandra Cluster
So what, in the name of God, is Cassandra? Well, it is officially defined as: “Cassandra is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.”
Most of our infrastructure runs on containers, and Cassandra is no exception. We have three Cassandra nodes running in the datacenter, using plain Docker with volumes mapped onto the host. Easy, right? That’s basically it. When going to production, we needed to migrate all the data from the test environments (composed of that Cassandra cluster) to the live environments (more Cassandra nodes), which now run on both AWS instances and other hosts in the datacenter. We migrated 25 tables very easily by leveraging Amazon S3 and Docker. We just ran the following commands for each table and voilà, we moved all the data in around 1 hour.
Of course, it only took 1 hour after perfecting the process. Without the following cheat sheets, the task would have taken us no less than 4 to 5 hours, as we were moving 25 tables across multiple Cassandra nodes running in 4 Docker containers. If you know containers, you know that their IDs can be very long and quite confusing once you have something like 30 containers running on a single host. Either way, we were able to do it in the end by doing the following.
All the magic happens here:
For the table named keyword_volumes_index:
In step 2 we run the command “nodetool snapshot -t backup1 -cf keyword_volumes_index apollo” inside the Docker container; nodetool is the tool that actually takes the backup. We name the backup “backup1” and take it of the table “keyword_volumes_index” in the keyspace “apollo” (the keyspace is the name of the database).
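As a rough sketch of how that snapshot step could be wrapped so it is easy to repeat for every table, the helpers below run the same nodetool command through `docker exec`. The use of `docker exec`, the function names and the container name `cassandra-node1` are assumptions made for illustration; only the nodetool arguments come from the command quoted above.

```shell
# Take a snapshot of a single table inside a running Cassandra container.
# Assumption: nodetool is reachable through `docker exec` in that container.
# Usage: snapshot_table <container> <snapshot_name> <keyspace> <table>
snapshot_table() {
  docker exec "$1" nodetool snapshot -t "$2" -cf "$4" "$3"
}

# Snapshot several tables of one keyspace, stopping at the first failure.
# Usage: snapshot_tables <container> <snapshot_name> <keyspace> <table>...
snapshot_tables() {
  container=$1 tag=$2 keyspace=$3
  shift 3
  for table in "$@"; do
    snapshot_table "$container" "$tag" "$keyspace" "$table" || return 1
  done
}

# Example (hypothetical container name):
#   snapshot_table cassandra-node1 backup1 apollo keyword_volumes_index
```

Referring to the container by name rather than by ID also helps with the confusion of long container IDs when many containers share a host.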
After all of this we successfully migrated a Cassandra cluster with no pain nor tears!
I have written a new post about how to monitor Cassandra using Prometheus, because once you run Cassandra (or any other service) you need to not just monitor it but truly manage it!