I recently went through the exciting task of decommissioning an old node and then joining a new Cassandra node in its place in an existing data center. Although the documentation on the web is pretty good for this, I thought it was worth a post on a couple of the technicalities and “what to expect” questions I had before starting the process. We are using DataStax Enterprise 4.8.8 with Cassandra 2.1. The basic steps to decommission a node on a DC in Cassandra are outlined in the Cassandra 2.1 documentation.
Our case was to decommission a node and then add a new node in production without doing these basic steps:
- Turn off all clients writing to the DC (don’t forget Datastax-Agents)
- Run a repair to ensure your data is replicated out
- Change all Keyspaces so they don’t reference the datacenter
Instead, we only did the following:
- Run nodetool decommission on each node, one by one, and once it finished successfully, join a new node in its place.
The basic steps to join a node on a DC in Cassandra are outlined in the Cassandra 2.1 documentation.
Of note: the decommission process was not particularly fast, because we did it in a production environment without turning off the clients, the repair service was running on a schedule in OpsCenter 5.2.4, and the keyspaces were not changed. According to the decommission documentation, the node will be “streaming its data to the next node on the ring”. In practice there was a lot of data to stream, because we skipped the third step and keyspaces were still assigned to this DC. The decommission command took around 2 hours to run, and joining took about 40 minutes for me.
One thing to note is that decommissioning creates quite a bit of load on the other data centers in the cluster (we have four). It wasn’t enough to noticeably move the read/write response times, but it was enough that I wouldn’t do a bunch of nodes all at once. I put them in a loop with a 10-minute sleep between decommissioning an old node and joining the new node using the token ring of the old node, then a 30-minute sleep between joining a new node and decommissioning the next old node. Sure, the script takes a long time, around 3.5 hours, but safety first in production.
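To get a feel for how the waits add up, here is a rough back-of-the-envelope sketch. It assumes the approximate timings quoted above apply to each old/new pair, which will vary with data volume and load; the function name `estimate_pair_minutes` and the pair count are purely illustrative.

```bash
#!/bin/bash
# Rough duration estimate for one old/new node pair, in minutes.
# The defaults are the approximate figures from this post, not constants.
estimate_pair_minutes() {
  local decommission=${1:-120}      # decommission took around 2 hours
  local pause_before_join=${2:-10}  # sleep before joining the new node
  local join=${3:-40}               # joining took about 40 minutes
  local pause_before_next=${4:-30}  # sleep before the next decommission
  echo $(( decommission + pause_before_join + join + pause_before_next ))
}

pairs=6   # assumed number of old/new pairs to process
per_pair=$(estimate_pair_minutes)
echo "per pair: ${per_pair} min"
echo "all ${pairs} pairs: $(( per_pair * pairs )) min"
```

With these defaults one pair comes out at 200 minutes, so a run over several pairs is something to plan for overnight or across a day.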
Finally, a neat little piece of info: if you want to see what you did, check out nodetool gossipinfo. It will still show all your removed nodes and also show which method was used to remove each one.
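If you want just the status per node rather than the full gossip dump, a small filter like the one below can help. This is a sketch under assumptions: it expects each endpoint to start with a line containing `/<address>` followed by an indented `STATUS:` line, optionally with a numeric version prefix as some releases print. The function name `gossip_status` and the sample input are made up for illustration. As far as I know, a decommissioned node typically shows up as `LEFT`, while one removed with nodetool removenode shows up as `removed`, but check this against your own output.

```bash
#!/bin/bash
# Print "<address> <status>" for each endpoint found in gossipinfo-style
# text read from stdin.
gossip_status() {
  awk '
    # Endpoint header: a non-indented line containing "/", e.g. /10.0.0.1
    $0 !~ /^[ \t]/ && index($0, "/") {
      n = split($0, a, "/")
      addr = a[n]
    }
    # Indented STATUS line, e.g. "  STATUS:LEFT,<token>,<expiry>"
    /^[ \t]*STATUS:/ {
      line = $0
      sub(/^[ \t]*STATUS:/, "", line)
      sub(/^[0-9]+:/, "", line)   # drop an optional version prefix
      split(line, parts, ",")
      print addr, parts[1]
    }
  '
}

# Illustrative sample only; real output contains many more fields.
gossip_status <<'EOF'
/10.252.72.46
  generation:1480000000
  heartbeat:2000
  STATUS:LEFT,-9001,1480300000
/10.252.72.134
  generation:1480000001
  heartbeat:150
  STATUS:NORMAL,-9001
EOF
```

On a live cluster the equivalent would be `nodetool gossipinfo | gossip_status`.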
We also used the nodetool netstats command to check the number of files the decommission was streaming to other nodes.
Below is the shell script I used:
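To turn that into a single number you can watch, something like the following sketch could be used. It assumes netstats prints one `Sending N files, M bytes total` line per streaming target, which is what I saw on Cassandra 2.1; other versions may word it differently. The function name `count_streaming_files` and the sample input are made up for illustration.

```bash
#!/bin/bash
# Sum the file counts from "Sending N files" lines in netstats-style
# text read from stdin. Prints 0 when nothing is being streamed.
count_streaming_files() {
  awk '
    /Sending [0-9]+ files/ {
      for (i = 1; i < NF; i++)
        if ($i == "Sending") total += $(i + 1)
    }
    END { print total + 0 }
  '
}

# Illustrative sample only; the session id and byte counts are invented.
count_streaming_files <<'EOF'
Mode: LEAVING
Unbootstrap 00000000-0000-0000-0000-000000000000
    /10.252.72.47
        Sending 12 files, 1048576 bytes total
    /10.252.72.49
        Sending 30 files, 4194304 bytes total
EOF
# prints 42
```

On the node being decommissioned, the equivalent would be `nodetool netstats | count_streaming_files`.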
<pre>
#!/bin/bash
# Each entry is "old_node,new_node".
for i in 10.252.72.46,10.252.72.134 10.252.72.49,10.252.72.138 10.252.72.47,10.252.72.142 10.252.72.10,10.252.72.146 10.252.72.22,10.252.72.150 10.252.72.48,10.252.72.154
do
  oldnode=$(echo "$i" | cut -d',' -f1)
  newnode=$(echo "$i" | cut -d',' -f2)
  echo "Decommissioning node $oldnode from the datacenter CDC at $(date +%Y.%m.%d-%H.%M.%S)"
  ssh -t "$oldnode" /usr/bin/nodetool decommission
  while true; do
    echo "Checking to see if $oldnode has finished decommissioning yet at $(date +%Y.%m.%d-%H.%M.%S)"
    # nodetool status prints the state (e.g. UN) in the first column and the
    # address in the second, so match the address anywhere on the line
    STATUS=$(ssh -t "$oldnode" /usr/bin/nodetool status | grep -wF "$oldnode" | awk '{print $1}')
    # carry on once the old node no longer appears in the status output;
    # the extra x char is there so the conditional won't fail if $STATUS is null
    if [ "x${STATUS}" = "x" ]; then
      # wait 10 mins between decommissioning the old node and joining the new node
      sleep 600
      echo "Joining new node $newnode to the datacenter CDC at $(date +%Y.%m.%d-%H.%M.%S)"
      ssh -t "$newnode" sudo service dse start
      ssh -t "$newnode" sudo service datastax-agent start
      # sleeping for 5 mins because the command above returns before the server
      # is actually started! if we get some grep usage errors after this 5 minute
      # period, it's not a big deal, it just means we entered the while loop
      # prematurely. it will work itself out.
      sleep 300
      ssh -t "$newnode" sudo service datastax-agent start
      while true; do
        echo "Checking to see if $newnode has finished joining yet"
        STATUS1=$(ssh -t "$newnode" /usr/bin/nodetool status | grep -wF "$newnode" | awk '{print $1}')
        # break the loop when we achieve UN status, the extra x char is there
        # so the conditional won't fail if $STATUS1 is null
        if [ "x${STATUS1}" = "xUN" ]; then
          break
        fi
        echo "Still joining $newnode, sleeping for next 60 seconds"
        sleep 60
      done
      break
    fi
    echo "Still decommissioning $oldnode, sleeping for next 60 seconds"
    sleep 60
  done
  echo "End of decommissioning $oldnode and joining $newnode to the datacenter CDC at $(date +%Y.%m.%d-%H.%M.%S)"
  ssh -t "$oldnode" sudo service dse stop
  ssh -t "$newnode" sudo service datastax-agent stop
  # kill switch for this process, just touch /tmp/stop-migration to stop this process gracefully
  if [ -f /tmp/stop-migration ]; then
    echo "/tmp/stop-migration found, stopping migration"
    exit 0
  fi
  # wait 30 mins between decommissioning nodes
  echo "Sleeping 30 mins before decommissioning next old node"
  sleep 1800
done
</pre>