Sunday, December 9, 2018

Migrating DataStax Enterprise (DSE) to Open Source Apache Cassandra

Before migrating from DSE to Apache Cassandra, change the replication strategy of any DSE-related keyspaces that use EverywhereStrategy to SimpleStrategy or NetworkTopologyStrategy. Otherwise, after installing the open source binaries, copy the DSE jar files that provide EverywhereStrategy into the same location as the open source jar files. The steps below then guide you through the process of migrating from DSE to Apache Cassandra.
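
For the replication change described above, the statements can be generated with a small shell function and reviewed before they are piped to cqlsh. This is only a sketch: the function name alter_replication is made up for illustration, and the datacenter name DC1, the replication factor 3 and the keyspace name dse_system are assumptions that must be replaced with the values from your own cluster.

```shell
#!/bin/sh
# Print an ALTER KEYSPACE statement for each keyspace name given after the
# datacenter name and replication factor. Nothing is executed here; the
# output is meant to be reviewed and then piped to cqlsh.
# alter_replication is a hypothetical helper, not part of DSE or Cassandra.
alter_replication() {
    dc="$1"
    rf="$2"
    shift 2
    for ks in "$@"; do
        printf "ALTER KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', '%s': %s};\n" "$ks" "$dc" "$rf"
    done
}

# Example (assumed names): alter_replication DC1 3 dse_system | cqlsh
```

Printing the statements instead of running them keeps the change reviewable, which matters because the list of affected keyspaces differs between clusters.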

These steps are valid only if you are running just the Cassandra workload, that is, you have not enabled Search, Spark, or Graph on the datacenter in question.

Step 1: Repair one node at a time so that the data will be consistent across all nodes.
nohup nodetool repair >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.repairlog 2>&1 &

Step 2: Clean up one node at a time to remove data the node no longer owns.
nohup nodetool cleanup >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.cleanuplog 2>&1 &
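
The background commands in Steps 1 and 2 write their output to a log but do not record whether the command succeeded. A minimal sketch of a wrapper that also appends the exit status is shown here. The names run_logged and LOG_DIR are illustrative, and plain hostname is used in the file name instead of hostname -i.

```shell
#!/bin/sh
# Run a command, append its output to a dated log file and record the exit
# status on the last line. run_logged and LOG_DIR are illustrative names.
LOG_DIR="${LOG_DIR:-/tmp}"

run_logged() {
    name="$1"
    shift
    log="$LOG_DIR/$(date +%Y-%m-%d)_$(hostname)_$name.log"
    "$@" >> "$log" 2>&1
    status=$?
    echo "exit status: $status" >> "$log"
    return "$status"
}

# Example: run_logged repair nodetool repair
```

With the status on the last line of the log, a quick tail of the file shows whether the node is ready for the next step.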

Step 3: Prepare the open source configuration files:
cassandra.yaml, cassandra-env.sh, jvm.options, snitch property file, logback.xml
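
When preparing the open source cassandra.yaml, it helps to have the cluster name and storage locations of the existing DSE configuration at hand, since cluster_name must match for a node to join the same cluster. The sketch below only prints those settings so they can be compared. The post does not say whether the new install should reuse the DSE data and commit log paths, so that decision is left open. The function name is hypothetical and the example path is an assumption about where a package install keeps the file.

```shell
#!/bin/sh
# Print the cluster name and storage locations from a cassandra.yaml file.
# show_storage_settings is a hypothetical helper name.
show_storage_settings() {
    awk '
        /^(cluster_name|commitlog_directory|saved_caches_directory|hints_directory):/ {
            print; in_list = 0; next
        }
        /^data_file_directories:/ { print; in_list = 1; next }
        in_list && /^[ \t]*-/     { print; next }
        { in_list = 0 }
    ' "$1"
}

# Example (assumed path): show_storage_settings /etc/dse/cassandra/cassandra.yaml
```

Commented-out settings are skipped on purpose, because those fall back to defaults that may differ between DSE and the open source release.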

Once the above steps are done, continue with the steps below on each node.

Step 4: Make sure swap space is disabled.
swapoff -a
Make it persistent by commenting out the swap entries in /etc/fstab.
Verify with: free -h
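
Commenting out the swap entries can be done by hand, or previewed with a small filter like the one sketched here. It prints a modified copy and leaves the input file untouched. The function name comment_out_swap is made up for illustration, and the filter assumes the standard fstab layout in which the third field is the filesystem type.

```shell
#!/bin/sh
# Print a copy of an fstab file with every active swap entry commented out.
# The input file is not modified. comment_out_swap is a hypothetical helper name.
comment_out_swap() {
    awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Example: comment_out_swap /etc/fstab > /tmp/fstab.new
#          diff /etc/fstab /tmp/fstab.new   # review before replacing /etc/fstab
```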

Step 5: Flush the memtables so that the data is written to immutable SSTables on disk.
nodetool flush

Step 6: Take a snapshot so that the data can be recovered.
nodetool snapshot

Step 7: Drain the node so that applications can no longer connect to that specific node.
nodetool drain
Verify that the commit logs were removed.
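
One way to do this check is to count the commit log segment files left in the commit log directory. A minimal sketch, assuming the segments follow the usual CommitLog-*.log naming and that the directory is the common package-install default. Both the function name and the default path are assumptions, so pass your own commitlog_directory if it differs.

```shell
#!/bin/sh
# Count commit log segment files left in a directory.
# count_commitlogs is a hypothetical helper; the default path below is the
# usual package-install location and is an assumption.
count_commitlogs() {
    dir="${1:-/var/lib/cassandra/commitlog}"
    find "$dir" -maxdepth 1 -type f -name 'CommitLog-*.log' | wc -l | tr -d ' '
}

# Example: count_commitlogs /var/lib/cassandra/commitlog
```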

Step 8: Stop the DSE services on the node and disable DSE at boot.
service dse stop
service opscenterd stop
service datastax-agent stop
chkconfig dse off

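Step 9: Remove the node from the ring by running the following on another node, where hostid is the Host ID of the stopped node as shown by nodetool status.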
nodetool removenode hostid
nodetool removenode status
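
The Host ID needed for removenode can be read from nodetool status on any live node. The sketch below picks it out for a given address by looking for the UUID-shaped field on the matching row. The function name host_id_for and the address in the example are placeholders.

```shell
#!/bin/sh
# Read `nodetool status` output on stdin and print the Host ID of the node
# whose address matches the first argument. host_id_for is a hypothetical
# helper name.
host_id_for() {
    awk -v ip="$1" '
        $2 == ip {
            for (i = 3; i <= NF; i++)
                if ($i ~ /^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+$/)
                    print $i
        }
    '
}

# Example (run on another live node; 10.0.0.2 is a placeholder address):
#   nodetool status | host_id_for 10.0.0.2
```

Matching on the shape of the field rather than on a fixed column position keeps the filter working when the Load column is printed as two words, such as a number followed by a unit.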

Step 10: Install the new open source binaries and point them at the respective configuration files.

Step 11: Start the Cassandra service
service cassandra start

Step 12: Repair the open source node.
nohup nodetool repair >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.repairlog 2>&1 &

Repeat the above steps one node at a time. Once all nodes are done, do the following.

a) Drop the unwanted keyspaces and manually delete their data on all nodes.
b) Uninstall DSE.
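
For item a), a dry-run helper can list the DROP statement for each unwanted keyspace together with the directory whose contents would then be removed by hand on every node. This is a sketch only: plan_keyspace_removal is a hypothetical name, dse_perf in the example is just a sample keyspace, and the data directory must match data_file_directories in your cassandra.yaml. Which keyspaces count as unwanted depends on the cluster.

```shell
#!/bin/sh
# For each keyspace name, print the DROP statement and the on-disk directory
# that would be left to remove by hand. Nothing is executed or deleted.
# plan_keyspace_removal is a hypothetical helper; the data directory is passed
# in explicitly because it depends on data_file_directories in cassandra.yaml.
plan_keyspace_removal() {
    data_dir="$1"
    shift
    for ks in "$@"; do
        printf "DROP KEYSPACE IF EXISTS %s;  // data left on disk: %s/%s\n" "$ks" "$data_dir" "$ks"
    done
}

# Example (assumed names): plan_keyspace_removal /var/lib/cassandra/data dse_perf
```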

***

3 comments:

  1. In step 12a, What do you mean by "delete the data on all nodes" ? What data?

    Replies
    1. nevermind. After rereading this, and getting a better understanding of what's in those keyspaces, i get it. the data to delete is just in those unwanted keyspaces. I'm all good now. Thank you for your write-up!

  2. Hi,

    Thanks for the post.

    In Step 3: Prepare Opensource files.

    Does this mean the data and commit_log locations/paths of DSE need to be updated?

    And will the newly installed Apache Cassandra tar/software use these locations/paths?
