Saturday, January 21, 2017

Possible causes for read latency in Cassandra

Read latency in Cassandra can be caused by:


1) Large partitions (>1GB): Cassandra has to load these into memory to serve reads, which leads to CPU and memory pressure.

sudo nodetool cfstats | grep -i 'maximum bytes' | awk '{print $5}' | sort -n | tail -n 5


2) Large number of SSTables: Cassandra needs to scan through these to serve read requests.
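
To see which tables hold the most SSTables, the per-table counts can be pulled out of nodetool cfstats. This is only a rough sketch: the function name sstable_counts is made up for illustration, and it assumes the "Table:" and "SSTable count:" labels that cfstats prints in recent versions (older versions label the table line differently, so check your own output first).

```shell
# Hypothetical helper: reads `nodetool cfstats` output on stdin and prints
# "<sstable count> <table>" pairs sorted by count, largest last.
# Assumption: table lines look like "Table: name" (or "Table (index): name")
# and count lines look like "SSTable count: N".
sstable_counts() {
    awk '
        /^[ \t]*Table( \(index\))?:/        { table = $NF }
        $1 == "SSTable" && $2 == "count:"   { print $3, table }
    ' | sort -n
}

# Usage on a node:
#   sudo nodetool cfstats | sstable_counts | tail -n 5
```

Tables at the bottom of the list are the ones where a single read is likely to touch the most SSTables.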


3) Large number of tombstones: Tombstones represent deleted data. If they are not purged properly they add read latency, since Cassandra has to scan all of them.

cd /var/log/cassandra/;sudo unzip '*.zip';sudo grep -iE 'tombstoned? cells' system.log* | wc -l


4) Long GC pauses: A GC pause stops the process from doing anything else. Therefore we cannot afford GC pauses longer than ~300ms.

cd /var/log/cassandra/;sudo unzip '*.zip';sudo grep -i 'young generation' system.log*
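
The grep above prints every young generation collection, so it can help to keep only the pauses over the ~300ms budget. A minimal sketch, assuming the GCInspector lines report the pause as "GC in <N>ms" (the format differs between Cassandra versions, so verify against your own log). The function name long_gc_pauses and its optional threshold argument are made up for illustration.

```shell
# Hypothetical helper: reads system.log lines on stdin and prints only the
# GCInspector lines whose pause is above a threshold in ms (default 300).
# Assumption: the pause is logged as "... GC in <N>ms. ...".
long_gc_pauses() {
    awk -v limit="${1:-300}" '
        /GCInspector/ {
            for (i = 1; i < NF; i++) {
                if ($i == "in" && $(i + 1) ~ /^[0-9]+ms/) {
                    ms = $(i + 1)
                    sub(/ms.*/, "", ms)          # "412ms." becomes "412"
                    if (ms + 0 > limit + 0) print
                    break
                }
            }
        }
    '
}

# Usage on a node:
#   sudo cat /var/log/cassandra/system.log* | long_gc_pauses 300
```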


5) Too many compactions: Compaction is a background operation that can starve read requests.

sudo nodetool compactionhistory


6) Data imbalance leading to hot nodes: A poor choice of partition key spreads the data unevenly, so a few nodes end up serving most of the requests.


7) Memory


8) CPU spikes can be caused by:

    a) Disk i/o

            Fix that worked for us: moved the data mount to local SSD and used replace_address to re-hydrate the data

            When the CPU is running hot, observe %util (should be <100) and %iowait (should be <100)

            sudo iostat -x 2

    b) Network i/o

            dstat -lrvn 10

            sudo iftop
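
For the disk check in 8a, the iostat output scrolls quickly, so it may be easier to print only the devices whose %util is close to saturation. A rough sketch, assuming %util is the last column of the device table in iostat -x. The function name busy_disks and the default cutoff of 90 are assumptions of mine, not something iostat defines.

```shell
# Hypothetical helper: reads `iostat -x` output on stdin and prints
# "<device> <%util>" for devices at or above a cutoff (default 90).
# Assumption: the device table starts with a "Device" header line, ends at a
# blank line, and %util is its last column.
busy_disks() {
    awk -v limit="${1:-90}" '
        $1 ~ /^Device/  { in_devices = 1; next }   # header row of the table
        NF == 0         { in_devices = 0 }         # blank line ends the table
        in_devices && $NF + 0 >= limit + 0 { print $1, $NF }
    '
}

# Usage on a node (3 samples, 2 seconds apart):
#   sudo iostat -x 2 3 | busy_disks 90
```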

