Read latency in Cassandra can be caused by:
1) Large partitions (>1GB): Cassandra has to load them into memory to serve reads, which leads to CPU/memory pressure.
sudo nodetool cfstats | grep -i 'maximum bytes' | awk '{print $5}' | sort -n | tail -n 5
2) Large number of SSTables: A read may have to check several SSTables for the same partition, so the more there are, the slower the read.
3) Large number of tombstones: Tombstones mark deleted data. If they are not purged properly, reads slow down because Cassandra has to scan past all of them.
cd /var/log/cassandra/;sudo unzip '*.zip';sudo grep -iE 'tombstoned? cells' system.log* | wc -l
4) Long GC pauses: A GC pause stops the process from doing anything else, so pauses should not exceed roughly 300ms.
cd /var/log/cassandra/;sudo unzip '*.zip';sudo grep -i 'young generation' system.log*
5) Too many compactions: This background operation competes with reads for disk I/O and CPU, and can starve read requests.
sudo nodetool compactionhistory
6) Data imbalance leading to hot nodes: A poor choice of partition key spreads the data unevenly, so a few nodes end up serving most of the requests.
7) Memory
8) CPU spikes can be caused by:
a) Disk I/O
Fix: moved the mount to local SSD and used replace_address to re-hydrate the data.
When the CPU is running hot, observe %util (should be <100) and %iowait (should be <100)
sudo iostat -x 2
b) Network i/o
dstat -lrvn 10
sudo iftop
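The limits from items 1 and 4 (partitions over 1GB, GC pauses over roughly 300ms) can be applied by a filter instead of being read off by eye. Below is a minimal sketch, not a tested tool: `big_partitions` and `long_gc_pauses` are hypothetical helper names, and both assume the output format of recent Cassandra versions (`Table:` and `Compacted partition maximum bytes:` lines in cfstats, `... GC in <N>ms` lines from GCInspector in system.log). Older versions print these differently, so adjust the patterns if nothing matches.

```shell
# Hypothetical helpers; both read text on stdin and only filter it.

# Print "keyspace.table max_bytes" for every table whose largest
# compacted partition exceeds the limit (default 1 GiB).
# Assumed input: output of `nodetool cfstats`.
big_partitions() {
    awk -v limit="${1:-1073741824}" '
        /^[ \t]*Keyspace/ { ks = $NF }     # "Keyspace : name" or "Keyspace: name"
        /^[ \t]*Table/    { tbl = $NF }    # "Table: name"
        /Compacted partition maximum bytes:/ {
            if ($NF + 0 > limit + 0) print ks "." tbl, $NF
        }
    '
}

# Print only the log lines whose GC pause exceeds the limit
# (default 300 ms). Assumed input: system.log lines containing
# "GC in <N>ms", as written by GCInspector.
long_gc_pauses() {
    awk -v limit="${1:-300}" '
        match($0, /GC in [0-9]+ms/) {
            # skip the 6 characters of "GC in " and drop the trailing "ms"
            ms = substr($0, RSTART + 6, RLENGTH - 8)
            if (ms + 0 > limit + 0) print
        }
    '
}

# Possible usage on a node:
#   sudo nodetool cfstats | big_partitions
#   sudo cat /var/log/cassandra/system.log* | long_gc_pauses 300
```

Both helpers take the limit as an optional first argument, so the same filter can be rerun with a stricter or looser value without editing it.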