Some learnings: December 2018

Friday, December 21, 2018

Limiting Native transport request queue

To limit the maximum queue size for native transport requests in Cassandra, set the parameter below in the jvm.options file.

$ cat jvm.options | grep -i native_transport_requests

-Dcassandra.max_queued_native_transport_requests=4096

-Dcassandra.max_queued_native_transport_requests only helps limit the number of blocked threads; it does not change the number of active threads at all. In fact, it increases the size of the pending queue.

There is another setting, native_transport_max_threads, that can increase the number of active threads, but I have had very limited success with it in reducing the number of pending threads. Its default is 128.
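
To judge whether either setting helps, the Native-Transport-Requests row of nodetool tpstats can be watched before and after the change. The following is a minimal sketch, not a definitive tool: the function name ntr_summary is made up for illustration, and it assumes the usual column order (pool name, active, pending, completed, blocked, all time blocked), which may differ between Cassandra versions.

```shell
#!/bin/sh
# ntr_summary: read `nodetool tpstats` output on stdin and print the
# Native-Transport-Requests counters.
# Assumption: columns are Pool Name, Active, Pending, Completed, Blocked,
# All time blocked (check the header line on your version).
ntr_summary() {
    awk '$1 == "Native-Transport-Requests" {
        printf "active=%s pending=%s blocked=%s all_time_blocked=%s\n", $2, $3, $5, $6
    }'
}

# Usage on a node (not run here):
#   nodetool tpstats | ntr_summary
```

Reading from stdin keeps the helper independent of how nodetool is invoked (JMX credentials, remote host and so on).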

Reference:

https://issues.apache.org/jira/browse/CASSANDRA-11363

https://support.datastax.com/hc/en-us/articles/360031470531-High-blocked-NTR-count-during-increased-workload-on-Cassandra-node

***

Wednesday, December 12, 2018

How to get Linux/Unix epoch timestamp?

echo $(($(date +%s%N)/1000000))

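The command above prints the epoch time in milliseconds (nanoseconds divided by 1,000,000); date +%s prints it in seconds. With GNU date, date -d @SECONDS converts an epoch timestamp back to a human-readable date.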
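
Converting in both directions can be done from the shell itself. A small sketch, assuming GNU date (as shipped on Linux); the function names are made up for illustration.

```shell
#!/bin/sh
# Assumption: GNU date, which supports %N and -d @SECONDS.

# epoch_ms: current epoch time in milliseconds.
epoch_ms() {
    echo $(( $(date +%s%N) / 1000000 ))
}

# epoch_to_utc: convert an epoch timestamp in seconds to a UTC date string.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# ms_to_utc: same, for a millisecond timestamp (the fraction is dropped).
ms_to_utc() {
    epoch_to_utc $(( $1 / 1000 ))
}

epoch_to_utc 1545350400   # 2018-12-21 00:00:00 UTC
```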

***

Checking a Linux/Unix Process uptime and Start time

To check the process uptime, first get the process ID with the ps command.

Example: ps -ef | grep cassandra

Assuming the process ID is 29459:

Command to check process uptime:
$ ps -p 29459 -o etime

To get the process start time:

$ ps -eo pid,lstart,cmd | grep cassandra
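
Both values can be fetched for a single PID in one call. A minimal sketch, assuming the procps version of ps found on Linux; the function name proc_times is hypothetical.

```shell
#!/bin/sh
# proc_times: print PID, elapsed time and start time for one process.
# Assumption: procps `ps`, which supports the etime and lstart columns.
proc_times() {
    pid=$1
    # A trailing "=" after a column name suppresses the header line.
    ps -p "$pid" -o pid= -o etime= -o lstart=
}

proc_times $$   # demo: inspect the current shell
```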

***

How to delete large partitions (wide partitions) by partition key

Follow below steps:

Step 1: Find the partition key of the large partition in the system.log file.

Step 2: Check which nodes hold the key with the command

nodetool getendpoints keyspace_name table_name partition_key

Step 3: Retrieve the record from the SSTable. There are two steps for this.

a) Find which SSTable holds the key: nodetool getsstables keyspace_name table_name partition_key

b) Export the SSTable to JSON format to view the problematic key (on Cassandra 3.0 and later, sstabledump replaces sstable2json):

bin/sstable2json SSTABLE [-k KEY [-k KEY [...]]] [-x KEY [-x KEY [...]]] [-e]

Then, as the solution, delete the partition:

DELETE FROM keyspace_name.table_name WHERE partition_key_column = 'partition key value';

***

DataStax DevCenter not working on macOS

DataStax DevCenter has issues on macOS when the latest version of Java is installed. To rectify the issue, follow the steps below.

This is usually caused by Java 9 (java 1.9) being installed, so you need to use a lower version, as in the example below.

Try these steps with Java version jdk1.8.0_151.jdk; DevCenter does not seem to work with the latest Java version.

sudo su -

cd /Library/Java/JavaVirtualMachines

and remove jdk1.8.0_161.jdk

rm -rf jdk1.8.0_161.jdk

and then switch back to your own user by typing exit

Then type cd

and then rm -rf .devcenter

then start the devcenter

***

Sunday, December 9, 2018

Migrating DataStax Enterprise (DSE) to open source Apache Cassandra

To migrate from DSE to Apache Cassandra, first change the replication strategy of any DSE-related keyspaces that use EverywhereStrategy to SimpleStrategy or NetworkTopologyStrategy. Otherwise, after installing the binaries, copy the DSE jar files that provide EverywhereStrategy to the same location as the open source jar files. The steps below then guide you through the process of migrating from DSE to Apache Cassandra.

These steps are valid only if you are running a Cassandra-only workload, that is, you have not enabled Search, Spark or Graph on the datacenter being migrated.

Step 1: Repair one node at a time so that the data will be consistent across all nodes.

nohup nodetool repair >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.repairlog 2>&1 &

Step 2: Cleanup one node at a time, so as to remove the unwanted data.

nohup nodetool cleanup >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.cleanuplog 2>&1 &

Step 3: Prepare the open source configuration files:

cassandra.yaml, cassandra-env.sh, jvm.options, snitch property file, logback.xml

Once the above steps are done, continue on each node as follows.

Step 4: Make sure swap space is disabled.

swapoff -a

Make it persistent by commenting out the swap entry in /etc/fstab.

Verify with: free -h
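
The swap check and the /etc/fstab change in Step 4 can also be scripted. This is only a sketch under stated assumptions: the function names are hypothetical, /proc/swaps is assumed to be available (Linux), and the fstab helper prints the edited file to stdout so it can be reviewed before anything is overwritten.

```shell
#!/bin/sh
# swap_is_off: succeed when no swap device is active.
# /proc/swaps holds one header line plus one line per active swap area;
# the optional argument exists only so the check can be pointed at a copy.
swap_is_off() {
    swaps_file=${1:-/proc/swaps}
    [ "$(tail -n +2 "$swaps_file" | wc -l)" -eq 0 ]
}

# comment_out_swap: print an fstab with every active swap entry commented out.
# Lines that are already comments are left untouched.
comment_out_swap() {
    fstab=${1:-/etc/fstab}
    sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Usage on a node (not run here):
#   swapoff -a && swap_is_off && echo "swap disabled"
#   comment_out_swap /etc/fstab > /tmp/fstab.new   # review, then replace /etc/fstab
```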

Step 5: Flush the memtables so that all data is written to immutable SSTables on disk.

nodetool flush

Step 6: Take a snapshot, so that the data can be recovered.

nodetool snapshot

Step 7: Drain the node so that the application cannot connect to that specific node.

nodetool drain

Verify that the commitlogs were removed.

Step 8: Stop the services on the node

service dse stop

service opscenterd stop

service datastax-agent stop

chkconfig dse off

Step 9: Remove the node from the cluster by running the following from another node

nodetool removenode hostid

nodetool removenode status

Step 10: Install the new open source binaries and point them to the configuration files prepared in Step 3.

Step 11: Start the Cassandra service

service cassandra start

Step 12: Repair the open source node.

nohup nodetool repair >> /tmp/$(date +%Y-%m-%d)_`hostname -i`.repairlog 2>&1 &

Repeat the above steps one node at a time. Once all nodes are done, do the following.

a) Drop the unwanted keyspaces and manually delete their data on all nodes.

b) Uninstall the DSE

***

Some useful yum commands

1) Clean the yum cache directory.
yum clean all

2) List installed packages.
yum list installed

3) List all available versions of a package. This can also be used to query a repository that was just added.
yum --showduplicates list [package_name]
yum --showduplicates list firefox

4) Install the specific version.
yum install [package_name]-[version].[architecture]
yum install firefox-31.5.3-3.el7_1.x86_64
yum install cassandra-3.11.2-1

5) To uninstall a package
yum remove [package_name]

6) List and manage repos
yum repolist
yum repolist all (shows both enabled and disabled repos)
yum-config-manager --disable datastax (disables the datastax repo)

7) Display the alternatives (paths) configured for java
update-alternatives --display java

***

Uninstall Datastax Enterprise (DSE)

The following steps help to uninstall DSE.

service dse stop

service datastax-agent stop

ps auwx | grep dse

ps auwx | grep datastax-agent

sudo yum remove "dse-*" "datastax-*"

Remove data directories manually.

***

Some useful sstable commands in Cassandra

1) Partition key info
sstabledump /path/abc-Data.db -k partition_key

***

Securing JMX Authentication in Cassandra

Securing connection of nodetool, JConsole and JVisualVM in Cassandra.

1) In the cassandra-env.sh file, update the following lines to turn JMX authentication on.

The file has separate sections for local and remote JMX connections; set the options in the section that applies to you.

JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"

2) Edit the jmxremote.password file

Copy the template from /<jre_install_dir>/lib/management/jmxremote.password.template and rename it to jmxremote.password and place it under /etc/cassandra

3) Now you need to change the permission of the jmxremote.password file
chown cassandra:cassandra /etc/cassandra/jmxremote.password
chmod 400 /etc/cassandra/jmxremote.password

4) Now edit the jmxremote.password file and add the user and password.
cassandrauser cassandrapassword

5) Also add the same user with readwrite permission to the /<jre_install_directory>/lib/management/jmxremote.access file. The user name must match the one in jmxremote.password.
cassandrauser readwrite

6) Restart DSE/Cassandra
service dse restart
or
service cassandra restart

7) Test it by connecting through nodetool command or JConsole or JVisualVM
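
Steps 2 to 4 can be combined into a small helper. This is a hedged sketch rather than the official procedure: the function name write_jmx_password_file is made up, it writes the file directly instead of copying the JRE template, and the directory, user and password are placeholders.

```shell
#!/bin/sh
# write_jmx_password_file: create jmxremote.password in the given directory
# with a single "user password" entry, readable only by its owner.
# Assumption: directory, user and password are placeholders; on a real node
# the directory would be /etc/cassandra and the file owned by cassandra.
write_jmx_password_file() {
    dir=$1
    user=$2
    password=$3
    file="$dir/jmxremote.password"
    # Use a restrictive umask so the file is never world-readable.
    ( umask 077 && printf '%s %s\n' "$user" "$password" > "$file" )
    chmod 400 "$file"
}

# Usage on a node (not run here):
#   write_jmx_password_file /etc/cassandra cassandrauser cassandrapassword
#   chown cassandra:cassandra /etc/cassandra/jmxremote.password
#   nodetool -u cassandrauser -pw cassandrapassword status
```

Once authentication is on, nodetool needs the -u and -pw options as in the last usage line; without them the connection is refused.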

***

Installing Python3 on CentOS and how to use it.

Install Python 3 on a CentOS 7 system using Software Collections (SCL), alongside the distribution's default Python 2.7.

Software Collections (SCL) is a community project that helps to build, install and use different versions of the same software on the same system without affecting the default packages.

Python 2.7 is the default version installed with CentOS, and system programs such as yum depend on it.

To install Python 3, first install SCL.
sudo yum install centos-release-scl

Then, depending on your choice of Python version among 3.3, 3.4, 3.5 and 3.6, run the command below.
sudo yum install rh-python36

This will install python 3.6

Now, to use Python 3.6, you need to run:
scl enable rh-python36 bash

The above command enables Python 3.6 only for this shell session.

Verify by
python --version

***

Some nodetool commands in cassandra.

1) To find out which nodes hold a given partition key.
nodetool getendpoints keyspace_name table_name partition_key

***

Some Cassandra/CQLSH Queries

1) Peers Info:
select peer, data_center from system.peers;

2) Tables Info:
select table_name from system_schema.tables where keyspace_name='xxx';
select keyspace_name, table_name, id from system_schema.tables; (To know table id)
select keyspace_name, table_name from system_schema.tables where keyspace_name IN ('keyspace1', 'keyspace2', 'keyspace3');

3) Token Info (token() accepts only the partition key columns):
select token(partition_key) from keyspace.table_name;

4) Keyspaces info:
select * from system_schema.keyspaces;

5) Index Info:
select * from system."IndexInfo";
DSE Related Search-Index Query:
describe active search index schema on keyspace_name.table_name;
alter search index schema on table_name drop field column_name;

***

Saturday, December 8, 2018

Skipping dependency checks while installing an rpm.

To install an rpm without checking for or installing its dependency packages, use the --nodeps option:

rpm -ivh --nodeps *.rpm

Some useful rpm queries.
1) Finding if/which package is installed in RHEL/CentOS
rpm -qa | grep ntp

2) Listing the files installed by a package
rpm -ql ntp

***

Adding DataCenter to Cassandra.

These steps guide you through adding a datacenter to a Cassandra cluster.

Step 1: Know how the application reads and writes. That is, check whether its consistency level is set to LOCAL_QUORUM or EACH_QUORUM, and set the DataStax driver's load balancing policy to DCAwareRoundRobinPolicy.

Step 2: If Cassandra has already been started on the new nodes, stop it and remove the data, log, saved_caches and commitlog files.

Step 3: Major cassandra.yaml changes
- auto_bootstrap: false
- seeds
- endpoint_snitch
- listen_address
- rpc_address
- num_tokens

Step 4: Update the properties file for the snitch
- cassandra-rackdc.properties file

Step 5: Start Cassandra now

Step 6: After all nodes in the new datacenter have started
- set the keyspace replication to include the new datacenter name and the number of replicas needed
- run nodetool rebuild, specifying the existing datacenter, on all nodes in the new datacenter
- change auto_bootstrap to true, or remove auto_bootstrap: false, in the cassandra.yaml file.

Step 7: Update the seed nodes on all nodes in the cluster and do a restart.

Step 8: nohup nodetool rebuild -dc abc-dc1 -ks keyspace_name table_name >> /tmp/$(date +%Y-%m-%d)_`hostname -i`rebuild.log 2>&1 &
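
The replication change in Step 6 can be generated from the shell. A minimal sketch, assuming NetworkTopologyStrategy is the target; the helper name alter_keyspace_cql and the keyspace and datacenter names are placeholders, not part of any Cassandra tool.

```shell
#!/bin/sh
# alter_keyspace_cql: print an ALTER KEYSPACE statement that uses
# NetworkTopologyStrategy with the given datacenter:replicas pairs.
# Assumption: keyspace and datacenter names below are placeholders.
alter_keyspace_cql() {
    keyspace=$1
    shift
    opts="'class': 'NetworkTopologyStrategy'"
    for pair in "$@"; do
        dc=${pair%%:*}    # text before the colon
        rf=${pair##*:}    # text after the colon
        opts="$opts, '$dc': $rf"
    done
    printf 'ALTER KEYSPACE %s WITH replication = {%s};\n' "$keyspace" "$opts"
}

alter_keyspace_cql keyspace_name abc-dc1:3 abc-dc2:3
```

The printed statement can be reviewed first and then passed to cqlsh, for example with cqlsh -e "$(alter_keyspace_cql keyspace_name abc-dc1:3 abc-dc2:3)".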

***

When to use curly braces in shell scripting?

Consider the following example.
vi failed_curlybraces.sh

#!/bin/sh
echo "Let's create a file with your name!"
echo "What's your name?"
read INPUT_NAME
echo "Hoy! $INPUT_NAME, Creating a file with $INPUT_NAME_lovely.txt"
touch "$INPUT_NAME_lovely.txt"
echo "Hurrraaahhh! nooo fileee created"

If you run the script, it won't create the expected file, because the shell doesn't know where the variable name ends: it looks up a variable called INPUT_NAME_lovely, which is unset, and ends up creating a file named just .txt. So now, modify the script.

vi correct_curlybraces.sh

#!/bin/sh
echo "Let's create a file with your name!"
echo "What's your name?"
read INPUT_NAME
echo "Hoy! ${INPUT_NAME}, Creating a file with ${INPUT_NAME}_lovely.txt"
touch "${INPUT_NAME}_lovely.txt"
echo "Hurrah! file created"
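
The difference can be seen without creating any file. A small sketch; the variable is set directly instead of being read from the keyboard so that it runs unattended, and the value is an arbitrary example.

```shell
#!/bin/sh
# Assumption: the value is an arbitrary example, set here instead of read.
INPUT_NAME=pramod

# Without braces the shell looks up a variable called INPUT_NAME_lovely,
# which is unset, so only ".txt" is left.
echo "without braces: $INPUT_NAME_lovely.txt"

# With braces the variable name ends at the closing brace.
echo "with braces:    ${INPUT_NAME}_lovely.txt"
```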

***

When to use square brackets [ ], also known as test, in shell scripting?

test is more frequently invoked as [. It is normally a shell builtin (which means the shell itself interprets [ as test), although a standalone binary also exists.

$ type [
[ is a shell builtin

$ which [
/usr/bin/[

$ ls -l /usr/bin/[
-rwxr-xr-x 1 root root 41544 Dec 4 2017 /usr/bin/[

$ ls -l /usr/bin/test
-rwxr-xr-x 1 root root 37400 Dec 4 2017 /usr/bin/test

test is mostly used with if and while statements. It is also the reason you will run into difficulties if you create a program called test and try to run it: the shell builtin is called instead of your program.
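
A short illustration of [ and test inside if and while. The function name describe_path is made up for this sketch.

```shell
#!/bin/sh
# describe_path: classify a path using [ (test) inside if/elif.
describe_path() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ]; then
        echo "file"
    else
        echo "missing"
    fi
}

# [ and test are interchangeable; [ just requires a closing ].
count=0
while test "$count" -lt 3; do
    count=$((count + 1))
done
echo "looped $count times"   # looped 3 times

describe_path /tmp
```

If you do have your own program called test, run it with an explicit path such as ./test so that the builtin is not picked.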

***

Persistent Variable by sourcing in shell scripting

A script that starts with #!/bin/sh is run in a new shell process. So even if a variable is set in your interactive shell, the script will not see its value.

Instead, export the variable so that the script receives its value. Note that if the script changes the variable, the change applies only to the shell executing the script; you can verify this by echoing the variable afterwards.

To persist the value, you need to source the script. We can source a script with the "." (dot) command. Sourcing runs the script in the current interactive shell instead of a new one.

Example:
vi myscript.sh
echo "test var value: $MY_VAR"
MY_VAR=pramod
echo "latest value of var is: $MY_VAR"

Executing:
Case 1: ./myscript.sh
Output:
test var value:
latest value of var is: pramod

Case 2:
MY_VAR=hoy
./myscript.sh
test var value:
latest value of var is: pramod

Case 3:
export MY_VAR=hoy
./myscript.sh
test var value: hoy
latest value of var is: pramod

Now echoing MY_VAR:
echo $MY_VAR
hoy

Case 4:
. ./myscript.sh
test var value: hoy
latest value of var is: pramod

Now echoing MY_VAR:
echo $MY_VAR
pramod

This shows that a value set inside the script persists in the current shell only when the script is sourced.
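
The cases above can be replayed in one go. A self-contained sketch, assuming mktemp is available; it runs the throwaway script with sh instead of ./myscript.sh so that no execute permission is needed.

```shell
#!/bin/sh
# Self-contained rerun of the cases above using a throwaway script.
# Assumption: mktemp is available; the file name is generated, not fixed.
script=$(mktemp)
cat > "$script" <<'EOF'
echo "test var value: $MY_VAR"
MY_VAR=pramod
EOF

MY_VAR=hoy            # set but not exported: a child shell cannot see it
sh "$script"          # prints "test var value: "
export MY_VAR         # exported: the child shell gets a copy
sh "$script"          # prints "test var value: hoy"
echo "after child run: $MY_VAR"   # still hoy; the child changed only its copy

. "$script"           # sourced: runs in this shell and overwrites MY_VAR
echo "after sourcing: $MY_VAR"    # pramod
rm -f "$script"
```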

***

How #! is treated in shell scripting

In general, # in shell scripting starts a comment.
But #! on the first line of a script is a special directive (the shebang): Unix uses the path that follows it as the interpreter for the script. For example, #!/bin/sh

***

Shell script for finding file modification time.

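This script checks whether a file was last modified more than 120 seconds ago.

#!/bin/sh
# Usage: pass the file to check as the first argument.
file=$1
current=`date +%s`                   # current time in epoch seconds
last_modified=`stat -c "%Y" "$file"` # file mtime in epoch seconds (GNU stat)
if [ $((current - last_modified)) -gt 120 ]; then
echo "old"
else
echo "new"
fi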
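
The same check reads well as a reusable function with the threshold as a parameter. A minimal sketch, assuming GNU stat; the function name is_older_than and the example path are hypothetical.

```shell
#!/bin/sh
# is_older_than: succeed when FILE was last modified more than SECONDS ago.
# Assumption: GNU stat (`stat -c %Y` prints the mtime as epoch seconds).
# Returns 2 when the file cannot be read.
is_older_than() {
    file=$1
    max_age=${2:-120}
    now=$(date +%s)
    last_modified=$(stat -c %Y "$file") || return 2
    [ $((now - last_modified)) -gt "$max_age" ]
}

# Usage (the path is only an example, not run here):
#   if is_older_than /var/log/cassandra/system.log 120; then
#       echo "old"
#   else
#       echo "new"
#   fi
```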

***

Escape Characters in Shell Scripting

Backslash (\) is the Escape character.

Example: If you need the output to look like the statement below.

Everyone is born with some talent/talents. Few statement's with some !, can be composed @ for rich # and $, with good % of ^ and & comes before * now " the closed braces ( and right brace ) ends "

Writing it with the escape character \ gives the exact output.

echo Everyone is born with some talent\/talents. Few statement\'s with some \!, can be composed \@ for rich \# and \$, with good \% of \^ and \& comes before \* now \" the closed braces \( and right brace \) ends \"
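
When many special characters are involved, single quotes are often easier than one backslash per character. A small comparison sketch; the sample sentence is made up for illustration.

```shell
#!/bin/sh
# Backslashes escape one character at a time.
with_backslashes=$(echo Price: \$5 \& up, don\'t forget the \"quotes\" \(and parentheses\))

# Single quotes keep every character literal, so no backslashes are needed;
# only a single quote itself has to be written as '\'' to get through.
with_quotes=$(echo 'Price: $5 & up, don'\''t forget the "quotes" (and parentheses)')

echo "$with_backslashes"
echo "$with_quotes"
```

Both lines print the same text.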

***

Some Handy Elassandra Commands

Some of the Elassandra handy statements.

1) Verify Elassandra/Elasticsearch is running.

curl http://127.0.0.1:9200

2) Health status of node.

curl -X GET "127.0.0.1:9200/_cat/health?v"

3) List of nodes and its load:

curl -X GET "127.0.0.1:9200/_cat/nodes?v"

4) Listing all indices.

curl -X GET "127.0.0.1:9200/_cat/indices?v"

5) Delete Index.

curl -XDELETE 'http://127.0.0.1:9200/index_name'

6) Rebuild Index.

nodetool rebuild_index keyspace_name table_name elastic_<table_name>_idx

7) File Descriptors:

curl -X GET "127.0.0.1:9200/_nodes/stats/process?filter_path=**.max_file_descriptors&pretty"

In limits.conf (/etc/security/limits.conf), keep nofile at a high value.

8) Checking how many search contexts are open --depends on file descriptors.

curl -X GET "127.0.0.1:9200/_nodes/stats/indices/search?pretty"

9) Deleting _all search contexts -- depends on file descriptors

curl -X DELETE "127.0.0.1:9200/_search/scroll/_all"

10) Get Mapping & Setting Details:

curl -X GET "127.0.0.1:9200/index_name/_mapping?pretty"

curl -X GET "127.0.0.1:9200/index_name/_settings?pretty"

11) Flush the segments

curl -X POST "127.0.0.1:9200/_flush"

12) Updating settings (analyzers are updated the same way, but the index must be closed first and reopened afterwards).

curl -X PUT "127.0.0.1:9200/index_name/_settings" -H 'Content-Type: application/json' -d '
{
"index" : {
"refresh_interval" : "2s"
}
}'

13) Master/Node details

curl -X GET "127.0.0.1:9200/_cat/master?v"

v=> Verbose

14) Help

curl -X GET "127.0.0.1:9200/_cat/master?help"

15) Headers.

curl -X GET "127.0.0.1:9200/_cat/nodes?h=ip,port,heapPercent,name"

16) Numeric Formats.

curl -X GET "127.0.0.1:9200/_cat/indices?bytes=b&pretty" -H "Accept: application/json" | sort -rnk8

Here, 8 => the number of the column to sort on.

Columns such as bytes, time and size can be sorted this way.

***