Tuesday, May 21, 2024

Enabled TPC backpressure with 256 pending requests limit

The back pressure messages you are observing come from TPC (Thread Per Core) back pressure:


Example log:
INFO [main] 2020-12-12 23:20:46,067 EpollTPCEventLoopGroup.java:141 - Enabled TPC backpressure with 256 pending requests limit, remote multiplier at 5, global multiplier at 11

INFO [main] 2020-12-12 23:20:46,067 EpollTPCEventLoopGroup.java:146 - TPC extended backoff is enabled

INFO [main] 2020-12-12 23:20:46,067 TPC.java:138 - Created 11 epoll event loops.

INFO [main] 2020-12-12 23:20:46,073 TPC.java:153 - Created 2 TPC timers due to configured ratio of 5.


The TPC back pressure implementation is designed to prevent the server from being overloaded with client or replica requests (read/write/range), which could make the server unresponsive and/or lead to long garbage collections and out-of-memory errors. This is controlled by the tpc_pending_requests_limit setting, which defaults to 256.

The back_pressure_enabled setting controls whether network connections from other nodes can be placed "on hold" at the network level. TPC back pressure is triggered when this setting is enabled AND the number of "on hold" network connections exceeds the configured limit.

The "tpc_concurrent_requests_limit" and "tpc_pending_requests_limit" settings are set in the cassandra.yaml file, but they are not included there by default and are currently undocumented.
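Since they are not included by default, these settings would have to be added to cassandra.yaml by hand. A minimal sketch; the value for tpc_concurrent_requests_limit is illustrative, as the setting is undocumented:

```yaml
# Hypothetical cassandra.yaml fragment; neither key is present by default.
tpc_pending_requests_limit: 256      # default, per the startup log above
tpc_concurrent_requests_limit: 512   # illustrative value; undocumented
```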

***

Monday, May 20, 2024

Getting number of vertices in Tigergraph DB

1) Get the vertex type ID from config.yaml under the data directory, for the graph and vertex type whose count we want.

Example: $ view /tigergraph/data/gstore/0/part/config.yaml


2) Filter the segmentconfig.yaml files on the vertex type ID from above

$ grep -r -l 'VertexTypeId: 144' /tigergraph/data/gstore/0/part/*/segmentconfig.yaml | xargs grep 'NumOfVertices:' | grep -v "NumOfVertices: 0"

Here,

grep : plain-text search

-r : recursive search

-l : print only the names of files with matching lines

xargs : builds and executes command lines from standard input
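The pipeline above can be extended to sum the per-segment counts into a total. A sketch on mock segmentconfig.yaml files; the paths and values here are fabricated for illustration:

```shell
# Create mock segment config files (illustrative layout and values).
mkdir -p part/1 part/2
printf 'VertexTypeId: 144\nNumOfVertices: 10\n' > part/1/segmentconfig.yaml
printf 'VertexTypeId: 144\nNumOfVertices: 5\n'  > part/2/segmentconfig.yaml

# Same grep | xargs pipeline as above, with awk summing the counts.
grep -r -l 'VertexTypeId: 144' part/*/segmentconfig.yaml \
  | xargs grep 'NumOfVertices:' \
  | grep -v 'NumOfVertices: 0' \
  | awk -F': ' '{sum += $2} END {print "Total vertices for type 144:", sum}'
# prints: Total vertices for type 144: 15
```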


(or)

Filtering for deleted vertices.

$ grep -r -l 'VertexTypeId: 43' /tigergraph/data/gstore/0/part/*/segmentconfig.yaml | xargs grep 'NumOfDeletedVertices: '

***

Getting Tigergraph DB data size

1) gstatusgraph reports statistics of graph data on the TigerGraph database

$ gstatusgraph
=== graph ===
[GRAPH  ] Graph was loaded (/tigergraph/data/gstore):
[m1     ] Partition size: 2.11GiB, IDS size: 34MiB, Vertex count: 43416881, Edge count: 11131981, NumOfDeletedVertices: 7705 NumOfSkippedVertices: 0
[m2     ] Partition size: 2.1GiB, IDS size: 34MiB, Vertex count: 882153, Edge count: 11580575, NumOfDeletedVertices: 7714 NumOfSkippedVertices: 0


2) Checking "topology_memory.txt" under "GPE" logs

view /tigergraph/logs/gpe/topology_memory.txt

Note: All the sizes above are shown per node and combined across all graphs.

There is no way to identify the size of an individual graph.

***

Saturday, May 18, 2024

Using "rsync" utility in Linux

"rsync" is a sync utility in Linux, but it is not bi-directional.

A simple rsync command is shown below; it copies folder1 to folder2, which is a backup location.

$ rsync /home/bypramod/folder1 /backup/folder2

Using much better options

$ rsync -rvaz --dry-run /home/bypramod/folder1/ bypramod@127.0.0.1:/storage/backup/folder2/

Options:

-r : recursive (redundant with -a, which already implies it)

-v : verbose

-a : archive mode, preserving metadata such as timestamps, permissions and ownership

-z : compress data during transfer

--dry-run : preview what would be transferred without making changes

/ : a trailing slash on the source copies the folder's contents rather than the folder itself

--delete : remove files on the target that were deleted from the source; since rsync is not bi-directional, this at least keeps the target from retaining stale files

Remove --dry-run from the above for the actual run.

Command:

$ rsync -rvaz /home/bypramod/folder1/ bypramod@127.0.0.1:/storage/backup/folder2/
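A local sketch of the same command, with illustrative paths and no remote host, showing the effect of the trailing slash on the source:

```shell
# Set up a small source folder (illustrative).
mkdir -p folder1 && echo demo > folder1/file.txt

# Preview first, then run for real. The trailing slash on folder1/ means
# its *contents* are synced into folder2, not a folder1 subdirectory.
rsync -rvaz --delete --dry-run folder1/ folder2/
rsync -rvaz --delete folder1/ folder2/

ls folder2/file.txt
```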

***

Sunday, May 12, 2024

Knowing how much memory is utilized by a query/request in Tigergraph DB

1) Find the request ID in the RESTPP logs.

$ grun_p all 'grep -i "requestinfo" $(gadmin config get System.LogRoot)/restpp/RESTPP#*.INFO | grep -v "__INTERNAL_API__" | tail'

Example from one of the nodes:

/tigergraph/logs/restpp/RESTPP#4.INFO:I0513 23:01:51.869751 386354 handler.cpp:569] RequestInfo|,17766.RESTPP_4_1.1714777311869.N,NNN,0,0,0,0,S|user:tigergraph|api:v2|function:|graph_name:bypramod|libudf:

(or)

Get running query details from Graph studio UI

2) Once the query has finished, take the timestamp portion of the request ID from above (e.g. 1714777311869), grep for the memory watermark in the GPE logs, and sum the values for overall usage across the cluster.

$ grun_p all 'grep -i "1714777311869" $(gadmin config get System.LogRoot)/gpe/log.INFO | grep -i "Finished in" | grep "mem watermark"'
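Summing the per-node watermarks can be scripted. A sketch on fabricated log lines; the real GPE log format may differ, so treat the field layout here as an assumption:

```shell
# Fabricated GPE log excerpts; assume each matched line ends with
# 'mem watermark: <MB>' (illustrative format, two nodes).
cat > gpe_lines.txt <<'EOF'
m1 ... Finished in 120 ms, mem watermark: 512
m2 ... Finished in 135 ms, mem watermark: 640
EOF

# Split on the label and sum the trailing numbers across nodes.
awk -F'mem watermark: ' '{sum += $2} END {print sum, "MB total"}' gpe_lines.txt
# prints: 1152 MB total
```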

***


Add-on:

To get complete info about the request, search the RESTPP logs:

$ grun_p all 'grep -i "1714777296251" $(gadmin config get System.LogRoot)/restpp/RESTPP#*.INFO'

***

Thursday, May 9, 2024

Keystore/Truststore File Format Conversion

Here, we are converting from PKCS12 file format to JKS

Use the following command to perform the conversion:

$ keytool -importkeystore -srckeystore pramodtruststore-stage.pkcs12 -srcalias pramodroot -destkeystore pramodtruststore_stage_new.jks -deststoretype jks -deststorepass Tigergraph0099 -destalias pramodroot

Note: In the new truststore "pramodtruststore_stage_new.jks" we kept the same alias name, "pramodroot".

***

Saturday, May 4, 2024

Cassandra read latency increased in past few days

As the root user, clear the page cache as below.

# sync; echo 3 > /proc/sys/vm/drop_caches

Then monitor the node's disk IO using this command as root: iostat -tx 1

Observe: does %iowait constantly climb above 1%?
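Alongside iostat, cumulative iowait time can also be read straight from /proc/stat; a minimal sketch:

```shell
# The aggregate 'cpu' line in /proc/stat lists jiffies in the order:
#   user nice system idle iowait irq softirq ...
# so field 6 (after the 'cpu' label) is cumulative iowait.
awk '/^cpu /{print "cumulative iowait jiffies:", $6}' /proc/stat
```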

***