Wednesday, May 5, 2021

High Disk Space Utilization

High disk space usage reported by the "df" command

Scenario: df reports high disk usage on a filesystem, but no large files can be found on it. The symptom typically looks like this:

df -h /var/log

Filesystem                       Size  Used Avail Use% Mounted on

/dev/mapper/centos_vg-lv_varlog   12G  7.6G  4.5G  63% /var/log

du -sh /var/log

569M    /var/log

As shown above, 'du' reports only 569M under /var/log, whereas 'df' reports 7.6G used (63% of the 12G filesystem).

This implies that some deleted files are still held open by running processes. Their space is not released until those processes close them, and they can be listed with the lsof command.
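The gap between the two figures can be measured directly. The following is a minimal sketch, not part of the original procedure; the function name hidden_usage_kb is hypothetical. It subtracts what du can see under a directory from what df reports as used on the filesystem containing it, so it is only meaningful when the directory is the mount point itself. A small gap is normal (filesystem metadata); a gap of several gigabytes points to deleted-but-open files.

```shell
# Print (df used) - (du total) in KiB for the given directory.
# Assumption: the directory is the mount point of its filesystem.
hidden_usage_kb() {
    dir=$1
    # -P forces one line per filesystem, -k reports 1024-byte blocks
    used_kb=$(df -Pk "$dir" | awk 'NR==2 {print $3}')
    # -s summarizes, -x stays on this filesystem; unreadable entries are skipped
    du_kb=$(du -skx "$dir" 2>/dev/null | awk '{print $1}')
    echo $((used_kb - du_kb))
}
```

Usage: sudo is needed if some subdirectories are not readable, for example by running hidden_usage_kb /var/log from a root shell.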

Ex: sudo lsof /var/log | grep deleted

splunkd    2088      root   44r   REG  253,4 6925406608 1049233 /var/log/cassandra/debug.log (deleted)

java      14801 cassandra  443w   REG  253,4   20971608 3145811 /var/log/spark/master/master.log (deleted)

java      28197 cassandra  532w   REG  253,4  485919252 1048702 /var/log/cassandra/system.log (deleted)

java      28197 cassandra  534w   REG  253,4 6925406860 1049233 /var/log/cassandra/debug.log (deleted)

In our case, splunkd and java processes were holding deleted/log-rotated files open, and those files kept consuming disk space.
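To estimate how much space these files hold, the lsof output can be summed. This is a sketch under the assumption that lsof prints its default column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME), as in the sample above; the function name sum_deleted_bytes is hypothetical. Note that the same file can be open in several processes (debug.log, inode 1049233, appears twice above), so entries are de-duplicated by device and inode, keeping the largest size seen.

```shell
# Read `lsof` output on stdin and print the total bytes held by
# deleted-but-open files, counting each (device, inode) pair once.
# Assumes the default lsof columns: $6 = DEVICE, $7 = SIZE/OFF, $8 = NODE.
sum_deleted_bytes() {
    awk '
        /\(deleted\)$/ {
            key = $6 ":" $8
            if ($7 + 0 > size[key]) size[key] = $7 + 0
        }
        END {
            for (k in size) total += size[k]
            printf "%.0f\n", total
        }'
}
```

Usage: sudo lsof /var/log | sum_deleted_bytes. Fed the four sample lines above, it prints 7432297720 (about 6.9 GiB), which accounts for most of the difference between df and du.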

Solution/Workaround: 

Typical solution: Once the file has been identified, free its space by shutting down the process that holds it open. If a graceful shutdown does not work, use the kill command with the process's PID to stop it forcefully.

In the lsof output above, the fourth column (FD) is the file descriptor number, followed by a letter for the access mode (r = read, w = write).
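The "graceful first, then forceful" sequence can be sketched as below. This is an illustration only: the function name stop_process and the 10-second default are assumptions, and for a managed service you would normally use its own stop command (for example via the init system) rather than signalling the PID directly.

```shell
# Ask a process to exit with SIGTERM; if it is still alive after
# <timeout> seconds, stop it forcefully with SIGKILL.
stop_process() {
    sp_pid=$1
    sp_timeout=${2:-10}
    # If the signal cannot be delivered, the process is already gone
    kill -TERM "$sp_pid" 2>/dev/null || return 0
    while [ "$sp_timeout" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID still exists
        kill -0 "$sp_pid" 2>/dev/null || return 0
        sleep 1
        sp_timeout=$((sp_timeout - 1))
    done
    kill -KILL "$sp_pid" 2>/dev/null
}
```

Usage: run stop_process 28197 from a root shell, then re-run the lsof command to confirm that the deleted entries are gone.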

Ex: 

java      28197 cassandra  534w   REG  253,4 6925406860 1049233 /var/log/cassandra/debug.log (deleted)

Here, 28197 is the PID and 534 is the file descriptor (fd).

You can confirm that this descriptor's symbolic link under /proc is broken with:

sudo file /proc/28197/fd/534

/proc/28197/fd/534: broken symbolic link to `/var/log/cassandra/debug.log (deleted)'
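The same check can be done for every descriptor of a process without lsof. The sketch below assumes a Linux /proc filesystem, where the link target of a deleted file ends in " (deleted)"; the function name list_deleted_fds is hypothetical.

```shell
# Print "<fd><TAB><link target>" for each descriptor of <pid>
# whose target has been deleted. Linux /proc only.
list_deleted_fds() {
    ld_pid=$1
    for ld_fd in /proc/"$ld_pid"/fd/*; do
        # Descriptors can close while we iterate; skip those
        ld_target=$(readlink "$ld_fd" 2>/dev/null) || continue
        case $ld_target in
            *' (deleted)') printf '%s\t%s\n' "${ld_fd##*/}" "$ld_target" ;;
        esac
    done
}
```

Usage: list_deleted_fds 28197, run as root or as the user that owns the process.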

Then, as an alternative to stopping the process, truncate the file through the process's fd with

echo > /proc/pid/fd/fd_number

Ex: echo > /proc/28197/fd/534

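Note: This must be run as root. The redirection is performed by the calling shell, so prefixing the command with sudo from a non-root shell does not work; use a root shell or sudo sh -c '...'.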
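Because truncating the wrong descriptor destroys live data, it may be worth guarding the command. The following is a sketch, not the method from the referenced article: the function name truncate_deleted_fd is hypothetical, and it assumes a Linux /proc filesystem. It truncates only if the link target is marked as deleted, and it writes nothing to the file (the echo form above leaves a single newline in it).

```shell
# Truncate the file behind /proc/<pid>/fd/<fd>, but only if that
# descriptor points to a deleted file. Linux /proc only.
truncate_deleted_fd() {
    td_link=/proc/$1/fd/$2
    td_target=$(readlink "$td_link") || return 1
    case $td_target in
        *' (deleted)')
            # Opening with ">" truncates the file; `true` writes nothing
            true > "$td_link"
            ;;
        *)
            echo "refusing: $td_target is not a deleted file" >&2
            return 1
            ;;
    esac
}
```

Usage: from a root shell, truncate_deleted_fd 28197 534, then check df -h /var/log again.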

Reference: https://access.redhat.com/solutions/2316
