LOG server running out of space, but this is not Indexes
Hello Nagios Log team
Our Log Server is running out of space (500 GB volume, only 2 months of rather small indexes, backups are forwarded to a separate NFS volume).
We are not sure where all the usage is coming from and need your advice quickly.
Thank you
Re: LOG server running out of space, but this is not Indexes
What version of Log Server are you using? You can grab it from the bottom left hand side of the web interface.
Please PM me a copy of your profile; you can download it from Admin > System Status by clicking the Download System Profile button.
Include the output of these commands in that PM:
Code: Select all
df -h
df -i
uname -a
cat /etc/*release
Re: LOG server running out of space, but this is not Indexes
Please see the output as follows:
Code: Select all
[root@fikc-naglsprod11 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 16G 0 16G 0% /dev
tmpfs 16G 220K 16G 1% /dev/shm
tmpfs 16G 1.6G 15G 11% /run
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/centos-root 490G 427G 42G 92% /
/dev/sda1 976M 284M 626M 32% /boot
fikc-isilon01.res.kcg.global:/ifs/data/fikc-nagxiprod01-backup 300G 239G 62G 80% /mnt/nfs/backup
tmpfs 3.2G 0 3.2G 0% /run/user/1000
tmpfs 3.2G 0 3.2G 0% /run/user/48
tmpfs 3.2G 0 3.2G 0% /run/user/0
[root@fikc-naglsprod11 ~]# clear
[root@fikc-naglsprod11 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 16G 0 16G 0% /dev
tmpfs 16G 220K 16G 1% /dev/shm
tmpfs 16G 1.6G 15G 11% /run
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/centos-root 490G 428G 42G 92% /
/dev/sda1 976M 284M 626M 32% /boot
fikc-isilon01.res.kcg.global:/ifs/data/fikc-nagxiprod01-backup 300G 239G 62G 80% /mnt/nfs/backup
tmpfs 3.2G 0 3.2G 0% /run/user/1000
tmpfs 3.2G 0 3.2G 0% /run/user/48
tmpfs 3.2G 0 3.2G 0% /run/user/0
[root@fikc-naglsprod11 ~]# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 4093495 397 4093098 1% /dev
tmpfs 4096421 13 4096408 1% /dev/shm
tmpfs 4096421 1298 4095123 1% /run
tmpfs 4096421 16 4096405 1% /sys/fs/cgroup
/dev/mapper/centos-root 32571392 131408 32439984 1% /
/dev/sda1 65536 365 65171 1% /boot
fikc-isilon01.res.kcg.global:/ifs/data/fikc-nagxiprod01-backup 629145600 221 629145379 1% /mnt/nfs/backup
tmpfs 4096421 1 4096420 1% /run/user/1000
tmpfs 4096421 1 4096420 1% /run/user/0
[root@fikc-naglsprod11 ~]# uname -a
Linux fikc-naglsprod11 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@fikc-naglsprod11 ~]# cat /etc/*release
CentOS Linux release 7.9.2009 (Core)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
CentOS Linux release 7.9.2009 (Core)
CentOS Linux release 7.9.2009 (Core)
[root@fikc-naglsprod11 ~]#
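The output above already narrows things down: / is at 92% in df -h while inode usage in df -i is at 1%, so this is a space problem rather than an inode problem. A minimal sketch, assuming a POSIX awk and df -P style columns, that flags filesystems above a usage threshold; the function name over_threshold and the 85% default are illustrative and not part of Nagios Log Server.

```shell
# Print "mountpoint use%" for every filesystem at or above a threshold.
# Reads df -P style lines on stdin; the threshold defaults to 85.
over_threshold() {
  awk -v limit="${1:-85}" '
    NR > 1 {                       # skip the header row
      use = $5
      sub(/%/, "", use)            # "92%" -> "92"
      if (use + 0 >= limit) print $6, $5
    }'
}

# Sample rows copied from the df -h output above.
sample='Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 490G 427G 42G 92% /
/dev/sda1 976M 284M 626M 32% /boot'

printf '%s\n' "$sample" | over_threshold 85

# On a live system (-P keeps long device names on one line):
#   df -hP | over_threshold 85
```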
Re: LOG server running out of space, but this is not Indexes
Please PM me a copy of your profile; you can download it from Admin > System Status by clicking the Download System Profile button.
Send the output of this command as well:
Code: Select all
sudo du -ah /* | sort -rn | head -n 50
Re: LOG server running out of space, but this is not Indexes
Profile attached.
Here is the command output:
Code: Select all
login as: root
[email protected]'s password:
Last login: Thu Jun 3 15:21:27 2021 from pf1rfv2r.res.kcg.global
[root@fikc-naglsprod11 ~]# sudo du -ah /* | sort -rn | head -n 50
du: cannot access ‘/proc/1193/task/1320/fdinfo/2379’: No such file or directory
du: cannot access ‘/proc/1193/task/1320/fdinfo/2386’: No such file or directory
du: cannot access ‘/proc/1193/task/1328/fdinfo/2011’: No such file or directory
du: cannot access ‘/proc/1193/task/1328/fdinfo/2032’: No such file or directory
du: cannot access ‘/proc/1193/task/1570/fdinfo/2388’: No such file or directory
du: cannot access ‘/proc/1193/task/1570/fdinfo/2397’: No such file or directory
du: cannot access ‘/proc/1193/task/1584/fd/2124’: No such file or directory
du: cannot access ‘/proc/1193/task/1630/fdinfo/1958’: No such file or directory
du: cannot access ‘/proc/1193/task/1630/fdinfo/2366’: No such file or directory
du: cannot access ‘/proc/1193/task/1630/fdinfo/2379’: No such file or directory
du: cannot access ‘/proc/1193/task/1630/fdinfo/2397’: No such file or directory
du: cannot access ‘/proc/1193/task/1635/fdinfo/2397’: No such file or directory
du: cannot access ‘/proc/46882/task/46882/fd/3’: No such file or directory
du: cannot access ‘/proc/46882/task/46882/fdinfo/3’: No such file or directory
du: cannot access ‘/proc/46882/fd/3’: No such file or directory
du: cannot access ‘/proc/46882/fdinfo/3’: No such file or directory
du: cannot access ‘/proc/61005/task/54828/fd/627’: No such file or directory
1020K /var/www/html/nagioslogserver/application/language/pt_PT
1020K /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/zookeeper-1.4.11-java/ext
1020K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.31/4/index/_ev4.cfs
1020K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.15/0/index/_eyh_Lucene41_0.tip
1020K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.13/0/index/_f0q_Lucene41_0.tip
1020K /usr/lib/firmware/mellanox/mlxsw_spectrum-13.2000.2714.mfa2
1020K /usr/lib/firmware/mellanox/mlxsw_spectrum-13.2000.2308.mfa2
1020K /usr/lib/firmware/iwlwifi-cc-a0-46.ucode
1020K /usr/lib/firmware/dpaa2/mc/mc_10.16.2_lx2160a.itb
1020K /opt/puppetlabs/puppet/lib/ruby/gems/2.7.0/gems/ffi-1.13.1/ext/ffi_c/libffi-x86_64-linux
1020K /mnt/nfs/backups/indices/logstash-2021.05.13/0/__5
1016K /var/www/html/nagioslogserver/application/language/pt_PT/LC_MESSAGES
1016K /usr/share/locale/pa
1016K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.26/3/index/_gd3.fdx
1016K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.26/1/index/_g2x_Lucene410_0.dvd
1016K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.26/0/index/_g86_Lucene410_0.dvd
1016K /usr/lib/firmware/iwlwifi-7265D-29.ucode
1016K /usr/lib/firmware/iwlwifi-3168-29.ucode
1016K /usr/lib/firmware/dpaa2/mc/mc_10.16.2_ls2088a.itb
1016K /usr/bin/grub2-mkrescue
1012K /usr/share/locale/pa/LC_MESSAGES
1012K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.21/0/index/_f1d_Lucene41_0.tip
1012K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.15/2/index/_ewa_Lucene41_0.tip
1012K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.07/4/index/_cd4_Lucene41_0.tip
1012K /usr/lib/firmware/iwlwifi-7265D-27.ucode
1012K /usr/lib/firmware/iwlwifi-3168-27.ucode
1012K /opt/puppetlabs/puppet/lib/ruby/gems/2.7.0/gems/concurrent-ruby-1.1.5/lib
1012K /mnt/nfs/backups/nagioslogserver/indices/logstash-2021.05.07/4/__t
1008K /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/slyphon-zookeeper_jar-3.3.5-java/lib/zookeeper-3.3.5.jar
1008K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.13/1/index/_eyu_Lucene41_0.tip
1008K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.04.24/2/index/_ew0_Lucene41_0.tip
1008K /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64/jre/lib/jfr.jar
1008K /usr/lib/firmware/mediatek/mt8183
1008K /usr/lib/firmware/iwlwifi-7265D-22.ucode
1008K /usr/libexec/openssh
1008K /mnt/nfs/backups/nagioslogserver/indices/logstash-2021.04.24/2/__2k
1008K /mnt/nfs/backups/indices/logstash-2021.05.13/1/__i
1007M /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/nagioslogserver_history/0/index/_an610_Lucene41_0.pos
1004K /usr/local/nagioslogserver/logstash/vendor/jruby/lib/ruby/shared/jopenssl.jar
1004K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.06.01/0/index/_f2x.cfs
1004K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.21/2/index/_f0t_Lucene41_0.tip
1004K /usr/lib/firmware/mediatek/mt8183/scp.img
1004K /usr/lib/firmware/iwlwifi-3168-22.ucode
1004K /opt/puppetlabs/puppet/lib/ruby/vendor_gems/gems/gettext-3.2.2/samples/locale
1004K /mnt/nfs/backups/nagioslogserver/indices/logstash-2021.05.08/2/__2e
1000K /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/edn-1.1.1/spec/exemplars
1000K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.05.02/0/index/_el4_Lucene41_0.tip
1000K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.04.24/4/index/_ew3_Lucene41_0.tip
1000K /usr/local/nagioslogserver/elasticsearch/data/a62b3b39-5815-4f18-82ab-828b9c557090/nodes/0/indices/logstash-2021.04.12/4/index/_ego_Lucene41_0.tip
1000K /mnt/nfs/backups/nagioslogserver/indices/logstash-2021.05.02/0/__1e
[root@fikc-naglsprod11 ~]#
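One caveat about the listing above: sort -rn compares only the leading number and ignores the K/M/G suffix, so 1020K sorts ahead of 1007M and the largest directories never reach the top 50. A small sketch, assuming GNU coreutils (for sort -h), that shows the difference on three sample sizes; the paths /a, /b and /c are placeholders.

```shell
# Three du-style lines; /a, /b and /c are placeholder paths.
sample=$(printf '1020K\t/a\n1007M\t/b\n200G\t/c\n')

# -n reads only the leading digits, so 1020K comes out on top.
top_numeric=$(printf '%s\n' "$sample" | sort -rn | head -n 1)
# -h understands the K/M/G suffixes, so 200G comes out on top.
top_human=$(printf '%s\n' "$sample" | sort -rh | head -n 1)

printf 'sort -rn puts first: %s\n' "$top_numeric"
printf 'sort -rh puts first: %s\n' "$top_human"

# On the server, a variant that also stays on the root filesystem (-x):
#   sudo du -xah / 2>/dev/null | sort -rh | head -n 50
```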
Re: LOG server running out of space, but this is not Indexes
What is the output of this command?
Code: Select all
lsof | grep deleted
You'll need to find out where all the space is being consumed by doing this:
Code: Select all
cd /
du -sh *
cd /largedirectory
du -sh *
cd /largedirectory/nextlargestdirectory
du -sh *
etc ..
Continue doing that for the large directories until you find where all the data is being consumed. I'm unable to see what is consuming it from the output or from your profile.
Re: LOG server running out of space, but this is not Indexes
This is the problem: approximately 200 GB are "missing" between df -h and du -sh *.
[root@fikc-naglsprod11 data]# cd /
[root@fikc-naglsprod11 /]# du -sh *
0 bin
281M boot
136K dev
40M etc
176K home
0 lib
0 lib64
16K lost+found
4.0K media
531G mnt
468M opt
du: cannot access ‘proc/1193/task/1561/fdinfo/1830’: No such file or directory
du: cannot access ‘proc/1193/task/1561/fdinfo/1866’: No such file or directory
du: cannot access ‘proc/1193/task/1561/fdinfo/1985’: No such file or directory
du: cannot access ‘proc/1193/task/1568/fdinfo/1830’: No such file or directory
du: cannot access ‘proc/1193/task/1569/fdinfo/1866’: No such file or directory
du: cannot access ‘proc/1193/task/1569/fdinfo/2312’: No such file or directory
du: cannot access ‘proc/1193/task/1570/fd/1985’: No such file or directory
du: cannot access ‘proc/1193/task/1577/fdinfo/1985’: No such file or directory
du: cannot access ‘proc/1193/task/1581/fdinfo/1866’: No such file or directory
du: cannot access ‘proc/1193/task/1582/fd/1866’: No such file or directory
du: cannot access ‘proc/1193/task/1881/fd/205’: No such file or directory
du: cannot access ‘proc/1193/task/1881/fd/209’: No such file or directory
du: cannot access ‘proc/1193/task/1881/fd/349’: No such file or directory
du: cannot access ‘proc/1193/task/1881/fd/402’: No such file or directory
du: cannot access ‘proc/1193/task/1895/fd/209’: No such file or directory
du: cannot access ‘proc/1193/task/1895/fd/2211’: No such file or directory
du: cannot access ‘proc/11730’: No such file or directory
du: cannot access ‘proc/11731’: No such file or directory
du: cannot access ‘proc/11736’: No such file or directory
du: cannot access ‘proc/11981/task/11981/fd/3’: No such file or directory
du: cannot access ‘proc/11981/task/11981/fdinfo/3’: No such file or directory
du: cannot access ‘proc/11981/fd/3’: No such file or directory
du: cannot access ‘proc/11981/fdinfo/3’: No such file or directory
du: cannot access ‘proc/61005/task/61079/fdinfo/377’: No such file or directory
du: cannot access ‘proc/61005/task/61123/fd/441’: No such file or directory
du: cannot access ‘proc/61005/task/61169/fdinfo/627’: No such file or directory
0 proc
864K root
1.7G run
0 sbin
4.0K srv
12K store
0 sys
2.2M tmp
200G usr # ------------ > only large directory with ElasticSearch data !
967M var
[root@fikc-naglsprod11 /]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 16G 0 16G 0% /dev
tmpfs 16G 220K 16G 1% /dev/shm
tmpfs 16G 1.7G 15G 11% /run
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/centos-root 490G 410G 60G 88% / # ---------------------- > 500 GBs volume !!!
/dev/sda1 976M 284M 626M 32% /boot
fikc-isilon01.res.kcg.global:/ifs/data/fikc-nagxiprod01-backup 300G 241G 60G 81% /mnt/nfs/backup
tmpfs 3.2G 0 3.2G 0% /run/user/1000
tmpfs 3.2G 0 3.2G 0% /run/user/0
tmpfs 3.2G 0 3.2G 0% /run/user/48
[root@fikc-naglsprod11 /]#
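A note on comparing the two outputs above: du -sh * run at / descends into the NFS share mounted at /mnt/nfs/backup, so the 531G reported for mnt is not directly comparable with the df figure for /. It may also be worth checking that the earlier du listing shows paths under /mnt/nfs/backups while df shows the share mounted at /mnt/nfs/backup; whether the former sits on the root volume is not confirmed in this thread. A minimal sketch, assuming GNU du and a POSIX awk, that compares what df reports with what du can reach on a single filesystem; the function names are illustrative.

```shell
# KB that the filesystem holding a path reports as used (what df shows).
fs_used_kb() { df -kP "$1" | awk 'NR == 2 { print $3 }'; }
# KB that du can reach below a path without crossing mount points (-x).
du_visible_kb() { du -skx "$1" 2>/dev/null | awk '{ print $1 }'; }

# Quick demo on a scratch directory; on the server run both against / as root.
demo=$(mktemp -d)
printf 'df used on filesystem: %s KB\n' "$(fs_used_kb "$demo")"
printf 'du -x visible below:   %s KB\n' "$(du_visible_kb "$demo")"
rmdir "$demo"

# Which filesystem holds a given path? (paths copied from the output above)
#   df -hP /mnt/nfs/backup /mnt/nfs/backups
```

If du -x on / still falls well short of df, files hidden underneath a mount point are another possibility; bind-mounting / onto an empty directory exposes them.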
Re: LOG server running out of space, but this is not Indexes
Did this output anything? Sometimes files can be deleted but still consume space because something still has them open.
Code: Select all
lsof | grep deleted
Continue down the path:
Code: Select all
cd /usr
du -sh *
cd /usr/nextlargestone
du -sh *
etc
That's the only way I know to find what exactly is consuming the space.
Re: LOG server running out of space, but this is not Indexes
No deleted files; /usr contains 190 GB of Elasticsearch data.
200 GB are invisible somewhere.
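For readers unfamiliar with the deleted-but-open case that the lsof check looks for, here is a small Linux-only sketch of it, assuming /proc is mounted; the scratch file and descriptor number 3 are arbitrary choices for the demo.

```shell
# Linux-only demo: a file that is unlinked while still open keeps its blocks.
tmp=$(mktemp)
exec 3>"$tmp"                      # hold the file open on descriptor 3
printf 'still here\n' >&3
rm -f "$tmp"                       # unlink it; df still counts the space
held=$(readlink /proc/self/fd/3)   # the kernel appends " (deleted)" to the path
printf '%s\n' "$held"
exec 3>&-                          # closing the descriptor frees the space

# Equivalent check across all processes (run as root to see everything);
# +L1 lists open files whose link count is below 1:
#   sudo lsof +L1
```

An empty result from that check, as reported here, rules this cause out but does not explain the gap by itself.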
Re: LOG server running out of space, but this is not Indexes
Please create a ticket for this and include a link back to this forum thread so we can get a remote session setup:
https://support.nagios.com/tickets/
Thank you!