inode use maxed (300,000+) xidpe directory volume XI2014R2.0

Posted: Wed Feb 25, 2015 1:42 pm
by Pres-Gas
I seem to be having an issue with excessive inode usage in the xidpe directory of my trial version of NagiosXI 2014R2.0 (manually installed in a RHEL 6 64-bit VM). My org is working on purchasing a full license, but I am concerned about the cause of this issue and how to fix it. I found similar forum entries from 2012:

http://support.nagios.com/forum/viewtop ... f=6&t=5179
and
http://support.nagios.com/forum/viewtop ... f=6&t=6242

...but I am not convinced that is the issue. In the first thread the poster never confirmed the fix, and as for the second, all my cron jobs are running.

I have narrowed it down in the following manner:

Code:

[root@esnmon nagios]# find . -printf "%h\n" | cut -d/ -f-2 | sort | uniq -c | sort -rn
 382964 ./var
   1742 ./share
    126 ./libexec
     62 ./etc
     22 ./sbin
     11 ./bin
      8 .
[root@esnmon nagios]# cd var/
[root@esnmon var]# find . -printf "%h\n" | cut -d/ -f-2 | sort | uniq -c | sort -rn
 382880 ./spool
     70 ./archives
     12 .
      2 ./rw
      1 ./stats
[root@esnmon var]# cd spool/
[root@esnmon spool]# find . -printf "%h\n" | cut -d/ -f-2 | sort | uniq -c | sort -rn
 382877 ./xidpe
      4 .
[root@esnmon spool]# cd xidpe/
I also found the following in the logs for the cron/perfdataproc.php task:

Code:

Outbound data DISABLED Wed, 25 Feb 2015 13:18:01 -0500
sh: /bin/mv: Argument list too long
sh: /bin/mv: Argument list too long
sh: /bin/mv: Argument list too long
sh: /bin/mv: Argument list too long

DONE. Processed 0 files.
/usr/local/nagiosxi/var/perfdataproc.log (END) 
I hope to know why this is happening. Can it really be because the license has expired?

Additionally, since I am getting the "Argument list too long" message, can I safely "rm" these files to get things going again?

Thanks and look forward to hearing from the community.

Re: inode use maxed (300,000+) xidpe directory volume XI2014

Posted: Wed Feb 25, 2015 1:52 pm
by abrist
This issue has a number of common causes. I would suggest that you fix the issue by removing the files in the directory. As the directory holds too many files for a shell wildcard to expand (hence the "Argument list too long" errors), you will need to remove them with find:

Code:

cd /usr/local/nagios/var/spool/xidpe
find . -type f -delete
Afterwards, restart npcd:

Code:

service npcd restart
This is most commonly caused (among many other potential causes) by the npcd process stopping unexpectedly, by multiple nagios parent processes running at once, or by load/timeout thresholds for perfdata and npcd that are set too low. For the last case, you may want to follow the FAQs below to increase those values:
http://support.nagios.com/wiki/index.ph ... ta_Timeout
http://support.nagios.com/wiki/index.ph ... _Threshold
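
For anyone hitting the same "Argument list too long" error: it appears when a wildcard such as * expands into more arguments than the kernel accepts for a single command, which is why find works where mv and rm do not. Below is a minimal sketch of counting and then clearing a directory without any wildcard. The function names count_spool_files and purge_spool_files are made up for illustration and are not part of Nagios XI. Unlike the recursive command above, this version only touches files directly inside the directory, and the demo runs against a throwaway directory rather than the real spool.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Nagios XI.

# Count regular files directly inside a directory without expanding a glob.
count_spool_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Delete regular files directly inside a directory. find unlinks each file
# itself, so no long argument list is ever built.
purge_spool_files() {
    find "$1" -maxdepth 1 -type f -delete
}

# Demo on a scratch directory (NOT the real spool).
demo_dir=$(mktemp -d)
i=0
while [ "$i" -lt 50 ]; do
    : > "$demo_dir/perfdata.$i"
    i=$((i + 1))
done
echo "before: $(count_spool_files "$demo_dir")"
purge_spool_files "$demo_dir"
echo "after: $(count_spool_files "$demo_dir")"
rmdir "$demo_dir"
```

Running the count first gives a chance to confirm the path is the intended one before anything is deleted.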

Re: inode use maxed (300,000+) xidpe directory volume XI2014

Posted: Thu Feb 26, 2015 11:06 am
by Pres-Gas
Thanks very much. I ran the commands and will monitor the reaping of those files. Finally, what have I lost by deleting those files instead of somehow letting the scripts process them properly? Have we lost only some data, or is there more lost?

Re: inode use maxed (300,000+) xidpe directory volume XI2014

Posted: Thu Feb 26, 2015 11:28 am
by abrist
Pres-Gas wrote: Finally, what have I lost by deleting those files instead of somehow letting the scripts process them properly? Have we lost only some data, or is there more lost?
You have most likely lost the performance data for the window of time in question. You should still have all state data in the ndo database/nagios logs, so this should not affect availability reports etc.
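
To watch for the backlog coming back, a small check could be run from cron. This is only a sketch: the function name check_spool_backlog, the default limit of 1000 files, and the message wording are assumptions for illustration, not anything Nagios XI ships. It also assumes the daemon's process name is exactly npcd.

```shell
#!/bin/sh
# Hypothetical check; the threshold and messages are illustrative assumptions.

# Print a one-line status. Return 0 if the directory holds at most $2
# regular files (default 1000), 1 otherwise.
check_spool_backlog() {
    dir=$1
    limit=${2:-1000}
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    if [ "$count" -gt "$limit" ]; then
        echo "WARNING: $count files in $dir (limit $limit)"
        return 1
    fi
    echo "OK: $count files in $dir (limit $limit)"
    return 0
}

# Example run against the spool path used earlier in this thread, only if
# that path exists on this machine.
spool=/usr/local/nagios/var/spool/xidpe
if [ -d "$spool" ]; then
    check_spool_backlog "$spool" || true
    # pgrep -x matches the exact process name, assumed here to be "npcd".
    pgrep -x npcd >/dev/null || echo "WARNING: npcd does not appear to be running"
fi
```

The non-zero return on a backlog makes the function easy to wire into whatever alerting is already in place.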