In a nutshell, when the NPCD daemon is terminated for one reason or another, the Nagios XI system status does not detect that this has occurred. To an admin looking at Nagios XI, the system looks OK: "six ticks along the top bar".
I discovered this on a test XI VM I deployed recently. I added a Windows server to test some new checks, and when I came back a day later no performance graphs were being generated.
I found that the "/usr/local/nagios/var/spool/perfdata" folder contained about 80,000 files.
When I had a look at "/usr/local/nagios/var/npcd.log" I found the following lines:
[06-07-2013 15:28:16] NPCD: WARN: MAX load reached: load 12.210000/10.000000 at i=0
[06-23-2013 08:59:08] NPCD: Caught Termination Signal - Hasta la vista... baby

I understand why the NPCD Daemon was terminated, this is not a discussion about max loads etc.
From what I can determine, when the NPCD daemon is terminated and the spool/perfdata files are building up, no Monitoring Engine Status, System Status check or dashlet identifies that there is a problem with NPCD.
So my suggestion is that the XI System Component Status, or the System OK status (or somewhere else), should include a check to alert the admin when the NPCD daemon is terminated or the spool/perfdata files are building up.
I could be wrong, but this is observed behaviour that I've finally been able to pinpoint.
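
To make the suggestion a bit more concrete, here is a rough sketch of the kind of check I have in mind, written as a stand-alone Nagios-style plugin in Python. It is not how XI does anything internally. The thresholds (1000 and 10000 files), the function names, and the assumption that the daemon shows up as a process named "npcd" under /proc on Linux are all mine; only the spool path comes from my system above.

```python
#!/usr/bin/env python3
"""Hypothetical check: is NPCD alive, and is the perfdata spool backing up?"""
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

SPOOL_DIR = "/usr/local/nagios/var/spool/perfdata"  # path from the post
WARN_FILES = 1000    # assumed threshold, tune for your system
CRIT_FILES = 10000   # assumed threshold, tune for your system


def count_spool_files(path):
    """Count regular files directly inside the spool directory."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_file())


def npcd_running(proc_root="/proc", name="npcd"):
    """Return True if a process with the given name exists (Linux /proc only)."""
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, pid, "comm")) as fh:
                if fh.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited or is not readable
    return False


def evaluate(file_count, running, warn=WARN_FILES, crit=CRIT_FILES):
    """Map the two observations to (exit_code, status_line_with_perfdata)."""
    perf = f"|spool_files={file_count};{warn};{crit}"
    if not running:
        return CRITICAL, (
            f"NPCD CRITICAL - daemon not running, {file_count} files in spool{perf}"
        )
    if file_count >= crit:
        return CRITICAL, f"NPCD CRITICAL - {file_count} files in spool{perf}"
    if file_count >= warn:
        return WARNING, f"NPCD WARNING - {file_count} files in spool{perf}"
    return OK, f"NPCD OK - {file_count} files in spool{perf}"


def main():
    """Run both checks, print the status line, and return the exit code."""
    try:
        count = count_spool_files(SPOOL_DIR)
    except OSError as exc:
        print(f"NPCD UNKNOWN - cannot read {SPOOL_DIR}: {exc}")
        return UNKNOWN
    code, message = evaluate(count, npcd_running())
    print(message)
    return code

# To use this as a plugin, end the script with:
#   import sys; sys.exit(main())
```

With something like this in place, my situation (daemon gone, about 80,000 files in the spool) would have come back CRITICAL instead of everything showing green. Checking both the process and the file count is deliberate: the daemon can be running but still falling behind, and a dead daemon should alert before the spool has had time to grow.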