NPCD System Status Issue

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rseiwert
Posts: 196
Joined: Wed Jun 22, 2011 10:33 pm
Location: Somewhere between Here and Now

NPCD System Status Issue

Post by rseiwert »

While this might seem a lot like some of my other recent topics, it is something completely different. Last night I noticed my graphs had stopped updating. I checked, and npcd was reported as running. Further investigation showed it was not. This is yet another case where XI system status does not report the true state of a service.

Checking from the command line:
[root@nagios nagios]# /etc/init.d/npcd status
NPCD running (pid 1519).
[root@nagios nagios]# ps -ef | grep 1519 | grep -v grep
root 1519 1 0 Apr21 ? 00:00:02 crond
root 64932 1519 0 00:23 ? 00:00:00 CROND
root 64933 1519 0 00:23 ? 00:00:00 CROND
root 64934 1519 0 00:23 ? 00:00:00 CROND
root 64935 1519 0 00:23 ? 00:00:00 CROND
root 64936 1519 0 00:23 ? 00:00:00 CROND

The PID that the init script reported for npcd actually belonged to crond. So after I clicked the gear and restarted npcd, you can guess what happened next: crond was killed, and all the Nagios cron jobs stopped running at that point.
[root@nagios nagios]# ps -ef | grep crond | grep -v grep
[root@nagios nagios]#

This is yet another time where sysstat.php (what drives those green checks) reported bogus info, and where the XI interface killed off a critical system component because it looked at a PID in a file without bothering to check whether that PID really was the expected process. These system health indicators need to do more than trust the init script. Improving the init.d script is the first step, but if there is stale performance data queuing up and not being processed, then maybe the performance grapher is not running and doesn't deserve a green check mark.
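The check described above, confirming that the PID in a PID file really belongs to the expected process, can be sketched roughly as follows. This is only an illustration, not the actual XI init script or sysstat.php logic: the function names `proc_name` and `pidfile_matches` are made up for this example, and it assumes a system where `/proc/<pid>/comm` is readable or `ps -p` is available.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Nagios XI.

# Print the command name of PID $1, or nothing if no such process exists.
proc_name() {
    if [ -r "/proc/$1/comm" ]; then
        cat "/proc/$1/comm"
    else
        ps -p "$1" -o comm= 2>/dev/null
    fi
}

# Succeed only if the PID recorded in PID file $1 belongs to a
# running process whose command name is exactly $2.
pidfile_matches() {
    pidfile=$1
    expected=$2
    [ -r "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    [ "$(proc_name "$pid")" = "$expected" ]
}
```

With something like this, a status or restart action could run `pidfile_matches /path/to/npcd.pid npcd` first (the PID file path here is a placeholder) and refuse to signal the PID when the name does not match, which would have left crond alone in the case above.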
Grumpy Olde IT Guy
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: NPCD System Status Issue

Post by jdalrymple »

You're right. The method we're using (calling the init script) is obviously inadequate for getting valid data on the service. We'll either have to update the init scripts ourselves or have the poller work around it.

I'll share your situation with the devs and file a bug.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: NPCD System Status Issue

Post by Box293 »

I made some feature requests about providing more "localhost" service checks to detect problems like spooled check results building up. Feel free to try them out, and if you think they could be useful, comment on them in the tracker.

http://tracker.nagios.com/view.php?id=635
http://tracker.nagios.com/view.php?id=636
http://tracker.nagios.com/view.php?id=641
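As a rough idea of what a "spool is building up" check could look like (this is not the code from the tracker items above, just a minimal sketch): count files in a spool directory that have not been touched for some number of minutes, then map that count to the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function names, thresholds, and any directory you point it at are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not an official Nagios plugin.

# Count regular files under directory $1 last modified more than $2 minutes ago.
count_stale_files() {
    find "$1" -type f -mmin +"$2" 2>/dev/null | wc -l | tr -d '[:space:]'
}

# Print a status line for stale-file count $1 and return the matching
# plugin exit code, using warning threshold $2 and critical threshold $3.
spool_status() {
    count=$1
    warn=$2
    crit=$3
    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count stale spool files"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count stale spool files"
        return 1
    fi
    echo "OK - $count stale spool files"
    return 0
}
```

A check command could then do something like `spool_status "$(count_stale_files /path/to/perfdata/spool 15)" 1 50`, where the path and thresholds are placeholders to adjust for the local install.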