
NRPE periodically runs plugin without parameter

Posted: Thu Dec 11, 2014 4:14 pm
by vvz
Hello!
I'm running Nagios 3.5.1 on two independent boxes that check the same hosts on the same network via NRPE (two Nagios servers only this week for some testing; later on there will be just one).
Both Nagios servers check whether a service is running on a host using the check_ps.sh plugin. This plugin takes a parameter "-p name-of-the-service" and returns OK if the service is running or CRITICAL if not.

In nrpe.cfg on the host I have the line
command[check_ps]=/usr/lib64/nagios/plugins/check_ps.sh -p smg
where smg is the name of the service (process) I'm monitoring.
On both Nagios servers I have the same lines in commands.cfg:
define command{
command_name check_ps.sh
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c "check_ps" -t 300
}
On one Nagios server everything is fine: the service has been down on the host for approximately two weeks, and the web interface shows this and raises alarms, as it should.
On the other Nagios server's web interface the service keeps going up and down, and the same is recorded in the logs. A small sample of the logs is below.
12-08-2014 20:45:11 12-08-2014 20:53:11 0d 0h 8m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 7.0%, RAM: 0.0%, Start: 20:45, CPU Time: 0 min
12-08-2014 20:53:11 12-08-2014 20:54:11 0d 0h 1m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
12-08-2014 20:54:11 12-08-2014 20:56:11 0d 0h 2m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 3.0%, RAM: 0.0%, Start: 20:54, CPU Time: 0 min
12-08-2014 20:56:11 12-08-2014 21:36:11 0d 0h 40m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
12-08-2014 21:36:11 12-08-2014 21:43:11 0d 0h 7m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 12.0%, RAM: 0.0%, Start: 21:36, CPU Time: 0 min
12-08-2014 21:43:11 12-08-2014 21:44:11 0d 0h 1m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
12-08-2014 21:44:11 12-08-2014 21:46:11 0d 0h 2m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 3.0%, RAM: 0.0%, Start: 21:44, CPU Time: 0 min
12-08-2014 21:46:11 12-08-2014 22:24:11 0d 0h 38m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
I have figured out that if I call ./check_ps.sh on the command line without any parameters, it always returns OK status, exactly as you can see in the logs.
If I call it with the parameter, it gives the correct result: Process is not running.

I assume that NRPE periodically calls the script without the parameter, and I do not understand why. It's also strange that this only happens for requests coming from one particular Nagios server.
Can I check this assumption somehow? Any ideas?

Thank you
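
One way to test the assumption from the Nagios server side is to run the same check_nrpe call repeatedly by hand and record what comes back each time. The following is only a minimal sketch: poll_check is a hypothetical helper name, not part of Nagios or NRPE, and it simply runs whatever command it is given.

```shell
#!/bin/sh
# poll_check COUNT INTERVAL COMMAND [ARGS...]
# (hypothetical helper) Runs COMMAND COUNT times, INTERVAL seconds apart,
# and prints one line per run: timestamp, exit code, first line of output.
poll_check() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Capture output and exit code without aborting on failure.
        out=$("$@" 2>&1) && rc=0 || rc=$?
        first=$(printf '%s\n' "$out" | head -n 1)
        printf '%s rc=%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$rc" "$first"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, poll_check 30 60 /usr/lib64/nagios/plugins/check_nrpe -H <host-address> -c check_ps -t 300 would run the check 30 times, one minute apart. The check_nrpe path is an assumption (whatever $USER1$ points to on that server). If every manual run returns CRITICAL (rc=2) while the web interface still shows OK periods, the problem is more likely in how that Nagios server is configured than in NRPE itself.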

Re: NRPE periodically runs plugin without parameter

Posted: Thu Dec 11, 2014 5:28 pm
by Box293
On the remote server with the issue, can you enable debug logging in the NRPE config? Restart NRPE, and the results should then end up in /var/log/messages.
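
Debug logging in NRPE is controlled by the debug option in nrpe.cfg. As a rough sketch, the change could be scripted like this; enable_nrpe_debug is a hypothetical helper name, and the config path in the usage note is an assumption that depends on how NRPE was installed.

```shell
#!/bin/sh
# enable_nrpe_debug FILE
# (hypothetical helper) Sets debug=1 in the given NRPE config file:
# rewrites an existing debug= line if present, otherwise appends one.
enable_nrpe_debug() {
    cfg=$1
    if grep -q '^debug=' "$cfg"; then
        # Write through a temp file, then copy back so the original
        # file keeps its owner and permissions.
        sed 's/^debug=.*/debug=1/' "$cfg" > "$cfg.tmp" &&
            cat "$cfg.tmp" > "$cfg" &&
            rm -f "$cfg.tmp"
    else
        printf 'debug=1\n' >> "$cfg"
    fi
}
```

Typical use would be enable_nrpe_debug /etc/nagios/nrpe.cfg as root (path assumed), followed by a restart of NRPE.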

Re: NRPE periodically runs plugin without parameter

Posted: Thu Dec 11, 2014 6:09 pm
by vvz
Thank you, Box293!
Let me add my notes for people who may read these posts.
It's not enough to add debug=1 to nrpe.cfg (at least on CentOS 6.4).
To see the debugging messages, do the following:
Edit /etc/rsyslog.conf
Find:
*.info;mail.none;authpriv.none;cron.none /var/log/messages
Change to:
*.info;mail.none;authpriv.none;cron.none;daemon.debug /var/log/messages

Now restart rsyslog and nrpe:

$ /etc/init.d/rsyslog restart
$ /etc/init.d/nrpe restart
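
Once the debug messages are reaching /var/log/messages, the lines relevant to this check can be filtered out. This is a small sketch only: nrpe_check_ps_lines is a hypothetical helper name, and it assumes the usual syslog tag format nrpe[pid]. The exact wording of NRPE's debug messages varies by version, so it matches only on the tag and the string check_ps.

```shell
#!/bin/sh
# nrpe_check_ps_lines LOGFILE
# (hypothetical helper) Prints syslog lines written by nrpe that mention
# check_ps. Assumes the syslog tag looks like "nrpe[1234]:".
nrpe_check_ps_lines() {
    grep 'nrpe\[' "$1" | grep 'check_ps'
}
```

Running nrpe_check_ps_lines /var/log/messages and comparing the timestamps with the OK periods in the Nagios log should show which command line NRPE actually executed for each request, and whether "-p smg" was present.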
We can close the thread, I believe.

Re: NRPE periodically runs plugin without parameter

Posted: Fri Dec 12, 2014 10:12 am
by cmerchant
Thanks, we'll go ahead and close this thread. Thanks.