NRPE periodically runs plugin without parameter
Posted: Thu Dec 11, 2014 4:14 pm
Hello!
I'm running Nagios 3.5.1 on two boxes that independently check the same hosts on the same network via NRPE (two Nagios servers only this week for testing; later there will be just one).
Both Nagios servers check whether a service is running on a host using the check_ps.sh plugin. The plugin takes a parameter "-p name-of-the-service" and returns OK if the service is running or CRITICAL if it is not.
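To make the plugin contract described above concrete, here is a minimal sketch of a process check that follows the standard Nagios exit codes. It is not the real check_ps.sh: the function name check_process and the use of pgrep are assumptions for illustration. Unlike the behaviour reported later in this post, the sketch returns UNKNOWN when -p is missing, so a call without the parameter could not be mistaken for OK.

```shell
#!/bin/sh
# Illustrative stand-in for a process check; NOT the real check_ps.sh.
# Nagios plugin exit codes: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.

check_process() {
    name=""
    OPTIND=1                      # reset so getopts can be reused across calls
    while getopts "p:" opt; do
        case "$opt" in
            p) name=$OPTARG ;;
            *) echo "UNKNOWN - usage: -p <process-name>"; return 3 ;;
        esac
    done

    # Refuse to guess when no process name was passed.
    if [ -z "$name" ]; then
        echo "UNKNOWN - no process name given (-p)"
        return 3
    fi

    # pgrep -x matches the exact process name (assumes procps pgrep is installed).
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "OK - Process: $name"
        return 0
    fi

    echo "CRITICAL - Process is not running!"
    return 2
}
```

Saved as a script, the last line would be something like `check_process "$@"; exit $?`, so that the exit status reaches NRPE.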
In nrpe.cfg on the host I have this line:
    command[check_ps]=/usr/lib64/nagios/plugins/check_ps.sh -p smg
where smg is the name of the service (process) I'm monitoring.
On both Nagios servers I have the same lines in commands.cfg:
    define command{
        command_name check_ps.sh
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c "check_ps" -t 300
    }
On one Nagios server everything is fine. The service has been down on the host for approximately two weeks, and the web interface shows this and raises alarms, as it should.
On the other Nagios server's web interface the service goes up and down, and the same is recorded in the logs. A small sample of the logs:
    12-08-2014 20:45:11 12-08-2014 20:53:11 0d 0h 8m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 7.0%, RAM: 0.0%, Start: 20:45, CPU Time: 0 min
    12-08-2014 20:53:11 12-08-2014 20:54:11 0d 0h 1m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
    12-08-2014 20:54:11 12-08-2014 20:56:11 0d 0h 2m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 3.0%, RAM: 0.0%, Start: 20:54, CPU Time: 0 min
    12-08-2014 20:56:11 12-08-2014 21:36:11 0d 0h 40m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
    12-08-2014 21:36:11 12-08-2014 21:43:11 0d 0h 7m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 12.0%, RAM: 0.0%, Start: 21:36, CPU Time: 0 min
    12-08-2014 21:43:11 12-08-2014 21:44:11 0d 0h 1m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
    12-08-2014 21:44:11 12-08-2014 21:46:11 0d 0h 2m 0s SERVICE OK (HARD) OK - Process: smg, User: root, CPU: 3.0%, RAM: 0.0%, Start: 21:44, CPU Time: 0 min
    12-08-2014 21:46:11 12-08-2014 22:24:11 0d 0h 38m 0s SERVICE CRITICAL (HARD) CRITICAL - Process is not running!
I have figured out that if I call ./check_ps.sh on the command line without any parameters, it always returns OK status, exactly as you can see in the logs.
If I call it with the parameter, it gives the correct result: Process is not running.
I assume that NRPE periodically calls the script without the parameter, and I do not understand why. It is also strange that this happens only for requests coming from one particular Nagios server.
Can I check this assumption somehow? Any ideas?
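One way the assumption might be tested is a small logging wrapper placed between NRPE and the plugin, so that every invocation records the arguments it actually received. This is only a sketch: the function name log_and_run, the wrapper file name, and the log path /tmp/check_ps_args.log are hypothetical, and the log file must be writable by the user NRPE runs as.

```shell
#!/bin/sh
# Hypothetical wrapper: record each invocation's arguments, then run the real plugin.
# Usage: log_and_run <plugin-path> <log-file> [plugin args...]

log_and_run() {
    plugin=$1
    log=$2
    shift 2

    # One line per call: timestamp, argument count, and the arguments themselves.
    printf '%s argc=%d args=[%s]\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$#" "$*" >> "$log"

    # Run the real plugin; its output and exit status pass through unchanged.
    "$plugin" "$@"
}

# Last line of the wrapper script (uncomment when saved as a file, e.g. check_ps_wrapper.sh):
# log_and_run /usr/lib64/nagios/plugins/check_ps.sh /tmp/check_ps_args.log "$@"
```

With command[check_ps] in nrpe.cfg temporarily pointed at the wrapper (keeping "-p smg"), any line in the log with argc=0 at the time of a false OK would support the assumption, while argc=2 on every line would point to the plugin itself. Setting debug=1 in nrpe.cfg should, as far as I know, also make NRPE log the commands it runs to syslog, which would show which server requested each check.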
Thank you