check_by_ssh yields invalid status code, works manually
Posted: Mon Jun 10, 2013 1:20 pm
by rks
Hi -
I am using check_by_ssh to check the load on a remote (managed) host using this custom command:
Code: Select all
define command {
command_name check_ctv_apache_server
command_line $USER1$/check_by_ssh -l nagios -i /etc/nagios/.ssh/id_dsa -H $HOSTADDRESS$ -C "/opt/foo/nagios/libexec/check_load -w 80,85,90 -c 91,95,99" -E
}
The check works manually:
Code: Select all
# /usr/local/nagios/libexec/check_by_ssh -l nagios -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C "/opt/jabber/nagios/libexec/check_load -w 80,85,90 -c 91,95,99" -E
OK - load average: 0.06, 0.02, 0.04|load1=0.060;80.000;91.000;0; load5=0.020;85.000;95.000;0; load15=0.040;90.000;99.000;0;
#
But on the service dashboard the service is shown CRITICAL with the message:
[Attachment: Screen Shot 2013-06-10 at 11.15.40 AM.png]
Is there a way to turn on debugs to troubleshoot this? What is a common cause of such a problem?
thanks,
Re: check_by_ssh yields invalid status code, works manually
Posted: Mon Jun 10, 2013 1:52 pm
by abrist
After you run the check manually, what is the output of:
Re: check_by_ssh yields invalid status code, works manually
Posted: Fri Jun 14, 2013 4:05 pm
by rks
Re: check_by_ssh yields invalid status code, works manually
Posted: Mon Jun 17, 2013 10:42 am
by abrist
So the check is exiting correctly. You may need to escape the double quotes so that the shell does not parse them until the command has passed through ssh:
Code: Select all
command_line $USER1$/check_by_ssh -l nagios -i /etc/nagios/.ssh/id_dsa -H $HOSTADDRESS$ -C \"/opt/foo/nagios/libexec/check_load -w 80,85,90 -c 91,95,99\" -E
Re: check_by_ssh yields invalid status code, works manually
Posted: Wed Jun 19, 2013 1:52 pm
by rks
Thanks much for your help. This is what I have in commands:
Code: Select all
define command {
command_name check_ctv_apache_server
command_line $USER1$/check_by_ssh -l johndoe -i /etc/nagios/.ssh/id_dsa -H $HOSTADDRESS$ -C \"/opt/jabber/nagios/libexec/check_load -w 80,85,90 -c 91,95,99\" -E
}
define command {
command_name check_ctv_server_swap
command_line $USER1$/check_by_ssh -l johndoe -i /etc/nagios/.ssh/id_dsa -H $HOSTADDRESS$ -C \"/opt/jabber/nagios/libexec/check_swap -w 80,85,90 -c 91,95,99\" -E
}
But I now get a Usage error:
Code: Select all
# /usr/local/nagios/var/nagios.log
...
[1371667112] SERVICE ALERT: vns-iweprd-2;Swap;UNKNOWN;HARD;4;Usage:
[1371667112] SERVICE NOTIFICATION: ctvadmin;vns-iweprd-2;Swap;UNKNOWN;notify-service-by-email;Usage:
...
Is there a way to "rehearse" the command, that is, to run it exactly as Nagios would run it and view the output?
thanks,
Re: check_by_ssh yields invalid status code, works manually
Posted: Wed Jun 19, 2013 4:52 pm
by lmiltchev
You can run this in a terminal:
Code: Select all
su - nagios -c '<full path to your command>'
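When rehearsing a check this way, the exit status matters as much as the text output: plugins are expected to exit with 0 to 3, and Nagios generally reports anything else as an out-of-bounds return code. A minimal sketch of that mapping follows; `state_name` is a hypothetical helper, not part of Nagios, and the `sh -c 'exit 255'` line only stands in for a real check.

```shell
#!/bin/sh
# state_name is a hypothetical helper, not part of Nagios: it maps a
# plugin exit status to the state name Nagios would assign.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT OF BOUNDS ($1)" ;;
    esac
}

# Stand-in for a real check: a subshell that exits with status 255.
sh -c 'exit 255'
rc=$?
echo "exit status $rc -> $(state_name "$rc")"
```

In practice the stand-in line would be replaced by the `su - nagios -c '...'` call, with `$?` captured immediately afterwards.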
Re: check_by_ssh yields invalid status code, works manually
Posted: Fri Jun 21, 2013 5:38 pm
by rks
Hi -
I am out of ideas on this: I get $?=3 in one case and $?=255 in another. I may have no choice but to
hardcode the command and its parameters in a script on the target host and invoke that script using check_by_ssh.
I would hate to do that, as I would then be unable to pass the -w and -c parameters from the Nagios console.
Code: Select all
[root@nagiosconsole ~]# su - nagios -c '/usr/local/nagios/libexec/check_by_ssh -l nagios -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C \"/opt/jabber/nagios/libexec/check_swap -w 70% -c 60%\" -E'
/usr/local/nagios/libexec/check_by_ssh: invalid option -- 'w'
Usage:
check_by_ssh -H <host> -C <command> [-fqv] [-1|-2] [-4|-6]
[-S [lines]] [-E [lines]] [-t timeout] [-i identity]
[-l user] [-n name] [-s servicelist] [-O outputfile]
[-p port] [-o ssh-option] [-F configfile]
[root@nagiosconsole ~]# echo $?
3
[root@nagiosconsole ~]# su - nagios -c '/usr/local/nagios/libexec/check_by_ssh -l nagios -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C "/opt/jabber/nagios/libexec/check_swap -w 70% -c 60%" -E'
UNKNOWN - check_by_ssh: Remote command '/opt/jabber/nagios/libexec/check_swap -w 70% -c 60%' returned status 255
[root@nagiosconsole ~]# echo $?
255
[root@nagiosconsole ~]#
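The "invalid option -- 'w'" error in the first run is consistent with how the shell treats backslash-escaped quotes: they become literal characters, so the remote command is split on spaces and -w reaches check_by_ssh as one of its own options. A small sketch of that splitting, assuming `eval` parses the string the same way the shell started by `su -c` does; `count_args` and `/path/to/check_swap` are hypothetical stand-ins.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it
# received, standing in for check_by_ssh so the parsing is visible.
count_args() { echo "$#"; }

# Plain double quotes: the whole remote command is ONE argument to -C.
plain=$(eval 'count_args -C "/path/to/check_swap -w 70% -c 60%" -E')

# Backslash-escaped quotes: the quotes become literal characters, the
# shell splits on spaces, and -w / -c arrive as separate options.
escaped=$(eval 'count_args -C \"/path/to/check_swap -w 70% -c 60%\" -E')

echo "plain=$plain escaped=$escaped"
```

This prints `plain=3 escaped=7`, which suggests the unescaped form is the one to keep when the command is run through a shell.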
Any suggestions greatly appreciated!
thanks,
Re: check_by_ssh yields invalid status code, works manually
Posted: Mon Jun 24, 2013 11:04 am
by abrist
What happens when you run:
Code: Select all
/opt/jabber/nagios/libexec/check_swap -w 70% -c 60%
echo $?
Directly on the remote host?
Re: check_by_ssh yields invalid status code, works manually
Posted: Fri Jun 28, 2013 2:21 pm
by rks
On the monitored host, it works without a flaw:
Code: Select all
[rks@vns-iweprd-1 ~]$ /opt/jabber/nagios/libexec/check_swap -w 70% -c 60%
SWAP OK - 100% free (4094 MB out of 4094 MB) |swap=4094MB;2866;2456;0;4094
[rks@vns-iweprd-1 ~]$ echo $?
0
[rks@vns-iweprd-1 ~]$
Here is some additional data:
If I run it as root, it works fine (on my nagios manager):
Code: Select all
[root@ctvmanager ~]# who am i
root pts/0 2013-06-28 15:06 (sjc-rks-8916.cisco.com)
[root@ctvmanager ~]#
[root@ctvmanager ~]# /usr/local/nagios/libexec/check_by_ssh -l rks -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C /users/rks/check_load -E
OK - load average: 0.02, 0.04, 0.01|load1=0.020;75.000;90.000;0; load5=0.040;80.000;95.000;0; load15=0.010;85.000;99.000;0;
[root@ctvmanager ~]# echo $?
0
[root@ctvmanager ~]#
If I run the same command as the nagios user, I get a 255:
Code: Select all
[root@ctvmanager ~]# su - nagios -c '/usr/local/nagios/libexec/check_by_ssh -l rks -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C /users/rks/check_load -E'
UNKNOWN - check_by_ssh: Remote command '/users/rks/check_load' returned status 255
[root@ctvmanager ~]# echo $?
255
[root@ctvmanager ~]#
If I leave out the -E option, I get a 'host key verification failure':
Code: Select all
[root@ctvmanager ~]# su - nagios -c '/usr/local/nagios/libexec/check_by_ssh -l rks -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C /users/rks/check_load'
Remote command execution failed: Host key verification failed.
[root@ctvmanager ~]# echo $?
3
[root@ctvmanager ~]# su - nagios -c '/usr/local/nagios/libexec/check_by_ssh -v -l rks -i /etc/nagios/.ssh/id_dsa -H vns-iweprd-1 -C /users/rks/check_load -E'
Command: /usr/bin/ssh
Argument 1: -l
Argument 2: rks
Argument 3: -i
Argument 4: /etc/nagios/.ssh/id_dsa
Argument 5: vns-iweprd-1
Argument 6: /users/rks/check_load
UNKNOWN - check_by_ssh: Remote command '/users/rks/check_load' returned status 255
[root@ctvmanager ~]# echo $?
255
[root@ctvmanager ~]#
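Since ssh itself exits with 255 when the connection fails, and -E hides its stderr, the 255 above may come from the local ssh step (for example the host key failure shown without -E) rather than from the remote plugin. A minimal sketch of a local pre-check, meant to be run as the nagios user; `preflight` is a hypothetical helper, and the known_hosts location is an assumption about that user's home directory.

```shell
#!/bin/sh
# preflight is a hypothetical helper: it reports local problems that can
# make ssh itself fail (exit status 255) before the remote plugin runs.
# $1 = identity file, $2 = known_hosts file of the user running the check.
preflight() {
    status=0
    if [ ! -r "$1" ]; then
        echo "identity file not readable: $1"
        status=1
    fi
    if [ ! -s "$2" ]; then
        echo "known_hosts missing or empty: $2"
        status=1
    fi
    return "$status"
}

# Example with the key path from the thread; the known_hosts path is an
# assumption about where the current user's ssh files live.
preflight /etc/nagios/.ssh/id_dsa "$HOME/.ssh/known_hosts" || echo "preflight failed"
```

A non-empty known_hosts file does not prove the target host is listed in it; the sketch only catches the two simplest local causes.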
To sidestep this problem, could I run the Nagios services as root? Would that cause any problems, other than the security
vulnerabilities of running services as root?
thanks,
Re: check_by_ssh yields invalid status code, works manually
Posted: Mon Jul 01, 2013 10:53 am
by abrist
What are the permissions on:
Code: Select all
ls -la /users/rks/check_load
ls -la /users/