check_by_ssh issue
Posted: Fri Jul 05, 2013 3:27 pm
by yaoyao
We use check_by_ssh to check load and swap usage on remote Solaris servers (local zones). I can run it manually from the Nagios server without problems:
$ ./check_by_ssh -H terra -C "/usr/local/nagios/libexec/check_load -w 8.0,8.0,8.0 -c 16.0,16.0,16.0"
OK - load average: 0.19, 0.23, 0.18|load1=0.191;8.000;16.000;0; load5=0.230;8.000;16.000;0; load15=0.184;8.000;16.000;0;
However, in the Nagios GUI, the check keeps returning "Remote command execution failed: ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol SUNWcry_installed: referenced symbol not found".
What did I do wrong?
Command.cfg shows:
# 'check_ssh_load' command definition
define command {
command_name check_ssh_load
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$"
}
The service is defined as:
define service{
use generic-service ; Name of service template to use
host_name terra
service_description Current Load
check_command check_ssh_load!8.0,8.0,8.0!16.0,16.0,16.0
}
Thanks for your help.
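The command and service definitions above can be read together by expanding the macros by hand. The following is only a sketch of that substitution, not how Nagios implements it: the value of $USER1$ is an assumption (it is normally set in resource.cfg and is not shown in the thread), and $HOSTADDRESS$ is assumed to resolve to the host name terra.

```shell
#!/bin/sh
# Sketch: expand the check_ssh_load macros by hand.
USER1="/usr/local/nagios/libexec"   # assumption: value of $USER1$ on the Nagios server
HOSTADDRESS="terra"                 # assumption: the address equals host_name
check_command='check_ssh_load!8.0,8.0,8.0!16.0,16.0,16.0'

# Split check_command on "!" into the command name, $ARG1$ and $ARG2$.
OLD_IFS=$IFS
IFS='!'
set -- $check_command
IFS=$OLD_IFS
command_name=$1
ARG1=$2
ARG2=$3

# Substitute the values into the command_line from the command definition.
expanded="$USER1/check_by_ssh -H $HOSTADDRESS -C \"/usr/local/nagios/libexec/check_load -w $ARG1 -c $ARG2\""
echo "$expanded"
```

Under these assumptions the expanded line matches the manual test apart from the path used to call check_by_ssh, which suggests the definitions themselves are consistent with the command that works on the command line.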
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 1:45 pm
by abrist
Are you testing this check as user 'nagios'?
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 2:01 pm
by yaoyao
I did test as the nagios user. It works fine if I run it manually; however, it fails when it runs via Nagios core. What might be the issue?
BTW, is there any delay in posting? I posted a reply earlier, but I didn't see it after 10 minutes. I'll see if this one goes through.
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 2:31 pm
by lmiltchev
The posts don't appear right away because they need to be moderated. It may take 10 - 15 min. before we can get to a post on a busy day.
You had two almost identical posts, so I approved the second one (the more "detailed" one). Try not to "double post". Thanks!
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 2:36 pm
by abrist
You may have to specify the full path to bash on the Solaris server.
Find that path, and then prepend it to the command (for example, if it is located at /bin/bash):
$USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/bin/bash /usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$"
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 3:05 pm
by yaoyao
Still not working.
define command {
command_name check_ssh_load
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/bin/bash /usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$"
}
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 3:19 pm
by abrist
Well, the quick and dirty fix did not work. It is very odd that you get a linking error only when the check is run from core, but not from the CLI. Let's check the linking on both the Nagios server and the Solaris box:
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 3:34 pm
by yaoyao
On the Nagios server (Solaris 10 zone):
$ ldd /usr/bin/ssh
libsocket.so.1 => /usr/lib/libsocket.so.1
libnsl.so.1 => /usr/lib/libnsl.so.1
libz.so.1 => /opt/csw/lib/libz.so.1
libz.so.1 (SUNW_1.1) => (version not found)
libcrypto.so.0.9.7 => /usr/sfw/lib/libcrypto.so.0.9.7
libgss.so.1 => /usr/lib/libgss.so.1
libc.so.1 => /usr/lib/libc.so.1
libmp.so.2 => /usr/lib/libmp.so.2
libmd.so.1 => /usr/lib/libmd.so.1
libscf.so.1 => /usr/lib/libscf.so.1
libcmd.so.1 => /usr/lib/libcmd.so.1
libdoor.so.1 => /usr/lib/libdoor.so.1
libuutil.so.1 => /usr/lib/libuutil.so.1
libgen.so.1 => /usr/lib/libgen.so.1
libcrypto_extra.so.0.9.7 => /usr/sfw/lib/libcrypto_extra.so.0.9.7
libm.so.2 => /usr/lib/libm.so.2
/platform/SUNW,Sun-Fire-V440/lib/libc_psr.so.1
/platform/SUNW,Sun-Fire-V440/lib/libmd_psr.so.1
On the clients (I've tried Solaris 8, 9 and 10 zones. Same issue: it works on the CLI, but doesn't work from core):
Solaris 8 zone:
$ ldd /usr/bin/ssh
/usr/lib/secure/s8_preload.so.1
libresolv.so.2 => /usr/lib/libresolv.so.2
libcrypto.so.0.9.8 => /usr/local/ssl/lib/libcrypto.so.0.9.8
librt.so.1 => /usr/lib/librt.so.1
libz.so => /usr/lib/libz.so
libsocket.so.1 => /usr/lib/libsocket.so.1
libnsl.so.1 => /usr/lib/libnsl.so.1
libc.so.1 => /usr/lib/libc.so.1
libdl.so.1 => /usr/lib/libdl.so.1
libgcc_s.so.1 => /usr/local/lib/libgcc_s.so.1
libaio.so.1 => /usr/lib/libaio.so.1
libmp.so.2 => /usr/lib/libmp.so.2
/usr/platform/sun4v/lib/libc_psr.so.1
Solaris 9 zone:
$ ldd /usr/bin/ssh
/usr/lib/secure/s9_preload.so.1
libsocket.so.1 => /lib/libsocket.so.1
libnsl.so.1 => /lib/libnsl.so.1
libz.so.1 => /lib/libz.so.1
libmd5.so.1 => /lib/libmd5.so.1
libgss.so.1 => /lib/libgss.so.1
libc.so.1 => /lib/libc.so.1
libdl.so.1 => /lib/libdl.so.1
libmp.so.2 => /lib/libmp.so.2
libxfn.so.2 => /lib/libxfn.so.2
libcmd.so.1 => /lib/libcmd.so.1
/usr/platform/sun4v/lib/libc_psr.so.1
Solaris 10 zone:
$ ldd /usr/bin/ssh
libsocket.so.1 => /lib/libsocket.so.1
libnsl.so.1 => /lib/libnsl.so.1
libz.so.1 => /usr/lib/libz.so.1
libcrypto.so.0.9.7 => /usr/sfw/lib/libcrypto.so.0.9.7
libgss.so.1 => /usr/lib/libgss.so.1
libc.so.1 => /lib/libc.so.1
libmp.so.2 => /lib/libmp.so.2
libmd.so.1 => /lib/libmd.so.1
libscf.so.1 => /lib/libscf.so.1
libcmd.so.1 => /lib/libcmd.so.1
libdoor.so.1 => /lib/libdoor.so.1
libuutil.so.1 => /lib/libuutil.so.1
libgen.so.1 => /lib/libgen.so.1
libcrypto_extra.so.0.9.7 => /usr/sfw/lib/libcrypto_extra.so.0.9.7
libm.so.2 => /lib/libm.so.2
/lib/libm/libm_hwcap1.so.2
/platform/sun4v/lib/libc_psr.so.1
/platform/sun4v/lib/libmd_psr.so.1
Re: check_by_ssh issue
Posted: Mon Jul 08, 2013 4:53 pm
by sreinhardt
Are you getting this error on all three of the machines, or just on the one (the Nagios server?) where ldd printed the lines below? Just to be certain, your Nagios server is also Solaris, correct?
libz.so.1 => /opt/csw/lib/libz.so.1
libz.so.1 (SUNW_1.1) => (version not found)
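Lines like the two quoted above are easy to miss in a long ldd listing. As a minimal sketch, a small filter can keep only the entries that the runtime linker could not resolve. The function names are hypothetical, and the sample input is a shortened copy of the server output posted earlier in the thread rather than a live ldd run.

```shell
#!/bin/sh
# Keep only the ldd lines that report a missing library or version.
unresolved_libs() {
    grep 'not found'
}

# Shortened copy of the Nagios server's "ldd /usr/bin/ssh" output from this thread.
sample_ldd_output() {
cat <<'EOF'
libsocket.so.1 => /usr/lib/libsocket.so.1
libz.so.1 => /opt/csw/lib/libz.so.1
libz.so.1 (SUNW_1.1) => (version not found)
libcrypto.so.0.9.7 => /usr/sfw/lib/libcrypto.so.0.9.7
EOF
}

sample_ldd_output | unresolved_libs
```

On a real system the same filter would be fed from ldd directly, for example `ldd /usr/bin/ssh | unresolved_libs`.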
Re: check_by_ssh issue
Posted: Tue Jul 09, 2013 8:22 am
by yaoyao
The Nagios server is a Solaris 10 zone. The ldd output above is from the server and the three clients.
The check_by_ssh command works fine on the command line from the Nagios server to all three clients; however, when it runs from core, it gives the error.
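Since the same command behaves differently from the command line and from core, one thing that could be compared is the environment each one runs with. The thread does not establish the cause, so this is only an assumption worth ruling out: variables such as LD_LIBRARY_PATH or LD_PRELOAD change which libraries the runtime linker picks for /usr/bin/ssh. The sketch below prints those variables. The function name is hypothetical, and the idea would be to run it once from an interactive shell as the nagios user and once from a temporary Nagios command, then compare the two outputs.

```shell
#!/bin/sh
# Print the variables that influence the runtime linker,
# showing "<unset>" for any that are absent from the environment.
show_linker_env() {
    for name in LD_LIBRARY_PATH LD_LIBRARY_PATH_32 LD_LIBRARY_PATH_64 LD_PRELOAD; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\"\${$name-<unset>}\""
        printf '%s=%s\n' "$name" "$value"
    done
}

show_linker_env
```

If the two outputs differ, running ldd /usr/bin/ssh under each set of values would show whether ssh resolves its libraries differently in the two cases.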