
check_by_ssh question

Posted: Mon Jan 09, 2017 10:47 am
by cbeattie-unitrends
Hello,

I'm trying to run a service check on a remote host using check_by_ssh. I have copied the host keys for passwordless ssh. The remote host does not have a nagios user, but the -l option does what I need it to, mostly.

The check works if I run it from my Nagios XI host while logged in as nagios:

Code:

[nagios@lal-nagios libexec]$ whoami
nagios
[nagios@lal-nagios libexec]$ ./check_by_ssh -H lal-01 -C '/usr/local/nagios/libexec/check_ro_mounts.sh' -l root ; echo $?
No read-only filesystems
0
[nagios@lal-nagios libexec]$

The check also works if I run it from Core Config Manager using Run Check Command in the service's config.
rofs_run_check_command.png
However, the service check in normal use reports that host key verification fails.
rofs_service_status_detail.png
Shouldn't both of those either work or not work at the same time? I feel like I'm overlooking something obvious.

Re: check_by_ssh question

Posted: Mon Jan 09, 2017 11:41 am
by dwhitfield
Let's do a couple of sanity checks first.

What's the output of /usr/local/nagios/libexec/check_by_ssh -H remoteip -C "uptime"

What's the output of ls -la /home/nagios?

Re: check_by_ssh question

Posted: Mon Jan 09, 2017 12:56 pm
by cbeattie-unitrends
check_by_ssh works from the command line when run by root and also when run by nagios with the -l option. I have the -l option defined in my service check.

Code:

[root@lal-nagios ~]# /usr/local/nagios/libexec/check_by_ssh -H lal-01 -C "uptime"
 12:52:27 up 227 days, 23:12,  0 users,  load average: 2.42, 1.99, 1.90
[root@lal-nagios ~]# su - nagios
Last login: Mon Jan  9 12:47:49 EST 2017 on pts/0
[nagios@lal-nagios ~]$ /usr/local/nagios/libexec/check_by_ssh -H lal-01 -C "uptime" -l root
 12:52:53 up 227 days, 23:13,  0 users,  load average: 2.35, 2.01, 1.91
[nagios@lal-nagios ~]$ ls -la /home/nagios/
total 68
drwx------. 5 nagios nagios 4096 Dec 28 14:55 .
drwxr-xr-x. 3 root   root     19 Nov  8 11:17 ..
-rw-rw-r--  1 nagios nagios  518 Nov 28 15:33 add_backups_hosts.sh
-rw-rw-r--  1 nagios nagios  529 Nov 25 12:57 add_hosts.sh
-rw-rw-r--  1 nagios nagios  518 Nov 28 16:28 add_snmpv2c_hosts.sh
-rw-------  1 nagios nagios 3433 Jan  9 12:50 .bash_history
-rw-r--r--. 1 nagios nagios   18 Nov 20  2015 .bash_logout
-rw-r--r--. 1 nagios nagios  193 Nov 20  2015 .bash_profile
-rw-r--r--. 1 nagios nagios  231 Nov 20  2015 .bashrc
drwx------  3 nagios nagios   17 Dec  1 08:48 .config
-rw-r--r--  1 nagios nagios  206 Jan  9 12:52 cookie.txt
-rw-rw-r--  1 nagios nagios   22 Dec 28 14:55 file
-rw-rw-r--  1 nagios nagios  294 Nov 25 17:02 fix_snmpd_loop2.sh
-rw-rw-r--  1 nagios nagios  106 Nov 25 16:56 fix_snmpd_loop.sh
-rw-rw-r--  1 nagios nagios  214 Nov 25 16:50 fix_snmpd.sh
-rw-rw-r--  1 nagios nagios  103 Nov 30 16:45 loop_remove_hosts.sh
-rw-rw-r--  1 nagios nagios  103 Nov 16 17:08 loop.sh
drwxr-----. 3 nagios nagios   18 Nov  8 11:22 .pki
drwx------  2 nagios nagios   54 Jan  6 16:44 .ssh
-rw-rw-r--  1 nagios nagios  127 Nov 28 11:38 start_snmpd.sh
-rw-rw-r--  1 nagios nagios  103 Nov 25 12:52 start_vmtoolsd.sh
[nagios@lal-nagios ~]$

Re: check_by_ssh question

Posted: Mon Jan 09, 2017 2:36 pm
by dwhitfield
In the Service Status Detail, if you click on the tab a couple over called "Configure" and then "Re-configure this service," can you post a screenshot of what it says there? Thanks!

Re: check_by_ssh question

Posted: Mon Jan 09, 2017 3:29 pm
by cbeattie-unitrends
Sure thing! I have to view it in Core Config Manager because of advanced configuration, but this works just fine if I click on the Run Check Command at the bottom and give it the name of the remote host I've installed the script on.
rofs_service_management.png

Re: check_by_ssh question

Posted: Mon Jan 09, 2017 4:28 pm
by lmiltchev
You showed us that the check worked from the command line when run as the nagios user, but were you asked for root's password when you ran it? Did you copy the nagios public key (/home/nagios/.ssh/id_rsa.pub) from the Nagios XI server into root's authorized_keys file (/root/.ssh/authorized_keys) on the remote box?

I am not sure if I tested the same plugin (https://exchange.nagios.org/index.php?o ... &Itemid=74) but it seemed to work fine for me in the GUI.
example05.PNG

Re: check_by_ssh question

Posted: Tue Jan 10, 2017 9:37 am
by cbeattie-unitrends
1. No, I wasn't asked for root's password.
2. Yes, I did copy nagios' public key to the remote host's root's authorized_keys file.

Code:

[nagios@lal-nagios ~]$ md5sum /home/nagios/.ssh/id_rsa.pub ; /usr/local/nagios/libexec/check_by_ssh -H lal-01 -C 'tail -1 /root/.ssh/authorized_keys | md5sum' -l root ; echo $?
de95b4eaf416e10eb6421c3336df5601  /home/nagios/.ssh/id_rsa.pub
de95b4eaf416e10eb6421c3336df5601  -
0

My plugin is similarly named but different. The remote hosts don't have the necessary Perl modules installed to run that one. It's just a Bash script in this case.

Code:

#!/bin/bash
romounts=$( awk '/ro,/ && !/tmpfs/' /proc/mounts )
if [ -z "$romounts" ]
then
    echo "No read-only filesystems"
    exit 0
else
    echo "$romounts"
    exit 2
fi

It works from the command line, and from Run Check Command in CCM, just not as the actual service check. I should also mention that this is using Nagios XI 5.3.3. I didn't see anything in the changelog for newer versions that would appear to address this, but I am building a clone of my 5.3.3 Nagios host as an upgrade exercise.
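
A side note on the script itself, unrelated to the host key problem: the pattern /ro,/ matches the text "ro," anywhere on the line, so a read-write mount carrying an option such as errors=remount-ro may be reported, and a mount whose only option is "ro" may be missed. A minimal sketch of a stricter test follows, assuming the usual /proc/mounts layout (device, mount point, type, options). The function name list_ro_mounts and its optional file argument are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print mounts whose option list contains "ro"
# as a whole option. The file argument exists only so the function can
# be tried against a sample file; it defaults to /proc/mounts.
list_ro_mounts() {
    awk '$3 !~ /tmpfs/ {
        # field 4 is the comma-separated option list
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print; break }
    }' "${1:-/proc/mounts}"
}
```

The plugin could then set romounts=$( list_ro_mounts ) and keep the same exit codes as before.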

[SOLVED] check_by_ssh question

Posted: Tue Jan 10, 2017 3:24 pm
by cbeattie-unitrends
I figured my problem out. The /home/nagios/.ssh/known_hosts file has the short host names, but Nagios itself is configured with the FQDNs. That's why the checks from the CLI and CCM worked, but the running service check did not: it was considered a different host name when Nagios was doing the checks. I just have to add the hosts by FQDN to Nagios' known_hosts file.
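
For anyone who lands here with the same symptom, this is a rough sketch of how to check which names a known_hosts file actually lists. It assumes the entries are stored in plain text (not hashed), and the function name known_hosts_has is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the known_hosts file lists the given
# name in plain text. Hashed entries (lines starting with "|1|") are
# not handled; "ssh-keygen -F <name> -f <file>" covers those.
known_hosts_has() {
    local name="$1" file="$2"
    awk -v want="$name" '
        /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
        {
            # an optional "@marker" pushes the host list to field 2
            field = ($1 ~ /^@/) ? $2 : $1
            n = split(field, hosts, ",")
            for (i = 1; i <= n; i++)
                if (hosts[i] == want) { found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}
```

Running known_hosts_has against /home/nagios/.ssh/known_hosts with both the short name and the FQDN shows which one is missing. As the nagios user, something like ssh-keyscan lal-01.example.com >> /home/nagios/.ssh/known_hosts would add the missing entry (lal-01.example.com is a placeholder FQDN; verify the key fingerprint before trusting it). Passing the same FQDN to check_by_ssh -H on the command line should also reproduce what the scheduled check sees.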

Thanks for the help! You can lock this thread if you like.

Re: check_by_ssh question

Posted: Tue Jan 10, 2017 4:32 pm
by mcapra
Thanks for sharing your solution! Closing this as the issue is resolved.