Check_by_ssh issue

Posted: Thu Aug 18, 2016 3:09 am
by MarMottE
Hello,

I have installed a Nagios XI server and everything is working except one check: check_by_ssh.

When I am connected to the server by SSH or console, the check_by_ssh checks are up with no problem, but when I close my SSH connection those checks go down after 2 minutes.

The error messages are:

Return code of 255 is out of bounds
or
remote command execution failed : warning : identity file /home/nagios/.ssh.id_dsa not accessible : no such file or directory
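
For context on the first message: Nagios accepts only plugin exit statuses 0 to 3, and ssh itself exits with 255 when it hits an error of its own (for example, a failed connection or authentication). The sketch below only illustrates that mapping; the function name nagios_state is hypothetical, and it does not establish where the 255 in this thread came from.

```shell
#!/bin/sh
# Map an exit status to the state Nagios would report for it.
# 0-3 are the only statuses Nagios treats as valid plugin results.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_BOUNDS" ;;   # e.g. 255, which ssh uses for its own errors
    esac
}

# Classify the status quoted in the error message.
nagios_state 255
```

Running the check by hand as the nagios user and then printing `$?` shows which status the plugin actually returned.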

I have tried a few possible solutions:

Adding the -i option with the path to the private key: same problem.
Adding the -E option at the end of the command: same problem.

I installed my server on Ubuntu 14.04 LTS.

Do you have any ideas?

Thanks in advance

Re: Check_by_ssh issue

Posted: Thu Aug 18, 2016 9:58 am
by rkennedy
Take a look at this document, which should have all of the information for getting check_by_ssh working - https://assets.nagios.com/downloads/nag ... ng_SSH.pdf

Re: Check_by_ssh issue

Posted: Fri Aug 19, 2016 12:13 am
by MarMottE
Hello

Thanks for the link, but I have already configured SSH correctly and copied the public key to my remote server.

My checks work when I am connected to my Nagios server, but when I disconnect, the checks go down after 2 minutes.

Re: Check_by_ssh issue

Posted: Fri Aug 19, 2016 11:24 am
by lmiltchev
Can you show us the config of the "problem" service and the "check_by_ssh" command (if you modified it in any way)?

Also, run the following commands and show the output:

On the Nagios XI server:

su nagios
ssh nagios@remoteip
On the client (remote machine):

ls -la /home/nagios/.ssh
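
Since the error message says the identity file is not accessible, it may also be worth checking the private key on the Nagios XI server itself. The following is a minimal sketch, assuming GNU stat (as on Ubuntu); the function name check_identity_file and its status labels are hypothetical, not part of any Nagios tool.

```shell
#!/bin/sh
# Report whether an identity file looks usable by the current user.
# Prints one of: MISSING, UNREADABLE, TOO_OPEN, OK
check_identity_file() {
    key="$1"
    if [ ! -f "$key" ]; then
        echo "MISSING"; return 1
    fi
    if [ ! -r "$key" ]; then
        echo "UNREADABLE"; return 1
    fi
    # GNU stat: %a prints the octal permission bits, e.g. 600.
    mode=$(stat -c %a "$key")
    case "$mode" in
        *00|0) echo "OK" ;;               # no group/other permission bits set
        *)     echo "TOO_OPEN"; return 1 ;;
    esac
}

# Usage (run as the nagios user, with the path passed to -i):
#   check_identity_file /home/nagios/.ssh/id_dsa
```

Running it once while logged in and once from a job that starts after logout (cron, for instance) would show whether the file's visibility depends on an open session.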

Re: Check_by_ssh issue

Posted: Mon Aug 22, 2016 3:28 am
by MarMottE
Hello,

Here is the output of the commands.

On Nagios XI:

root@:/home/nagios# su - nagios
nagios@:~$ ssh nagios@xxxxxx
Last login: Fri Aug 19 23:08:02 2016 from xxxxxx
[Expert@:0]#



On the remote server:

[Expert@:0]# ls -la /home/nagios/.ssh/
total 12
drwx------ 2 admin root 4096 Aug 8 09:22 .
drwx------ 3 admin root 4096 Aug 19 22:14 ..
-rw------- 1 admin root 605 Jul 18 15:38 authorized_keys


The user nagios is in the admin group.



==========================


Check_by_ssh




Right now I am connected to the Nagios server and my checks are up:

define command{
command_name cpu_load_fw
command_line /usr/local/nagios/libexec/check_by_ssh -i /home/nagios/.ssh/id_dsa -H '$HOSTADDRESS$' -C "checkpoint_cpu $ARG1$ $ARG2$" -E
}

xxxxxx
CPU Load
Perform Extra Service Actions
OK 08-22-2016 10:24:29 0d 0h 3m 50s 1/3 CPU is OK - 1%

But if I close my connection, the check goes down.

Re: Check_by_ssh issue

Posted: Mon Aug 22, 2016 10:44 am
by rkennedy
What OS is this machine running? [Expert@:0] Is the storage persistent?

Re: Check_by_ssh issue

Posted: Mon Aug 22, 2016 10:52 am
by lmiltchev
You clearly deviated from our official documentation.
[Expert@:0]# ls -la /home/nagios/.ssh/
total 12
drwx------ 2 admin root 4096 Aug 8 09:22 .
drwx------ 3 admin root 4096 Aug 19 22:14 ..
-rw------- 1 admin root 605 Jul 18 15:38 authorized_keys
You are missing files, permissions are wrong... Here's what I see on a "working" system:

[root@192 .ssh]# ls -la /home/nagios/.ssh/
total 20
dr-xr-xr-x  2 nagios nagios 4096 Nov  9  2015 .
dr-xr-xr-x. 5 nagios nagios 4096 Nov  9  2015 ..
-rw-------  1 nagios nagios  410 Apr 13 10:49 authorized_keys
-rw-------  1 nagios nagios 1671 Nov  9  2015 id_rsa
-rw-r--r--  1 nagios nagios  410 Nov  9  2015 id_rsa.pub
Revisit our document here:
https://assets.nagios.com/downloads/nag ... ng_SSH.pdf
Make sure you follow it closely.
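
The ownership difference between the two listings (admin/root versus nagios/nagios) can be checked with a small script. This is a sketch only, assuming GNU stat is available on the machine being checked; the function name owned_by_uid is hypothetical.

```shell
#!/bin/sh
# Compare the numeric owner of a path with an expected uid.
# Prints OK, MISSING, or WRONG_OWNER uid=<actual uid>.
owned_by_uid() {
    path="$1"; want="$2"
    if [ ! -e "$path" ]; then
        echo "MISSING"; return 1
    fi
    # GNU stat: %u prints the numeric uid of the owner.
    have=$(stat -c %u "$path")
    if [ "$have" = "$want" ]; then
        echo "OK"
    else
        echo "WRONG_OWNER uid=$have"; return 1
    fi
}

# Usage (on the remote machine):
#   owned_by_uid /home/nagios/.ssh "$(id -u nagios)"
#   owned_by_uid /home/nagios/.ssh/authorized_keys "$(id -u nagios)"
```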

Re: Check_by_ssh issue

Posted: Tue Aug 23, 2016 4:41 am
by MarMottE
rkennedy wrote:What OS is this machine running? [Expert@:0] Is the storage persistent?
It's a Check Point firewall.

lmiltchev wrote:You clearly deviated from our official documentation.
[Expert@:0]# ls -la /home/nagios/.ssh/
total 12
drwx------ 2 admin root 4096 Aug 8 09:22 .
drwx------ 3 admin root 4096 Aug 19 22:14 ..
-rw------- 1 admin root 605 Jul 18 15:38 authorized_keys
You are missing files, permissions are wrong... Here's what I see on a "working" system:

[root@192 .ssh]# ls -la /home/nagios/.ssh/
total 20
dr-xr-xr-x  2 nagios nagios 4096 Nov  9  2015 .
dr-xr-xr-x. 5 nagios nagios 4096 Nov  9  2015 ..
-rw-------  1 nagios nagios  410 Apr 13 10:49 authorized_keys
-rw-------  1 nagios nagios 1671 Nov  9  2015 id_rsa
-rw-r--r--  1 nagios nagios  410 Nov  9  2015 id_rsa.pub
Revisit our document here:
https://assets.nagios.com/downloads/nag ... ng_SSH.pdf
Make sure you follow it closely.

I can't change the owner of the directory on the Check Point firewall, but the user nagios is a member of the root group and has admin rights.
I know it's strange, but without this configuration the SSH connection does not work for the user nagios.

And I think it is working, because my checks are up when I am connected to the Nagios server by SSH or console, but when I close my session the checks go down.

Re: Check_by_ssh issue

Posted: Tue Aug 23, 2016 11:48 am
by rkennedy
Is the storage persistent? It sounds like it may not be, or perhaps this device handles SSH sessions a certain way.

To rule out permissions, what is the output of ls -l /home/ and ls -l /home/nagios/?

Re: Check_by_ssh issue

Posted: Wed Aug 24, 2016 1:37 am
by MarMottE
Hello,

I have the same problem with a Linux server, so maybe we can focus on that first.

From Nagios XI:

nagios@nagiosserver:~$ ssh nagios@xxxx
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-42-generic x86_64)

* Documentation: https://help.ubuntu.com/

System information as of Wed Aug 24 08:27:18 CEST 2016

System load: 0.17 Processes: 180
Usage of /: 34.6% of 12.39GB Users logged in: 0
Memory usage: 30% IP address for eth0: 10.10.29.34
Swap usage: 5%

Graph this data and manage this system at:
https://landscape.canonical.com/

18 packages can be updated.
9 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
nagios@:~$


From the remote server:

nagios@remoteserver:~$ ls -la /home/nagios/.ssh
total 12
drwx------ 2 nagios nagios 4096 May 3 16:36 .
drwxr-xr-x 4 nagios nagios 4096 May 3 16:37 ..
-rw------- 1 nagios nagios 604 May 3 16:36 authorized_keys


The check_by_ssh command definition:

# 'ssh_disk' command definition
define command{
command_name ssh_disk
command_line /usr/local/nagios/libexec/check_by_ssh -i /home/nagios/.ssh/id_dsa -H '$HOSTADDRESS$' -E 1 -C "/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -e -p '$ARG3$'"
}
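
To test a definition like this outside Nagios, it can help to expand the macros by hand and run the resulting line as the nagios user. A minimal sketch follows; the function name expand_ssh_disk and the sample host and thresholds are hypothetical, and the paths and options are copied from the definition above.

```shell
#!/bin/sh
# Print the ssh_disk command line with $HOSTADDRESS$, $ARG1$, $ARG2$
# and $ARG3$ replaced by the given values, for manual testing.
expand_ssh_disk() {
    host="$1"; warn="$2"; crit="$3"; part="$4"
    printf '%s\n' "/usr/local/nagios/libexec/check_by_ssh -i /home/nagios/.ssh/id_dsa -H '$host' -E 1 -C \"/usr/lib/nagios/plugins/check_disk -w $warn -c $crit -e -p '$part'\""
}

# Sample values only (192.0.2.10 is a documentation address).
expand_ssh_disk 192.0.2.10 20% 10% /
```

The printed line can be pasted into a shell started with `su - nagios`, followed by `echo $?` to see the exit status that Nagios would receive.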


nagios@remoteserver:~$ ls -la /usr/lib/nagios/plugins/check_disk
-rwxr-xr-x 1 root root 60568 Mar 12 2014 /usr/lib/nagios/plugins/check_disk


Right now the check is up because I am connected to my server:

/
Perform Extra Service Actions
OK 08-24-2016 08:35:57 0d 0h 9m 24s 1/3 DISK OK

==============================


rkennedy wrote:Is the storage persistent? It sounds like it may not be, or perhaps this device handles SSH sessions a certain way.

To rule out permissions, what is the output of ls -l /home/ and ls -l /home/nagios/?

nagios@remoteserver:~$ ls -l /home/
total
drwxr-xr-x 4 nagios nagios 4096 May 3 16:37 nagios


nagios@remoteserver:~$ ls -la /home/nagios/
total 20
drwxr-xr-x 4 nagios nagios 4096 May 3 16:37 .
drwxr-xr-x 5 root root 4096 May 3 16:35 ..
-rw------- 1 nagios nagios 46 Jul 22 17:57 .bash_history
drwx------ 2 nagios nagios 4096 May 3 16:37 .cache
drwx------ 2 nagios nagios 4096 May 3 16:36 .ssh