ssh check works by command line but not in nagios gui
Hi,
I have a strange problem.
I have an SSH check that monitors an NFS file system.
It works perfectly from the command line:
[nagios@prdnagxi03 ~]$ /usr/local/nagios/libexec/check_by_ssh -H prdstldb11.xxx.net -C "/usr/lib/nagios/plugins/check_disk -u GB -w 10% -c 5% -p /nfs/ora_log"
DISK OK - free space: /nfs/ora_log 9 GB (99% inode=99%);| /nfs/ora_log=0GB;9;9;0;10
It works in the setup part of the check but fails in the XI GUI. Can you help me troubleshoot?
Re: ssh check works by command line but not in nagios gui
Can you log in to the XI server, go to the /usr/local/nagios/etc/services folder, find the service file that is defined for the prdstldb11.xxx.net host, and post the service details here so we can view them?
I suspect that the objects in the Core Config Manager are not in sync with the running config, and that this is causing the issue.
To fix that, log in to the XI GUI and go to the Core Config Manager.
Under "Tools", click "Write Config Files", or if you are running a newer version of XI, the menu is called "Config File Management".
Click the "Write" button, then the "Delete" button, then the "Write" button again, and then the "Verify" button.
If you get any errors, resolve them and repeat "Delete", "Write", "Verify" until all of the errors are resolved.
After ALL of the errors are resolved, click the Apply Configuration link and then the "Apply Configuration" button.
If the above does not work, run the following as root on the XI server and post the output.
Code: Select all
ps -ef | grep nagios
Also search the /usr/local/nagios/var/nagios.log file for the error and post it here.
Thanks.
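As a command-line counterpart to the "Verify" step in the Core Config Manager, Nagios Core can check its own configuration with the -v option. The sketch below wraps that call in a small function. The function name verify_nagios_config is made up for illustration, and the default paths are the usual source-install locations, which is an assumption about this server.

```shell
#!/bin/sh
# Hypothetical helper: run the Nagios Core pre-flight check on the config.
# Default paths are assumed; pass your own as arguments if they differ.
verify_nagios_config() {
    nagios_bin=${1:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${2:-/usr/local/nagios/etc/nagios.cfg}

    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found or not executable: $nagios_bin" >&2
        return 2
    fi

    # -v tells Nagios to verify the configuration and exit.
    "$nagios_bin" -v "$nagios_cfg"
}
```

Calling verify_nagios_config with no arguments uses the assumed default paths. A non-zero exit status means the configuration did not verify (or the binary was not found).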
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: ssh check works by command line but not in nagios gui
I do everything by host groups: I add a server and pick the host group, and every host in the group gets the same check.
I've attached the config and the ps output.
I did the delete and write configs several times; it didn't fix it.
From nagios.log:
[1588284544] SERVICE ALERT: prdstldb11.gt.net;SSH Check Oracle NFS disk space /nfs/ora_log;CRITICAL;SOFT;4;DISK CRITICAL - /nfs/ora_log is not accessible: No such file or directory
[1588284604] SERVICE ALERT: prdstldb11.gs.net;SSH Check Oracle NFS disk space /nfs/ora_log;CRITICAL;HARD;5;DISK CRITICAL - /nfs/ora_log is not accessible: No such file or directory
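The bracketed numbers in log lines like these are Unix epoch seconds, which makes it hard to match an alert against the time a manual test was run. A minimal sketch for making them readable follows. It assumes GNU date (for the -d @EPOCH syntax), and the function name humanize_nagios_log is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace the leading [epoch] of each nagios.log line
# with a UTC timestamp. Assumes GNU date. Other lines pass through unchanged.
humanize_nagios_log() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                epoch=${line#\[}        # drop the leading "["
                epoch=${epoch%%\]*}     # keep only the digits before "]"
                rest=${line#*\] }       # everything after "] "
                stamp=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
                printf '%s %s\n' "$stamp" "$rest"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

One possible use is `grep 'SERVICE ALERT' /usr/local/nagios/var/nagios.log | humanize_nagios_log`, which prints the alerts with UTC timestamps in front.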
Re: ssh check works by command line but not in nagios gui
Thanks for the data.
The screen capture has the host name redacted, and the log entries show two different host names.
Can you log in to those servers and verify that the NFS mount is there? Or are the names a typo?
Is the error constant or intermittent?
Try this: remove the host from the host group, apply the config, and wait a few minutes. Then put the host back in the group and apply the config. Does that fix the issue?
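To see which host names are actually raising alerts for a given service, the SERVICE ALERT lines in nagios.log can be split on semicolons. The sketch below does that. The function name alert_hosts is made up for illustration, and the field layout (host;service;state;type;attempt;output) is taken from the log lines posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct host names that raised a
# SERVICE ALERT for one service. Reads nagios.log lines on stdin.
alert_hosts() {
    # $1 = exact service description to match
    awk -F';' -v svc="$1" '
        index($0, "SERVICE ALERT: ") && $2 == svc {
            host = $1
            sub(/^.*SERVICE ALERT: /, "", host)   # strip "[epoch] SERVICE ALERT: "
            print host
        }
    ' | sort -u
}
```

For example, `alert_hosts 'SSH Check Oracle NFS disk space /nfs/ora_log' < /usr/local/nagios/var/nagios.log` prints one host name per line. More than one name in the output would suggest the check is attached to more than one host object.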
Re: ssh check works by command line but not in nagios gui
Yes,
I did all that.
I went to the DB host, became the nagios user, went to the NFS mount point, and ran df -h . with no problem.
I have to take it out of the host every time my test fails.
It is very strange, because in the XI GUI, where I have the check set up, if I run the command it works fine,
but it fails when Nagios runs the check from the XI GUI that has the host
(the pictures above),
and it's only that one host.
Today I cloned its 3 sisters from it, and their Oracle log size checks work fine. It's only that one machine, and it's only in the external GUI, not in the Core Config Manager.
Re: ssh check works by command line but not in nagios gui
What version of XI are you running?
If you go to the Service Details for that check and open the Advanced tab, does it show that the check is running and updating every 5 minutes?
Let's rule out any corruption in the MySQL database causing the issue.
Run these commands to stop the processes, clean and repair the SQL database, and restart the processes. Run them all as root.
Code: Select all
service npcd stop
service nagios stop
service ndo2db stop
service crond stop
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
service mysqld restart
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagios/var/ndo2db.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
pkill python
service httpd restart
service ndo2db start
service nagios start
service npcd start
service crond start
Open up the Nagios status.dat file on the system and search for that check, is the command defined correctly?
The file would be here
Code: Select all
/usr/local/nagios/var/status.dat
or here if you have a RAM Disk.
Code: Select all
/var/nagiosramdisk/
Can you post the full section here?
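Searching status.dat by hand can be tedious on a large install. The sketch below prints the whole servicestatus block for one host and service, including its check_command line, so it can be pasted into a reply. The function name print_service_block is made up for illustration, and it assumes the usual status.dat layout of "servicestatus {" followed by indented key=value lines and a closing brace.

```shell
#!/bin/sh
# Hypothetical helper: print the servicestatus block for one host/service.
# $1 = path to status.dat, $2 = host_name, $3 = service_description
print_service_block() {
    awk -v host="$2" -v svc="$3" '
        # A new block starts: reset the buffer and the match flags.
        index($0, "servicestatus {") == 1 { inblock = 1; buf = ""; h = 0; s = 0 }
        inblock {
            buf = buf $0 "\n"
            line = $0
            sub(/^[ \t]+/, "", line)            # ignore leading indentation
            if (line == "host_name=" host) h = 1
            if (line == "service_description=" svc) s = 1
            if (line == "}") {                  # end of block
                if (h && s) printf "%s", buf
                inblock = 0
            }
        }
    ' "$1"
}
```

A call might look like `print_service_block /usr/local/nagios/var/status.dat prdstldb11.xxx.net 'SSH Check Oracle NFS disk space /nfs/ora_log'`, with the host name and service description replaced by the exact values used in the config.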
Re: ssh check works by command line but not in nagios gui
this failed
service mysqld restart
[root@prdnagxi03 ~]# service mysqld restart
Redirecting to /bin/systemctl restart mysqld.service
Failed to restart mysqld.service: Unit not found.
Re: ssh check works by command line but not in nagios gui
What OS and release is the XI server running on?
What version of XI is the server running?
Try running
Code: Select all
service mariadb restart
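Whether the database unit is called mysqld or mariadb depends on the distribution and release, so it can help to check which one is installed before restarting it. The sketch below picks the first matching name from a list of systemd unit files. The function name pick_db_unit and the candidate list are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "systemctl list-unit-files" output on stdin and
# print the first database service name that is present.
pick_db_unit() {
    units=$(cat)
    for name in mariadb mysqld mysql; do
        if printf '%s\n' "$units" | grep -q "^${name}\.service"; then
            echo "$name"
            return 0
        fi
    done
    echo "no known database unit found" >&2
    return 1
}
```

One way to use it is `systemctl list-unit-files --type=service --no-legend | pick_db_unit`, and then pass the printed name to `service ... restart`.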
Re: ssh check works by command line but not in nagios gui
[root@lvsprdnagxi03 ~]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
Your Nagios XI installation is up to date.
Latest Available Version: 5.6.14
Installed Version: 5.6.14
Last Update Check: 2020-05-01 08:08:02
I also rebooted the machine
Re: ssh check works by command line but not in nagios gui
the check from status.dat