Re: ssh check works by command line but not in nagios gui

Posted: Fri May 01, 2020 12:38 pm
by tgriep
It looks like the data is not updating, and it could be that the retention.dat file needs to be reset for that service.

Stop the nagios process.
Edit the /usr/local/nagios/var/retention.dat file and remove that service section from the file.
Start the nagios process.

Go into the GUI and wait for the check to run. Does it show the correct status?
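
For anyone who would rather not hand-edit the file, here is a minimal sketch of the "remove that service section" step. It assumes each entry in retention.dat is a block that starts with a line `service {`, ends with a line `}`, and contains `host_name=` and `service_description=` fields. The function name `strip_service_block` and the placeholder host and service names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print a retention.dat-style file without the "service { ... }"
# block that matches one host/service pair. Nothing is modified in place.
strip_service_block() {
    # $1 = host name, $2 = service description, $3 = path to the file
    awk -v host="$1" -v svc="$2" '
        # Strip leading/trailing whitespace so indented fields still match.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            line = trim($0)
            if (!inblock && line == "service {") {
                # Start buffering a service block until its closing brace.
                inblock = 1; buf = $0; h = 0; s = 0
                next
            }
            if (inblock) {
                buf = buf "\n" $0
                if (line == "host_name=" host) h = 1
                if (line == "service_description=" svc) s = 1
                if (line == "}") {
                    # Emit the block unless both fields matched.
                    if (!(h && s)) print buf
                    inblock = 0
                }
                next
            }
            print
        }
        # Do not lose a block that is missing its closing brace.
        END { if (inblock) print buf }
    ' "$3"
}

# Possible usage, with the nagios process stopped (placeholder names):
#   cp -p /usr/local/nagios/var/retention.dat /tmp/retention.dat.bak
#   strip_service_block myhost "My Service" /tmp/retention.dat.bak \
#       > /usr/local/nagios/var/retention.dat
```

Redirecting into the existing file, instead of moving a new file over it, leaves the file's owner and permissions as they were.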

Re: ssh check works by command line but not in nagios gui

Posted: Tue May 05, 2020 6:05 am
by jenstar13
I apologize, I did not see that you had posted a reply.
I tried that and it failed,
but when I looked again in retention.dat, the check was not in there.

Re: ssh check works by command line but not in nagios gui

Posted: Tue May 05, 2020 9:33 am
by tgriep
The service not showing up in the retention.dat file is an issue as it should be there.
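
A quick way to confirm whether the service is present could look like the sketch below. It assumes the same block layout as above (`service {` ... `}` with `host_name=` and `service_description=` fields); `count_service_blocks` is a hypothetical helper name, not a Nagios tool.

```shell
#!/bin/sh
# Sketch only: count the "service { ... }" blocks in a retention.dat-style
# file that match a given host name and service description.
count_service_blocks() {
    # $1 = host name, $2 = service description, $3 = path to the file
    awk -v host="$1" -v svc="$2" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            line = trim($0)
            if (line == "service {") { inblock = 1; h = 0; s = 0; next }
            if (inblock && line == "host_name=" host) h = 1
            if (inblock && line == "service_description=" svc) s = 1
            if (inblock && line == "}") { if (h && s) n++; inblock = 0 }
        }
        # n + 0 forces a numeric 0 when nothing matched.
        END { print n + 0 }
    ' "$3"
}

# Possible usage (placeholder names):
#   count_service_blocks myhost "My Service" /usr/local/nagios/var/retention.dat
```

A result of 0 would match what was reported, that the check is not in the file.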

Is there a chance you can download a System Profile and PM it to me?
To get your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and upload it to the forum post or PM it to me.

Re: ssh check works by command line but not in nagios gui

Posted: Tue May 05, 2020 3:25 pm
by tgriep
I received the profile and shared it with the Techs.

The settings for that check look correct, so I want you to run the following as root to truncate the nagios_objects table.

Code:

systemctl stop nagios
systemctl stop ndo2db
echo 'truncate nagios_objects;' |mysql -t -u root -pnagiosxi nagios
systemctl start ndo2db
systemctl start nagios

Let the system run for 15 minutes and if the GUI still shows the incorrect status, get the following files from the XI server and PM them to me.

Code:

/usr/local/nagios/var/nagios.log
/usr/local/nagios/var/retention.dat
/usr/local/nagios/var/status.dat

Log in to the remote server as root, run the following commands and PM me the output.

Code:

df -h
mount
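
If it helps to send the three files from the XI server as one attachment, a small sketch like this could bundle them. `bundle_files` is a hypothetical helper, and the output path in the usage line is only an example.

```shell
#!/bin/sh
# Sketch only: pack the readable files from a list into one gzip-compressed
# tar archive, and report any file that is skipped.
bundle_files() {
    # $1 = output archive path, remaining arguments = files to include
    out=$1
    shift
    # Rebuild the argument list so it holds only readable files.
    for f in "$@"; do
        shift
        if [ -r "$f" ]; then
            set -- "$@" "$f"
        else
            echo "skipping unreadable file: $f" >&2
        fi
    done
    [ "$#" -gt 0 ] || return 1
    tar -czf "$out" "$@"
}

# Possible usage on the XI server, as root:
#   bundle_files /tmp/nagios-files.tar.gz \
#       /usr/local/nagios/var/nagios.log \
#       /usr/local/nagios/var/retention.dat \
#       /usr/local/nagios/var/status.dat
```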

Re: ssh check works by command line but not in nagios gui

Posted: Wed May 06, 2020 12:59 pm
by tgriep
Log in to the remote server, change to the nagios user, and then run these commands.

Code:

/usr/lib/nagios/plugins/check_disk -V
/usr/lib/nagios/plugins/check_disk -u GB -w 10% -c 5% -p /nfs/ora_log
ls -l /nfs
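
When comparing a command-line run with what the GUI shows, keep in mind that Nagios takes the state from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and not from the text it prints. The sketch below is one way to show both side by side; `plugin_state` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Sketch only: run a plugin command, then print its exit code, the state
# name that exit code maps to, and the text the plugin printed.
plugin_state() {
    # Capture stdout and stderr; keep the exit code even if "set -e" is on.
    output=$("$@" 2>&1) && rc=0 || rc=$?
    case $rc in
        0) state=OK ;;
        1) state=WARNING ;;
        2) state=CRITICAL ;;
        3) state=UNKNOWN ;;
        *) state="(outside the 0-3 range)" ;;
    esac
    printf 'exit=%s state=%s\n' "$rc" "$state"
    printf 'output=%s\n' "$output"
}

# Possible usage as the nagios user on the remote server:
#   plugin_state /usr/lib/nagios/plugins/check_disk -u GB -w 10% -c 5% -p /nfs/ora_log
```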

Re: ssh check works by command line but not in nagios gui

Posted: Wed May 06, 2020 1:06 pm
by jenstar13
[root@xxxprdnagxi03 ~]# su - nagios
Last login: Wed May 6 08:03:56 EDT 2020
[nagios@xxxprdnagxi03 ~]$ ssh xxxprdstldb11.gspt.net
Last login: Tue May 5 10:41:46 2020 from xxxprdnagxi01.gspt.net

[nagios@xxxprdstldb11 ~]$ /usr/lib/nagios/plugins/check_disk -V
check_disk v1848 (nagios-plugins 1.4.11)

[nagios@xxxprdstldb11 ~]$ /usr/lib/nagios/plugins/check_disk -u GB -w 10% -c 5% -p /nfs/ora_log
DISK OK - free space: /nfs/ora_log 9 GB (99% inode=99%);| /nfs/ora_log=0GB;9;9;0;10


[nagios@xxxprdstldb11 ~]$ ls -l /nfs
total 82
drwxr-xr-x 34 oracle dba 995 Nov 14 12:21 ora_exports
drwxr-xr-x 2 root root 4096 Aug 9 2016 ora_exports_new
drwxrwxr-x 16 oracle dba 577 May 20 2019 ora_install
drwxr-xr-x 2 root root 4096 Aug 9 2016 ora_install_new
drwxr-xr-x 3 oracle dba 25 Oct 17 2019 ora_log
drwxr-xr-x 2 root root 4096 Aug 9 2016 ora_log_new

Re: ssh check works by command line but not in nagios gui

Posted: Wed May 06, 2020 4:02 pm
by tgriep
The last thing to try is to delete the retention.dat file so it can be fully rebuilt.

To remove the file, run the following as root

Code:

service nagios stop
rm /usr/local/nagios/var/retention.dat
service nagios start

Warning: doing this will remove all of the Notes and manual Downtime schedules, and it will cause the system to retest all of the hosts and services.
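
Because the file's contents are lost once it is removed, it may be worth keeping a copy first. This is only a sketch of that idea and not part of the official steps; `backup_retention` and the timestamped `.bak` name are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: copy a file to a timestamped backup next to it and print the
# backup path, so the original can be put back if needed.
backup_retention() {
    # $1 = path to retention.dat
    src=$1
    dest="$src.$(date +%Y%m%d%H%M%S).bak"
    # -p keeps the mode and timestamps of the original file.
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Possible usage as root, after stopping nagios and before the rm:
#   backup_retention /usr/local/nagios/var/retention.dat
```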

Re: ssh check works by command line but not in nagios gui

Posted: Thu May 07, 2020 5:19 am
by jenstar13
That failed too.
I guess it's time we give up on this check.

Thank you for trying.

Re: ssh check works by command line but not in nagios gui

Posted: Thu May 07, 2020 3:17 pm
by tgriep
One more thing: can you PM me screen captures of the Service in the Service Status menu?
Show the full menu and do not cut off the name of the Host and Service Description.
If possible, get captures of the Overview tab, the Advanced tab, and the Configure tab showing the Command.