
Difference between reboot en restart of nagios XI ?

Posted: Thu Aug 29, 2013 4:18 am
by DennisPR
I have had a bizarre issue since this morning on a Nagios XI 2012R2.3 server running on CentOS 6.4.
If I reboot the server, about 100 of the 500 hosts turn grey (as if they are pending) and show the "notifications disabled" icon.
Host.PNG
These hosts are not all in one particular host- or service group but are only part of some hostgroups.
If I click Apply Configuration in CCM (without changing anything), Nagios XI reloads and all hosts are back to normal again.
If I reboot again, I have the same issue.
What is the difference between rebooting and restarting or reloading Nagios XI?
What am I missing here?

Re: Difference between reboot en restart of nagios XI ?

Posted: Thu Aug 29, 2013 9:17 am
by slansing
That can be partially caused by database corruption, if you have to re-sync the database to return the hosts to normal. If you are not using a safe reboot, the MySQL server could be stopping unexpectedly, which would cause this. Please take a look at our database repair document:

http://assets.nagios.com/downloads/nagi ... tabase.pdf

It could also be just because you are restarting the entire server, or the nagios processes.

When you reboot, you are restarting the entire server or VM. When you restart the nagios service, you are doing exactly that. If that's what you are asking. Do these hosts ever change state? Are they being placed in downtime?

Re: Difference between reboot en restart of nagios XI ?

Posted: Mon Sep 09, 2013 4:59 am
by DennisPR
I have performed a repair of the database as described in http://assets.nagios.com/downloads/nagi ... tabase.pdf
The repair did not return any errors.

I did a reboot of the VM after the repair and I still have the same issue.
Some of the hosts come up in a pending state and notifications are disabled.

Logging on to the console and performing the following command still solves the issue : service nagios reload

The hosts are not set for downtime.
It also seems that there are no services linked to these hosts anymore after a reboot
1.PNG
After the service nagios reload it looks like this:
2.PNG
Any more advice pls ?

Re: Difference between reboot en restart of nagios XI ?

Posted: Mon Sep 09, 2013 12:16 pm
by abrist
As some retention files are not preserved on a system reboot, it may take a while for the summaries to update as everything gets scheduled.

You may have a problem with ndo not starting correctly on reboot if a service nagios restart fixes the issues.

Re: Difference between reboot en restart of nagios XI ?

Posted: Wed Sep 11, 2013 3:36 am
by DennisPR
Abrist can you tell me what I need to check after a reboot pls ?

Re: Difference between reboot en restart of nagios XI ?

Posted: Wed Sep 11, 2013 9:38 am
by abrist
After the VM is rebooted, how long have you waited for XI to schedule and start checking?
After a reboot, if things do not start working, check the status of:

Code:

service nagios status
service ndo2db status
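
The two status checks above can be wrapped in a small helper. This is only a sketch: `check_services` is a hypothetical function name (not part of Nagios XI), and it assumes the init scripts follow the LSB convention that `service <name> status` exits with 0 when the daemon is running.

```shell
#!/bin/sh
# Sketch: report whether each named service answers as running.
# Assumption: `service <name> status` returns exit code 0 when the
# daemon is running (LSB init script convention).
check_services() {
    failed=0
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=1
        fi
    done
    # Non-zero return if at least one service was not running.
    return "$failed"
}

# Example call (as root, after the reboot):
# check_services nagios ndo2db
```

A non-zero return only says that a status check failed; a daemon that is running but not writing to the database, as suspected for ndo2db in this thread, would still show as running.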

Re: Difference between reboot en restart of nagios XI ?

Posted: Thu Sep 26, 2013 10:30 am
by DennisPR
Hi, I've retested after a reboot and waited 15 minutes.

Code:

[root@myhost ~]# uptime
 17:18:07 up 15 min,  1 user,  load average: 0.34, 0.66, 0.55
[root@myhost ~]# service nagios status
nagios (pid 2871) is running...
[root@myhost ~]# service ndo2db status
ndo2db (pid 2890) is running...
[root@ap-dco67-mon ~]#

Here are some more screenshots:
Idle1.PNG
Idle2.PNG
Idle3.PNG

Re: Difference between reboot en restart of nagios XI ?

Posted: Thu Sep 26, 2013 10:31 am
by DennisPR
If I click on "See this host in Nagios core", it looks like this:
Idle4.PNG

Re: Difference between reboot en restart of nagios XI ?

Posted: Thu Sep 26, 2013 10:35 am
by abrist
If you restart ndo2db, does the XI frontend start reporting host status and host check information again?

Code:

service ndo2db restart

Re: Difference between reboot en restart of nagios XI ?

Posted: Thu Sep 26, 2013 10:35 am
by sreinhardt
Are you using an offloaded MySQL database? Are your Nagios configs or perf data on a mounted network or SAN share? Do you have any other performance changes in place?

Code:

ll /usr/local/nagios/var/ | grep cache
grep -i 'cache' /usr/local/nagios/etc/nagios.cfg