Hello,
We have an issue with Nagios XI: it is not using the configured retry interval as expected.
For example, for the host check on nwsrma1, we have these values configured:
Check interval 4 min
Retry interval 1 min
Max check attempts 3 attempts
But when we check the Nagios logs in /var/log/messages, we see:
Jul 3 16:35:35 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;1;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:05 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;2;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;HARD;3;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;DOWN;notify-host-by-syslog;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;DOWN;notify-host-by-email;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:26 possrv1 nagios: HOST ALERT: nwsrma1;UP;HARD;3;OK - 10.10.10.190: rta 3.030ms, lost 0%
Jul 3 16:36:26 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;UP;notify-host-by-syslog;OK - 10.10.10.190: rta 3.030ms, lost 0%
Jul 3 16:36:26 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;UP;notify-host-by-email;OK - 10.10.10.190: rta 3.030ms, lost 0%
As you can see, the first check is at 16:35:35, the retry check is at 16:36:05 (30 seconds later), and the last retry check before nwsrma1 is marked DOWN in HARD STATE is at 16:36:20 (15 seconds later).
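In case it helps anyone reproduce the arithmetic, here is a minimal Python sketch that pulls the timestamps out of the three HOST ALERT lines above and prints the gap between consecutive attempts. The helper names (`parse_alert`, `gaps_in_seconds`) are made up for illustration, the message text is shortened, and the year is an arbitrary placeholder because syslog does not record one.

```python
from datetime import datetime

# HOST ALERT lines copied from the log excerpt above (plugin output shortened).
LOG_LINES = [
    "Jul 3 16:35:35 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;1;CRITICAL",
    "Jul 3 16:36:05 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;2;CRITICAL",
    "Jul 3 16:36:20 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;HARD;3;CRITICAL",
]


def parse_alert(line):
    """Return (timestamp, state_type, attempt) for one HOST ALERT line."""
    header, _, payload = line.partition(" nagios: HOST ALERT: ")
    month, day, clock = header.split()[:3]
    # syslog omits the year; 2000 is only a placeholder so strptime has one.
    ts = datetime.strptime(f"2000 {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    fields = payload.split(";")  # host;state;state_type;attempt;output
    return ts, fields[2], int(fields[3])


def gaps_in_seconds(lines):
    """Seconds elapsed between each alert and the one before it."""
    alerts = [parse_alert(line) for line in lines]
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(alerts, alerts[1:])
    ]


if __name__ == "__main__":
    for gap, line in zip(gaps_in_seconds(LOG_LINES), LOG_LINES[1:]):
        _, state_type, attempt = parse_alert(line)
        print(f"attempt {attempt} ({state_type}): {gap} s after the previous one")
```

With the lines above this reports gaps of 30 and 15 seconds, both well under the configured 1 min retry interval.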
Can you explain what is happening, or tell us which files or logs you need to review this issue? I have attached profile.zip from our system.
Thanks in advance.
Regards.
Linux distribution: CentOS release 6.5, 64-bit
Manually installed Nagios XI, version 5.4.5
Nagios is not using configured retry interval times
-
neosecurearg
- Posts: 8
- Joined: Wed Mar 09, 2016 8:47 am
Re: Nagios is not using configured retry interval times
Host checks are run as needed unless they are forced to run on a schedule. That sometimes means your timing is not what you think it is: if a service check fails, Nagios checks the host to see whether the host is down (which would explain why the service failed). See https://assets.nagios.com/downloads/nag ... hecks.html for more details, but I think you're seeing normal behavior.
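To make the difference concrete, here is a rough Python sketch of what the retry times would be if the three attempts were purely scheduled, compared with the times from the original post. It assumes `interval_length` is at its default of 60 seconds, so a retry interval of 1 means 60 s. The function names are made up for illustration and the year is a placeholder. This is not how Nagios schedules checks internally, only the arithmetic behind the expectation.

```python
from datetime import datetime, timedelta


def scheduled_retries(first_failure, retry_interval, max_check_attempts,
                      interval_length=60):
    """Attempt times if every retry waited the full retry interval.

    retry_interval is in Nagios time units; interval_length is the number of
    seconds per unit (assumed here to be the default of 60).
    """
    step = timedelta(seconds=retry_interval * interval_length)
    return [first_failure + i * step for i in range(max_check_attempts)]


def seconds_early(expected, observed):
    """How many seconds before the scheduled time each attempt was logged."""
    return [int((e - o).total_seconds()) for e, o in zip(expected, observed)]


# Times of attempts 1-3 taken from the log in the original post.
# The year is a placeholder; syslog does not record it.
observed = [
    datetime(2000, 7, 3, 16, 35, 35),
    datetime(2000, 7, 3, 16, 36, 5),
    datetime(2000, 7, 3, 16, 36, 20),
]

# Retry interval 1 min, max check attempts 3, as configured for nwsrma1.
expected = scheduled_retries(observed[0], retry_interval=1, max_check_attempts=3)

if __name__ == "__main__":
    for n, (e, o, early) in enumerate(
            zip(expected, observed, seconds_early(expected, observed)), start=1):
        print(f"attempt {n}: scheduled {e:%H:%M:%S}, logged {o:%H:%M:%S}, "
              f"{early} s early")
```

Attempts 2 and 3 come out 30 s and 75 s ahead of a purely scheduled sequence, which fits the idea that something other than the retry timer triggered them.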
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic! • Join our public Nagios Discord Server!
Re: Nagios is not using configured retry interval times
Thanks for the assist, @eloyd!
Former Nagios employee