Nagios is not using configured retry interval times

Posted: Mon Jul 03, 2017 3:41 pm
by neosecurearg
Hello,

We have an issue with Nagios XI, which is not using configured retry interval times as expected.

For example, for the host nwsrma1, we have these values configured:
Check interval: 4 min
Retry interval: 1 min
Max check attempts: 3

But when we check the Nagios logs in /var/log/messages, we see:
Jul 3 16:35:35 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;1;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:05 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;2;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;HARD;3;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;DOWN;notify-host-by-syslog;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:20 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;DOWN;notify-host-by-email;CRITICAL - 10.10.10.190: rta nan, lost 100%
Jul 3 16:36:26 possrv1 nagios: HOST ALERT: nwsrma1;UP;HARD;3;OK - 10.10.10.190: rta 3.030ms, lost 0%
Jul 3 16:36:26 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;UP;notify-host-by-syslog;OK - 10.10.10.190: rta 3.030ms, lost 0%
Jul 3 16:36:26 possrv1 nagios: HOST NOTIFICATION: Alejandro Guida;nwsrma1;UP;notify-host-by-email;OK - 10.10.10.190: rta 3.030ms, lost 0%

As you can see, the first check is at 16:35:35, the retry check is at 16:36:05 (30 seconds later), and the last retry check before nwsrma1 is marked DOWN in HARD STATE is at 16:36:20 (15 seconds later).
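To make the arithmetic explicit, here is a minimal Python sketch that computes the gaps between the HOST ALERT lines above and compares them with the configured 1-minute retry interval. The function names are only illustrative, and the year 2017 is an assumption taken from the post date, since syslog timestamps do not include a year.

```python
from datetime import datetime

# HOST ALERT lines copied from the /var/log/messages excerpt above.
LOG_LINES = [
    "Jul 3 16:35:35 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;1;CRITICAL - 10.10.10.190: rta nan, lost 100%",
    "Jul 3 16:36:05 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;SOFT;2;CRITICAL - 10.10.10.190: rta nan, lost 100%",
    "Jul 3 16:36:20 possrv1 nagios: HOST ALERT: nwsrma1;DOWN;HARD;3;CRITICAL - 10.10.10.190: rta nan, lost 100%",
    "Jul 3 16:36:26 possrv1 nagios: HOST ALERT: nwsrma1;UP;HARD;3;OK - 10.10.10.190: rta 3.030ms, lost 0%",
]

# Configured retry interval (1 min), expressed in seconds.
RETRY_INTERVAL_SECONDS = 60


def parse_timestamp(line, year=2017):
    """Parse the leading syslog timestamp; the year is an assumption."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def alert_gaps(lines):
    """Return the seconds elapsed between consecutive HOST ALERT lines."""
    times = [parse_timestamp(line) for line in lines if "HOST ALERT:" in line]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


gaps = alert_gaps(LOG_LINES)
for gap in gaps:
    verdict = "shorter than" if gap < RETRY_INTERVAL_SECONDS else "at least"
    print(f"{gap:>3}s between alerts ({verdict} the {RETRY_INTERVAL_SECONDS}s retry interval)")
```

Every gap is well under the 60 seconds we expected between retries.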

Can you explain what is happening, or tell us which files/logs you need to review this issue? Attached is the profile.zip of our system.

Thanks in advance.

Regards.

Linux Distribution: CentOS release 6.5, 64 bits
Manually installed Nagios XI, version 5.4.5

Re: Nagios is not using configured retry interval times

Posted: Wed Jul 05, 2017 9:17 am
by eloyd
Host checks are done on demand, as needed, unless they are forced to run on a schedule. That sometimes means the timing is not what you expect: if a service fails, Nagios checks the host to see whether the host itself is down (which would explain why the service failed). See https://assets.nagios.com/downloads/nag ... hecks.html for more details, but I think you're seeing normal behavior.

Re: Nagios is not using configured retry interval times

Posted: Wed Jul 05, 2017 10:30 am
by tmcdonald
Thanks for the assist, @eloyd!