reload appears to cause skip of remaining attempts

mckslim
Posts: 8
Joined: Mon Jun 17, 2013 11:17 am

Post by mckslim »

Running 3.4.1:
I see a strange anomaly: a host check is partway through its retries and has not yet reached max_attempts, but after a server reload the next check result is forced straight to DOWN;HARD;1, as seen here:

[2013-06-04 08:40:21] HOST ALERT: 5gt4;DOWN;SOFT;1;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
[2013-06-04 08:47:18] HOST ALERT: 5gt4;DOWN;SOFT;2;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
[2013-06-04 08:54:03] HOST ALERT: 5gt4;DOWN;SOFT;3;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
(reload happens here at 09:00)
[2013-06-04 09:00:52] HOST ALERT: 5gt4;DOWN;HARD;1;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.

Why is it skipping the rest of the attempts and going straight to DOWN;HARD after the reload?
Seems like a bug to me.
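
To check whether other hosts show the same pattern, something like the following could scan a log for it. This is only a rough sketch: the function name find_early_hard is made up for illustration, and it assumes HOST ALERT lines use the semicolon-separated layout shown above (host;state;state type;attempt;output). It flags any non-UP host alert that is already HARD at an attempt number below the max_attempts value you pass in.

```shell
#!/bin/sh
# Hypothetical helper: flag HOST ALERT lines where a host is reported HARD
# at an attempt number lower than the expected maximum.
# Usage: find_early_hard <max_attempts> < logfile
find_early_hard() {
    max="$1"
    awk -F';' -v max="$max" '
        /HOST ALERT:/ {
            # $1 = timestamp, "HOST ALERT:" and host name
            # $2 = host state, $3 = SOFT or HARD, $4 = attempt number
            if ($2 != "UP" && $3 == "HARD" && $4 + 0 < max + 0)
                print "early HARD: " $1 ";" $2 ";" $3 ";" $4
        }'
}

# Example (the log path is an assumption, adjust to your install):
# find_early_hard 4 < /opt/nagios/var/nagios.log
```

Run against the four lines above with a maximum of 4, only the 09:00:52 entry is reported. Hosts configured with a lower maximum than the value passed in would show up as false positives, so this is a first-pass filter, not proof of the bug.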
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: reload appears to cause skip of remaining attempts

Post by abrist »

Do you have "initial_state" set on the object?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.

Re: reload appears to cause skip of remaining attempts

Post by mckslim »

'initial_state' is not set to anything
Remember that this is happening on a reload (I haven't examined what happens on a full restart).
thanks

Re: reload appears to cause skip of remaining attempts

Post by abrist »

Was this host configured for downtime? I ask because there were a number of bugs related to flexible downtime and hard states.

Re: reload appears to cause skip of remaining attempts

Post by mckslim »

No scheduled downtime was in effect while this problem occurred.

Re: reload appears to cause skip of remaining attempts

Post by abrist »

How many check attempts are set on this check?
Do you have more than 1 nagios parent process running?

ps -aef | grep nagios.cfg

Re: reload appears to cause skip of remaining attempts

Post by mckslim »

max_attempts is 4

output:
$ ps -aef | grep nagios.cfg
nagios 5726 1 0 Jun14 ? 00:41:14 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25239 5726 0 22:58 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25330 5726 0 22:58 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25436 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25444 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25446 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25448 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25465 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25492 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25533 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25544 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25563 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 25615 5726 0 22:59 ? 00:00:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg

Re: reload appears to cause skip of remaining attempts

Post by abrist »

The process list looks fine, as they are all children of the same parent process. Do you only experience this issue when restarting Nagios?
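
For anyone repeating this check, the "one parent" test can be done mechanically instead of by eye. The snippet below is a sketch only: nagios_parent_pids is a made-up name, and it assumes the `ps -ef` column order seen in the output above (UID, PID, PPID, ...) and that the daemonized parent is owned by PID 1. It prints the PID of every such top-level nagios process; more than one line of output would point to duplicate daemons.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style output on stdin and print the PIDs
# of nagios daemons whose parent is PID 1 (i.e. top-level parent processes).
nagios_parent_pids() {
    # $2 = PID, $3 = PPID in `ps -ef` output
    awk '$3 == 1 && /nagios\.cfg/ { print $2 }'
}

# Example:
# ps -ef | nagios_parent_pids
```

For the listing posted above this prints only 5726, which matches the conclusion that a single parent is running. The grep process from the original command is skipped automatically because its parent is the shell, not PID 1.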

Re: reload appears to cause skip of remaining attempts

Post by mckslim »

I only find this happening when Nagios is reloaded.

Re: reload appears to cause skip of remaining attempts

Post by abrist »

I would suggest opening a bug at tracker.nagios.org, but you should probably update to the newest version first and see whether the problem persists before you file it.