Host's current attempt goes to 1 when in hard state

niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Post by niebais »

We've found that right after a host enters a hard state, its current_attempt changes to 1 on the next check. Can you confirm whether this is a bug? Is there a setting that triggers this behavior? It doesn't happen for services. We see it in both Nagios 3 and Nagios 4.

The example below shows the host's attempts increasing. Once it hits the hard state, current_attempt goes back to 1 after the next check.

Command used to view the state:

Code:

[brianc@xi1 ~]$ echo -e "GET hosts\nColumns: host_name current_attempt max_check_attempts state state_type hard_state\n" | /usr/local/bin/unixcat 
Output:

Code:

child1;1;5;1;0;0
child1;2;5;1;0;0
child1;3;5;1;0;0
child1;4;5;1;0;0
child1;5;5;1;1;1
child1;1;5;1;1;1
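
A minimal sketch, assuming Python is available, of how the reset could be flagged automatically from output like the above. The names HostRow, parse_rows and find_attempt_resets are hypothetical helpers made up for this illustration; they are not part of Nagios or Livestatus. The column order is the one from the query in this post.

```python
from typing import List, NamedTuple


class HostRow(NamedTuple):
    # Column order matches the "Columns:" header used in the query above.
    host_name: str
    current_attempt: int
    max_check_attempts: int
    state: int        # host state: 0 = UP, 1 = DOWN, 2 = UNREACHABLE
    state_type: int   # 0 = soft, 1 = hard
    hard_state: int


def parse_rows(text: str) -> List[HostRow]:
    """Parse semicolon-separated output lines into HostRow tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, *numbers = line.split(";")
        rows.append(HostRow(name, *map(int, numbers)))
    return rows


def find_attempt_resets(rows: List[HostRow]) -> List[int]:
    """Return the indexes of rows where current_attempt fell back to 1
    while the host stayed in the same hard, non-UP state."""
    resets = []
    for i in range(1, len(rows)):
        prev, cur = rows[i - 1], rows[i]
        if (
            cur.host_name == prev.host_name
            and prev.state_type == 1
            and cur.state_type == 1
            and cur.state == prev.state
            and cur.state != 0
            and prev.current_attempt == prev.max_check_attempts
            and cur.current_attempt == 1
        ):
            resets.append(i)
    return resets


# The six samples posted above, one per check.
SAMPLE = """child1;1;5;1;0;0
child1;2;5;1;0;0
child1;3;5;1;0;0
child1;4;5;1;0;0
child1;5;5;1;1;1
child1;1;5;1;1;1
"""

print(find_attempt_resets(parse_rows(SAMPLE)))
```

With the samples above, only the last row is flagged: the attempt falls from 5/5 to 1/5 while the host is still DOWN and hard.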
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Host's current attempt goes to 1 when in hard state

Post by tmcdonald »

That looks like expected behavior according to this doc:

http://nagios.sourceforge.net/docs/3_0/statetypes.html
Former Nagios employee
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Host's current attempt goes to 1 when in hard state

Post by niebais »

Interesting. That table seems to line up with what I see for hosts but not for services.

For example:

Command:

Code:

[root@xi1 brianc]# echo -e "GET services\nColumns: host_name service_description current_attempt max_check_attempts state state_type\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live | grep "^child2;depend1"
Results:

Code:

child2;depend1;1;2;2;0
child2;depend1;2;2;2;1
child2;depend1;2;2;2;1
child2;depend1;2;2;2;1
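
For anyone who wants to poll this from a script instead of echo and unixcat, here is a rough sketch of the same query in Python. It assumes the Livestatus socket path shown in the command above and the default semicolon-separated output. build_query and query_livestatus are hypothetical helper names made up for this illustration, not part of any Livestatus client library.

```python
import socket
from typing import List


def build_query(table: str, columns: List[str]) -> str:
    """Build a Livestatus GET query selecting the given columns."""
    return "GET {}\nColumns: {}\n".format(table, " ".join(columns))


def query_livestatus(socket_path: str, query: str) -> List[List[str]]:
    """Send one query over the Livestatus Unix socket and return the rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        # Closing the write side tells Livestatus the query is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    text = b"".join(chunks).decode("utf-8")
    return [line.split(";") for line in text.splitlines() if line]


# Example usage (needs a running Nagios with Livestatus loaded):
#
# query = build_query(
#     "services",
#     ["host_name", "service_description", "current_attempt",
#      "max_check_attempts", "state", "state_type"],
# )
# for row in query_livestatus("/usr/local/nagios/var/rw/live", query):
#     if row[0] == "child2" and row[1] == "depend1":
#         print(";".join(row))
```

The filtering on host and service name is done in Python here only to mirror the grep in the command above.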
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Host's current attempt goes to 1 when in hard state

Post by slansing »

Is the above behavior happening when the service's dependent host is in a down state? If so, this may offer insight:

As always, there are exceptions to the rules. When a service check results in a non-OK state, Nagios will check the host that the service is associated with to determine whether or not it is up (see the note below for info on how this is done). If the host is not up (i.e. it is either down or unreachable), Nagios will immediately put the service into a hard non-OK state and it will reset the current attempt number to 1. Since the service is in a hard non-OK state, the service check will be rescheduled at the normal frequency specified by the check_interval option instead of the retry_interval option.
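
To make the quoted exception concrete, here is a small Python model of it. This is a simplified sketch of the documented rule only, not the Nagios Core source. ServiceCheckState and apply_non_ok_result are hypothetical names made up for this illustration, and the branch for a host that is up is an assumed simplification of the normal soft/hard rules.

```python
from dataclasses import dataclass

SOFT, HARD = 0, 1


@dataclass
class ServiceCheckState:
    current_attempt: int       # attempt number of the check that just ran
    max_check_attempts: int
    state_type: int = SOFT


def apply_non_ok_result(svc: ServiceCheckState, host_is_up: bool) -> str:
    """Update svc for a non-OK check result and return the name of the
    interval option used to schedule the next check."""
    if not host_is_up:
        # The exception from the quoted docs: go hard immediately and
        # reset the attempt counter to 1.
        svc.state_type = HARD
        svc.current_attempt = 1
        return "check_interval"
    if svc.current_attempt >= svc.max_check_attempts:
        # Assumed normal rule: retries are used up, so the state is hard.
        svc.state_type = HARD
        return "check_interval"
    # Assumed normal rule: retries remain, so the state stays soft.
    svc.state_type = SOFT
    return "retry_interval"


svc = ServiceCheckState(current_attempt=1, max_check_attempts=3)
print(apply_non_ok_result(svc, host_is_up=False), svc)
```

In this model a service whose host is down ends up hard with an attempt of 1, even though it never used its retries.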


Is this happening across the board? What happens if you submit a passive UP result for one of the hosts whose services show this behavior, and then disable active checking on that host to keep it locked in that state?
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Host's current attempt goes to 1 when in hard state

Post by niebais »

The parent is in an UP state.

I turned off active checks on the host and submitted a passive UP to the host and I get the same behavior on the services.

Do you see this on your side? I see this on multiple instances of Nagios; there isn't a system where I haven't seen this behavior. I prefer the behavior of services, where the current attempt stays at the max attempts once the service goes into a hard state. I want to know the reasoning for the host's current attempt going to 1. It's not consistent with services, which is why we and our customers have noticed it.

Do you know whereabouts in the code this happens? I'm looking for a starting point or hint. I can debug it and try to get some more information.

Thanks for your help!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Host's current attempt goes to 1 when in hard state

Post by scottwilkerson »

I don't have livestatus installed but do see the same in the UI.

Host: host.PNG (attached image)
Service: service.PNG (attached image)
To be honest, I've never noticed this before, and don't know that I would have if you didn't mention it.

I can file a bug report and have the Core developers take a look at it.
Former Nagios employee
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Host's current attempt goes to 1 when in hard state

Post by niebais »

Cool. I originally saw it in the UI/XI, but was using livestatus for the report.

We discovered this by looking at the Hosts Details pages and wondering why the hosts were critical with an attempt of 1/5. We thought something was up until we saw that the state history was correct. I know that the current attempt doesn't always correlate to a hard state (e.g. when a service's host is down, or with dependencies), but this one seemed off.

Thanks for submitting the bug and looking into this.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Host's current attempt goes to 1 when in hard state

Post by slansing »

We'll try to get in contact when it is fixed, or when we have additional information. This thread's address should be in the bug report.