Service Inconsistent state

amprantino
Posts: 140
Joined: Thu Apr 18, 2013 8:25 am
Location: libexec

Post by amprantino »

Hello,

These are the states appearing for the host.

Service definition:

define service{
        use                             generic-service
        host_name                       SQL-Server
        service_description             Disk-D-RM
        servicegroups                  XXXXXXXX
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              3
        normal_check_interval           30
        retry_check_interval            1
        flap_detection_enabled          0
        contact_groups                  sys-admins,db-admins
        notification_interval           240
        notification_period             24x7
        notification_options            c,r
        check_command                   check_snmp_storage!XXXXXXXXXXX!160!150!"D:"!-T bl -G
        }

Attachments: Snap3.png, Snap4.png

On the Services page, the service is shown as flapping.
On the service's own detail page, it is not shown as flapping.

How is this possible?
Any ideas why notification wasn't sent?
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Post by dwhitfield »

I'm not sure what you mean about the notification. You mean you didn't get a critical notification?

As for flapping, you have flapping detection turned off. You have a couple of options.

1) Turn flapping detection on, and force enough checks for it to stop flapping. Once it's not flapping, turn flapping detection off.
2) Stop Nagios, delete your retention.dat, and restart Nagios. You lose a lot of information this way, so it doesn't seem like the best option to me.

Please let us know if those do not work for you.

Post by amprantino »

Yes, I never received a critical notification.

        flap_detection_enabled          0
flap_detection_enabled *: This directive is used to determine whether or not flap detection is enabled for this host. More information on flap detection can be found here. Values: 0 = disable host flap detection, 1 = enable host flap detection.

1) Although "flap_detection_enabled = 0" the service is detected as flapping!!! Why ? It should never enter this state!

2) If service is flapping, and "flap_detection_enabled = 0" is configured afterwards, the service isn't allowed to exit the flapping state?


Obviously retention.dat cannot be deleted

Post by dwhitfield »

amprantino wrote:Although "flap_detection_enabled = 0" the service is detected as flapping!!! Why ? It should never enter this state!
If it is flapping before detection is disabled, it will keep that state.
amprantino wrote: 2) If service is flapping, and "flap_detection_enabled = 0" is configured afterwards, the service isn't allowed to exit the flapping state?
That's correct. Once disabled, it is unable to detect exiting the flapping state.


Even if flap_detection_enabled = 0, if it is in a flapping state before the change, notifications may still be suppressed.

Also, what's the output of ps -aef | grep nagios.cfg?

Please post your objects.cache, status.dat, and retention.dat or PM them if there are security concerns. If you PM, please make sure you update the thread so it comes back up on the support dashboard.
Last edited by dwhitfield on Fri Jun 23, 2017 4:04 pm, edited 1 time in total.
Reason: clarification about what happens in a flapping state, as well as asking for additional info

Post by amprantino »

I have disabled notification during flapping.

Thank you

Post by dwhitfield »

I edited my last post after discussing the issue with another tech. You may not have seen my edits.

What's the output of ps -aef | grep nagios.cfg?

Please post your objects.cache, status.dat, and retention.dat or PM them if there are security concerns. If you PM, please make sure you update the thread so it comes back up on the support dashboard.

Additionally, what is the current status of the situation? A couple of us were not sure if your last post was saying the issue was resolved or not.

Post by amprantino »

An explanation could be that the service was flapping before someone disabled flap detection.
So the state was flapping; I didn't get a notification because I have disabled service notification during flapping.

I will send you the .dat files tomorrow in a PM.
The current state is critical + flapping. (I will send you a status update when I send the .dat files.)

Post by dwhitfield »

amprantino wrote:An explanation could be that the service was flapping before someone disabled flap detection.
Yes, that was what I meant to suggest in my first post. Apologies for the confusion.

I still think your quickest resolution is to turn flapping detection on, and force enough checks for it to stop flapping. Once it's not flapping, turn flapping detection off (assuming you don't want it).
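
One possible way to script the steps above is through Nagios external commands. The sketch below only builds the command lines; it does not write them anywhere. The helper names are hypothetical, and the host and service names are taken from the service definition earlier in the thread. It assumes external commands are enabled and that flap detection is also enabled globally, and a single forced check is unlikely to be enough, so the forced check would need to be repeated until the service leaves the flapping state.

```python
import time


def flap_reset_commands(host, service, now=None):
    """Build external-command lines that enable flap detection for one
    service and schedule one forced check of it (hypothetical helper)."""
    ts = int(time.time()) if now is None else int(now)
    return [
        f"[{ts}] ENABLE_SVC_FLAP_DETECTION;{host};{service}",
        f"[{ts}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{ts}",
    ]


def flap_disable_command(host, service, now=None):
    """Build the command line that turns flap detection back off once the
    service is no longer flapping (hypothetical helper)."""
    ts = int(time.time()) if now is None else int(now)
    return f"[{ts}] DISABLE_SVC_FLAP_DETECTION;{host};{service}"


if __name__ == "__main__":
    for line in flap_reset_commands("SQL-Server", "Disk-D-RM"):
        print(line)
    print(flap_disable_command("SQL-Server", "Disk-D-RM"))
```

Each printed line would then be appended to the Nagios command file configured as command_file in nagios.cfg; the location of that file depends on the installation.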

Post by amprantino »

Is there a way to find all flapping services that have flap detection off (i.e., all services stuck in the flapping state)?

Post by dwhitfield »

You can just use grep -RE "flap_detection_enabled[[:space:]]+0" in your cfg directory (object definitions separate the directive from its value with whitespace, not "="), and then match that against the flapping states in your status.dat.
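
The matching step can also be done from status.dat alone. The sketch below assumes each servicestatus block in status.dat carries host_name, service_description, is_flapping and flap_detection_enabled as key=value lines; the function name is hypothetical, and the file layout should be checked against your own status.dat before relying on it.

```python
def stuck_flapping_services(status_text):
    """Return (host_name, service_description) pairs for services that are
    marked as flapping while flap detection is disabled for them.

    Assumes status.dat blocks look like 'servicestatus {' ... '}' with one
    key=value pair per line.
    """
    stuck = []
    fields = None  # None while we are outside a servicestatus block
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            fields = {}
        elif line == "}":
            if (fields is not None
                    and fields.get("is_flapping") == "1"
                    and fields.get("flap_detection_enabled") == "0"):
                stuck.append((fields.get("host_name"),
                              fields.get("service_description")))
            fields = None
        elif fields is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return stuck


if __name__ == "__main__":
    import sys
    # Usage: python find_stuck.py /path/to/status.dat
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for host, service in stuck_flapping_services(handle.read()):
            print(f"{host};{service}")
```

Reading status.dat is safe while Nagios is running, since the script never writes to it.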

One thing you could do that was not mentioned: rather than deleting retention.dat, you could just edit it to remove the flapping state. ***Make sure nagios is off when you do this though.***
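
A minimal sketch of that edit, working on the file's text rather than on the file itself. It assumes retention.dat stores each service in a "service {" ... "}" block with host_name, service_description and is_flapping as key=value lines; the function name is hypothetical. It only resets the is_flapping flag and leaves every other retained value untouched, so keep a backup copy of retention.dat and run it only while Nagios is stopped.

```python
def clear_flapping(retention_text, host, service):
    """Return retention.dat text with is_flapping reset to 0 for one service.

    Assumes 'service {' ... '}' blocks with one key=value pair per line.
    Blocks for other services, and all other lines, are copied unchanged.
    """
    out = []
    block = None  # buffered lines of the service block being read, if any
    for raw in retention_text.splitlines():
        stripped = raw.strip()
        if block is None:
            if stripped == "service {":
                block = [raw]
            else:
                out.append(raw)
            continue
        block.append(raw)
        if stripped == "}":
            wanted = {f"host_name={host}", f"service_description={service}"}
            if wanted <= {entry.strip() for entry in block}:
                block = [
                    entry.replace("is_flapping=1", "is_flapping=0")
                    if entry.strip() == "is_flapping=1" else entry
                    for entry in block
                ]
            out.extend(block)
            block = None
    if block:
        out.extend(block)  # unterminated block: copy through untouched
    return "\n".join(out) + "\n"
```

Writing the result to a new file and comparing it with the original (for example with diff) before swapping it in keeps the change easy to review.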