Possible bug with passive checks

Posted: Tue Oct 07, 2014 5:26 am
by liquidcool
Hi,

We run a large number of passive checks across the devices we monitor (mainly switches). I recently noticed that the passive checks on a recently updated switch were no longer working. What had happened was when the switch was about to be upgraded the engineer disabled all checks for the device and services. When he completed the update he enabled all checks for the device and its services. This also turned all the passive checks into active checks. No matter how many times the nagios service was restarted, the checks stayed active. I had to remove the cfg file, restart nagios, put the cfg file back and restart the nagios service once more. Only then did the passive checks come back as they should.

If this is already known then I apologise, but I thought it worth mentioning in case it is a bug.

btw we are running 4.0.8

Thanks

Re: Possible bug with passive checks

Posted: Tue Oct 07, 2014 5:03 pm
by slansing
That is odd, so you tried to change them in the web interface? Or the configuration files? I've not seen that before but I will try to reproduce it tomorrow on core 4.0.8.

Re: Possible bug with passive checks

Posted: Wed Oct 08, 2014 3:27 am
by liquidcool
I could not change them in the web interface (or at least I could not see where to, not to mention there are tonnes of them, so it would have been a little tedious doing each one).

The config files were set as passive checks. This is the template for the passive checks:
define service {
    name                    port-link-state
    is_volatile             1
    check_command           check-host-alive
    max_check_attempts      1
    normal_check_interval   1
    retry_check_interval    1
    active_checks_enabled   0
    passive_checks_enabled  1
    check_period            24x7
    notification_interval   31536000
    notification_period     24x7
    notification_options    w,u,c,r
    notifications_enabled   1
    contact_groups          networkops
    flap_detection_enabled  0
    register                0
}
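
With many templates like this one, a quick script can confirm which of them are configured as passive-only. The following is a minimal sketch, not part of the original post: the function names `parse_define_block` and `is_passive_only` are made up for illustration, and the parser only handles a single simple `define` block with one directive per line.

```python
def parse_define_block(text):
    """Parse one 'define <type> { ... }' block into (object_type, directives).

    Simplified on purpose: one directive per line, ';' starts an inline
    comment, '#' starts a full-line comment.
    """
    object_type = None
    directives = {}
    for raw_line in text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # drop inline comments
        if not line or line.startswith("#"):
            continue
        if line.startswith("define"):
            object_type = line[len("define"):].replace("{", "").strip()
            continue
        if line == "}":
            break
        parts = line.split(None, 1)  # directive name, then its value
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return object_type, directives


def is_passive_only(directives):
    """True when active checks are off and passive checks are on."""
    return (directives.get("active_checks_enabled") == "0"
            and directives.get("passive_checks_enabled") == "1")
```

Note that this only reads the configuration files. It says nothing about the state Nagios is actually running with, which is where the two diverged in this thread.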

Re: Possible bug with passive checks

Posted: Thu Oct 09, 2014 3:06 pm
by lmiltchev
"What had happened was when the switch was about to be upgraded the engineer disabled all checks for the device and services. When he completed the update he enabled all checks for the device and its services."
How exactly did he do this? Can you elaborate? We can try following the same steps in order to recreate the issue in house.
Note: Nagios would not initiate the checks anyway, as these are passive checks. It is up to the application on the external device to send the check results to Nagios.
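
For readers unfamiliar with how a passive result reaches Nagios, here is a minimal sketch of submitting one through the external command file using the `PROCESS_SERVICE_CHECK_RESULT` external command. The host name, service description and command file path are assumptions for illustration; the path shown is the usual default for a source install and may differ on your system. The helper names are hypothetical.

```python
import time

# Assumed default location of the external command file (source install).
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}\n".format(
        timestamp, host, service, return_code, output)


def submit(command_line, command_file=COMMAND_FILE):
    """Write a single command line to the Nagios command pipe."""
    with open(command_file, "w") as pipe:
        pipe.write(command_line)
```

A call such as `submit(format_passive_result("switch01", "Port 1 Link State", 2, "link down"))` would report a CRITICAL result for that hypothetical service, provided the service exists and accepts passive checks.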

Re: Possible bug with passive checks

Posted: Fri Oct 10, 2014 2:34 am
by liquidcool
So basically this is what he did:
Clicked on the device / host
Clicked on "Disable checks of all services on this host"
Checked "Disable for host too"
Clicked "Commit"

When the work was completed he did this:
Clicked on the device / host
Clicked on "Enable checks of all services on this host"
Checked "Enable for host too"
Clicked "Commit"

That was it. All the passive checks started running the check-host-alive check command and were no longer passive (they had become active checks).
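
To reproduce the steps above without the web interface, they can be approximated with external commands. This is a sketch under assumptions: it supposes the two web actions correspond to `DISABLE_HOST_SVC_CHECKS` / `ENABLE_HOST_SVC_CHECKS` plus `DISABLE_HOST_CHECK` / `ENABLE_HOST_CHECK` for the "for host too" option. Those are real external commands, but the exact mapping to the clicks is not confirmed in this thread. The helper names and the host and service names used with them are hypothetical.

```python
def _cmd(timestamp, name, *args):
    """Format one external command line: [time] NAME;arg1;arg2..."""
    return "[{}] {}\n".format(timestamp, ";".join((name,) + args))


def disable_all_checks(host, timestamp):
    """Approximates 'Disable checks of all services' + 'Disable for host too'."""
    return [_cmd(timestamp, "DISABLE_HOST_SVC_CHECKS", host),
            _cmd(timestamp, "DISABLE_HOST_CHECK", host)]


def enable_all_checks(host, timestamp):
    """Approximates 'Enable checks of all services' + 'Enable for host too'."""
    return [_cmd(timestamp, "ENABLE_HOST_SVC_CHECKS", host),
            _cmd(timestamp, "ENABLE_HOST_CHECK", host)]


def restore_passive_only(host, passive_services, timestamp):
    """Possible workaround: switch active checks back off per service."""
    return [_cmd(timestamp, "DISABLE_SVC_CHECK", host, service)
            for service in passive_services]
```

Each returned line would be written to the Nagios external command file. Nagios can also retain attributes changed at runtime across restarts when state retention is enabled, which might explain why restarting alone did not bring the configured values back. That is a guess, not something established here.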

Re: Possible bug with passive checks

Posted: Fri Oct 10, 2014 4:38 pm
by sreinhardt
Thanks! I've got some passive checks to implement this weekend, so I'll give this a test and see if I can recreate.

Re: Possible bug with passive checks

Posted: Thu Oct 23, 2014 9:09 am
by liquidcool
Hi,

Has there been any update on this?

Re: Possible bug with passive checks

Posted: Fri Oct 24, 2014 9:03 am
by sreinhardt
Yes, sorry about that. Let me look over my notes and I will post back.