Passive Check Freshness Not Working

jeremie.grund
Posts: 10
Joined: Mon Nov 09, 2015 1:41 pm

Re: Passive Check Freshness Not Working

Post by jeremie.grund »

I couldn't get the grep command to return anything, but I did manage to find this in objects.cache:

define service {
        host_name       PSVMDB07
        service_description     DYNAMICS Mirror
        check_period    24x7
        check_command   check_dummy!2!"Service has not checked in"
        contact_groups  dbateam
        notification_period     24x7_except_sql_maint
        initial_state   o
        check_interval  10.000000
        retry_interval  2.000000
        max_check_attempts      3
        is_volatile     0
        parallelize_check       1
        active_checks_enabled   0
        passive_checks_enabled  1
        obsess_over_service     1
        event_handler_enabled   1
        low_flap_threshold      0.000000
        high_flap_threshold     0.000000
        flap_detection_enabled  1
        flap_detection_options  o,w,u,c
        freshness_threshold     600
        check_freshness 1
        notification_options    u,w,c,r
        notifications_enabled   1
        notification_interval   60.000000
        first_notification_delay        0.000000
        stalking_options        n
        process_perf_data       1
        failure_prediction_enabled      1
        retain_status_information       1
        retain_nonstatus_information    1
        }
That looks as though active checks are already disabled... dang, another one I was hoping would fix it :)
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Post by jdalrymple »

Why are we obsessing and what is your obsessive command? I don't know how/why it would matter in this case, especially almost 2 minutes after a passive result. Maybe best if you can just post your nagios.cfg in its entirety for me to review.

I'm also curious about max_check_attempts... in my mind this shouldn't be working at all, but there may be logic I am not privy to for freshness alerts.

Can you dig into nagios.log around the time this happened and post all the events between the two OK alerts?

-- Edit --
jeremie.grund wrote: This was working fine until this weekend, the database server was patched and now even though the agent job runs every minute Nagios doesn't seem to be waiting the 10 minutes before saying that the service hasn't checked in.
I reread and saw this (sorry for missing it earlier) - this makes no sense. I can't fathom how one could affect the other. Tell us more about your passive check.

Post by jeremie.grund »

I'm not sure about the obsessing settings to be honest. I'd have to ask why they are set that way.

Here is the relevant part of the log after restarting Nagios:

[1447277094] Finished daemonizing... (New PID=28066)
[1447277105] PASSIVE SERVICE CHECK: PSVMDB07;DYNAMICS Mirror;0;OK: Mirror OK
[1447277105] PASSIVE SERVICE CHECK: PSVMDB07;DYNCUSTOM Mirror;0;OK: Mirror OK
[1447277105] PASSIVE SERVICE CHECK: PSVMDB07;ELIM Mirror;0;OK: Mirror OK
[1447277105] PASSIVE SERVICE CHECK: PSVMDB07;NAI Mirror;0;OK: Mirror OK
[1447277105] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;CRITICAL;SOFT;1;CRITICAL: Service has not checked in
[1447277115] PASSIVE SERVICE CHECK: PSVMDB07;PARTS Mirror;0;OK: Mirror OK
[1447277115] PASSIVE SERVICE CHECK: PSVMDB07;OnBase Mirror;0;OK: Mirror OK
[1447277115] PASSIVE SERVICE CHECK: PSVMDB07;WORKFLOWS Mirror;0;OK: Mirror OK
[1447277115] PASSIVE SERVICE CHECK: PSVMDB07;UMBRACO Mirror;0;OK: Mirror OK
[1447277165] PASSIVE SERVICE CHECK: PSVMDB07;DYNCUSTOM Mirror;0;OK: Mirror OK
[1447277165] PASSIVE SERVICE CHECK: PSVMDB07;ELIM Mirror;0;OK: Mirror OK
[1447277165] PASSIVE SERVICE CHECK: PSVMDB07;NAI Mirror;0;OK: Mirror OK
[1447277165] PASSIVE SERVICE CHECK: PSVMDB07;DYNAMICS Mirror;0;OK: Mirror OK
[1447277165] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;OK;SOFT;2;OK: Mirror OK
[1447277175] PASSIVE SERVICE CHECK: PSVMDB07;WORKFLOWS Mirror;0;OK: Mirror OK
[1447277175] PASSIVE SERVICE CHECK: PSVMDB07;OnBase Mirror;0;OK: Mirror OK
[1447277175] PASSIVE SERVICE CHECK: PSVMDB07;UMBRACO Mirror;0;OK: Mirror OK
[1447277175] PASSIVE SERVICE CHECK: PSVMDB07;PARTS Mirror;0;OK: Mirror OK
[1447277215] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;CRITICAL;SOFT;1;CRITICAL: Service has not checked in
[1447277225] PASSIVE SERVICE CHECK: PSVMDB07;ELIM Mirror;0;OK: Mirror OK
[1447277225] PASSIVE SERVICE CHECK: PSVMDB07;DYNCUSTOM Mirror;0;OK: Mirror OK
[1447277225] PASSIVE SERVICE CHECK: PSVMDB07;DYNAMICS Mirror;0;OK: Mirror OK
[1447277225] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;OK;SOFT;2;OK: Mirror OK
[1447277235] PASSIVE SERVICE CHECK: PSVMDB07;WORKFLOWS Mirror;0;OK: Mirror OK
[1447277235] PASSIVE SERVICE CHECK: PSVMDB07;OnBase Mirror;0;OK: Mirror OK
[1447277235] PASSIVE SERVICE CHECK: PSVMDB07;NAI Mirror;0;OK: Mirror OK
[1447277235] PASSIVE SERVICE CHECK: PSVMDB07;PARTS Mirror;0;OK: Mirror OK
[1447277235] PASSIVE SERVICE CHECK: PSVMDB07;UMBRACO Mirror;0;OK: Mirror OK
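
The gap between the last passive result and each "has not checked in" alert can be measured directly from nagios.log. What follows is a minimal Python sketch, assuming only the two line formats shown in the log above. The function name stale_alert_gaps is illustrative, and the 600-second constant is taken from freshness_threshold in the service definition.

```python
import re

# Matches the two nagios.log line types seen in the excerpt above, e.g.
# [1447277105] PASSIVE SERVICE CHECK: PSVMDB07;DYNAMICS Mirror;0;OK: Mirror OK
# [1447277215] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;CRITICAL;SOFT;1;CRITICAL: ...
LINE_RE = re.compile(
    r"^\[(\d+)\] (PASSIVE SERVICE CHECK|SERVICE ALERT): ([^;]*);([^;]*);(.*)$"
)


def stale_alert_gaps(log_lines, stale_text="Service has not checked in"):
    """Return (host, service, alert_time, seconds_since_last_passive) tuples.

    stale_text is the output of the freshness check command (assumed here to
    be the check_dummy message from the service definition).
    """
    last_passive = {}
    gaps = []
    for line in log_lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, kind, host, service, rest = m.groups()
        ts = int(ts)
        key = (host, service)
        if kind == "PASSIVE SERVICE CHECK":
            last_passive[key] = ts
        elif stale_text in rest and key in last_passive:
            gaps.append((host, service, ts, ts - last_passive[key]))
    return gaps


# Three lines copied from the log excerpt above.
sample = [
    "[1447277165] PASSIVE SERVICE CHECK: PSVMDB07;DYNAMICS Mirror;0;OK: Mirror OK",
    "[1447277165] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;OK;SOFT;2;OK: Mirror OK",
    "[1447277215] SERVICE ALERT: PSVMDB07;DYNAMICS Mirror;CRITICAL;SOFT;1;"
    "CRITICAL: Service has not checked in",
]

FRESHNESS_THRESHOLD = 600  # seconds, from the service definition

for host, service, ts, gap in stale_alert_gaps(sample):
    verdict = "early" if gap < FRESHNESS_THRESHOLD else "expected"
    print(f"{host};{service}: alert {gap}s after last passive result ({verdict})")
```

On the sample lines the alert arrives 50 seconds after the last passive result, far short of the 600-second threshold, which is consistent with something other than the freshness timer triggering the check.
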
The nagios config is attached. Interesting to note, I've got several passive services defined in this manner on this host.

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	DYNAMICS Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	DYNCUSTOM Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	ELIM Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	NAI Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	OnBase Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	PARTS Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	UMBRACO Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}

define service{
	use			passive-service
	host_name		PSVMDB07
	service_description	WORKFLOWS Mirror
	freshness_threshold	600
	notification_period	24x7_except_sql_maint
	}
While troubleshooting, we commented out all of the mirrors except the "DYNAMICS" mirror. To get a better log I uncommented those services again, and the service alert only appears to fire on the "DYNAMICS" mirror service, not on the others, which are configured in the same way.

Also, can the host group that the host is in have any effect on this? Now that I think about it, the host was in a group named w2k8r2-servers and I moved it to w2k12-servers.
Attachments: nagios.cfg (Nagios config, 42.7 KiB)

Post by jdalrymple »

jeremie.grund wrote: Also, can the host group that the host is in have any effect on this? Now that I think about it the host was in a group named w2k8r2-servers and I moved it to w2k12-servers.
Not really - not as long as active checks are indeed disabled, which they are. I would still try to put it back and see if that has any effect.

I feel like we have come across some bug, but it's all very weird to me.

Two questions:

1) I noticed you have debugging enabled. Are you seeing anything interesting in the debug log?
2) You guys aren't writing anything directly to the cmd file, are you?

Post by jeremie.grund »

No, we shouldn't have anything writing directly to the cmd file; I've tried really hard to keep that from being a solution.

As for debugging, can you say where the debug log would usually be?

We've also wondered whether the problem is that we are running a rather old version of Nagios. But thanks for the help you've provided.

Post by jdalrymple »

The location is set in your main nagios.cfg file. Note that the debug log is pretty limited in size, so it probably rotates very frequently.

debug_file=/var/lib/nagios/nagios.debug

Post by jeremie.grund »

Looks like you may have found something. In nagios.debug I found this:

[1447363495.154184] [016.1] [pid=28066] HOST: PSVMDB07, SERVICE: DYNAMICS Mirror, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 2, OUTPUT: CRITICAL: Service has not checked in\n
[1447363495.160192] [016.0] [pid=28066] Scheduling a non-forced, active check of service 'DYNAMICS Mirror' on host 'PSVMDB07' @ Thu Nov 12 16:26:46 2015
[1447363495.160279] [016.1] [pid=28066] Checking service 'DYNAMICS Mirror' on host 'PSVMDB07' for flapping...
[1447363515.076986] [016.1] [pid=28066] Handling check result for service 'OnBase Mirror' on host 'PSVMDB07'...
[1447363515.077011] [016.0] [pid=28066] ** Handling check result for service 'OnBase Mirror' on host 'PSVMDB07'...
[1447363515.077046] [016.1] [pid=28066] HOST: PSVMDB07, SERVICE: OnBase Mirror, CHECK TYPE: Passive, OPTIONS: 0, SCHEDULED: No, RESCHEDULE: No, EXITED OK: Yes, RETURN CODE: 0, OUTPUT: OK: Mirror OK\n
[1447363515.077176] [016.1] [pid=28066] Checking service 'OnBase Mirror' on host 'PSVMDB07' for flapping...
[1447363515.077233] [016.1] [pid=28066] Handling check result for service 'UMBRACO Mirror' on host 'PSVMDB07'...
[1447363515.077242] [016.0] [pid=28066] ** Handling check result for service 'UMBRACO Mirror' on host 'PSVMDB07'...
[1447363515.077247] [016.1] [pid=28066] HOST: PSVMDB07, SERVICE: UMBRACO Mirror, CHECK TYPE: Passive, OPTIONS: 0, SCHEDULED: No, RESCHEDULE: No, EXITED OK: Yes, RETURN CODE: 0, OUTPUT: OK: Mirror OK\n

Note that the DYNAMICS Mirror result is logged with CHECK TYPE: Active and SCHEDULED: Yes, while the other mirror checks show CHECK TYPE: Passive and SCHEDULED: No. This is also visible on the service detail page: the DYNAMICS Mirror service does not have the PASV icon. All of the mirror services should be using the passive-service template, which states that active checks are disabled. What would cause 1 out of the 8 checks to think it's not passive?
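
One way to look for this pattern across a whole debug file is to list every service whose check results are logged as Active. This is a minimal Python sketch, assuming the result-line format shown in the excerpt above and service names that contain no commas; active_results is a hypothetical helper name.

```python
import re

# Pulls host, service, check type and scheduled flag out of a debug result line.
# Assumes the "HOST: ..., SERVICE: ..., CHECK TYPE: ..." layout seen above.
RESULT_RE = re.compile(
    r"HOST: (?P<host>[^,]+), SERVICE: (?P<service>[^,]+), "
    r"CHECK TYPE: (?P<check_type>\w+), .*?SCHEDULED: (?P<scheduled>\w+)"
)


def active_results(debug_lines):
    """Return sorted (host, service) pairs whose results were logged as Active."""
    found = set()
    for line in debug_lines:
        m = RESULT_RE.search(line)
        if m and m.group("check_type") == "Active":
            found.add((m.group("host"), m.group("service")))
    return sorted(found)


# Two result lines copied from the debug excerpt above.
debug_sample = [
    "[1447363495.154184] [016.1] [pid=28066] HOST: PSVMDB07, SERVICE: DYNAMICS Mirror, "
    "CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, "
    "RETURN CODE: 2, OUTPUT: CRITICAL: Service has not checked in\\n",
    "[1447363515.077247] [016.1] [pid=28066] HOST: PSVMDB07, SERVICE: UMBRACO Mirror, "
    "CHECK TYPE: Passive, OPTIONS: 0, SCHEDULED: No, RESCHEDULE: No, EXITED OK: Yes, "
    "RETURN CODE: 0, OUTPUT: OK: Mirror OK\\n",
]

for host, service in active_results(debug_sample):
    print(f"{host};{service} is being checked actively")
```

Any service that shows up here but is meant to be passive-only is a candidate for the comparison against a well-behaved service.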

Post by jdalrymple »

I may have found an oddity, but it's not at all clear why the debug log is exhibiting what we're seeing.

You've baffled me entirely. Part of me wants to think bug, but I have to wonder why a bug would only pick on one service, defined identically to the others.

Things to try:
1) As mentioned, revert the hostgroup change you made earlier.
2) Disable the obsess-over and failure-prediction directives. They apply to all services, so I can't see how this would have a useful effect, but I don't really like them:

        failure_prediction_enabled      1
        obsess_over_service     1
3) Compare the objects.cache entry for a well-behaved service with the entry for your poorly behaved service.
4) Try creating an all-new service, explicitly defining all the settings instead of using a template. I also recommend changing the service name (which would require an adjustment to the passive sender).

None of these is necessarily a solution, but they might help you narrow down where the problem actually lies.
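
The comparison in step 3 can be done by eye, but a small script makes differences harder to miss. This is a minimal Python sketch, assuming objects.cache uses the "define service { directive value ... }" layout shown earlier in the thread. The helper names parse_services and diff_services are hypothetical, and the sample values below are made up to show the output shape; they are not taken from the poster's system.

```python
def parse_services(text):
    """Parse 'define service { ... }' blocks into
    {(host_name, service_description): {directive: value}}."""
    services = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Match 'define service {' or 'define service{' but not
        # 'define servicegroup {' and similar object types.
        if line.replace("{", " ").split() == ["define", "service"]:
            current = {}
        elif line == "}" and current is not None:
            key = (current.get("host_name"), current.get("service_description"))
            services[key] = current
            current = None
        elif current is not None and line:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return services


def diff_services(a, b, ignore=("service_description",)):
    """Return {directive: (value_in_a, value_in_b)} for directives that differ."""
    keys = sorted(set(a) | set(b))
    return {
        k: (a.get(k), b.get(k))
        for k in keys
        if k not in ignore and a.get(k) != b.get(k)
    }


# Hypothetical objects.cache fragment (illustrative values only).
cache_text = """
define service {
        host_name       PSVMDB07
        service_description     DYNAMICS Mirror
        active_checks_enabled   1
        check_freshness 1
        }
define service {
        host_name       PSVMDB07
        service_description     ELIM Mirror
        active_checks_enabled   0
        check_freshness 1
        }
"""

services = parse_services(cache_text)
bad = services[("PSVMDB07", "DYNAMICS Mirror")]
good = services[("PSVMDB07", "ELIM Mirror")]
for directive, (bad_value, good_value) in diff_services(bad, good).items():
    print(f"{directive}: misbehaving={bad_value} well-behaved={good_value}")
```

An empty result would mean the two definitions are identical in objects.cache, which would point away from the configuration and toward retained state or scheduling.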

Post by jeremie.grund »

I commented out the misbehaving service, changed the failure prediction and obsess settings to 0, and restarted Nagios.

I then commented the misbehaving service back in and restarted Nagios again, and it now seems to be behaving normally.

I'm sure being on such an old version of Nagios isn't helping in these instances, but I'm glad I was able to learn more about how to troubleshoot it.

Thanks so much again for your help!
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Post by rkennedy »

I'm glad we could help! As this is resolved now, I am going to close this thread out. If you ever need assistance in the future, feel free to open a new thread.
Former Nagios Employee