Service dependency problems

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
TSCAdmin
Posts: 155
Joined: Wed Apr 14, 2010 3:00 pm
Location: India

Service dependency problems

Post by TSCAdmin »

I'm having a bit of a problem with how the service dependencies work in Nagios. Essentially, I'm using SNMP to perform things like process checks and disk availability checks. So what I've done is made those services dependent on the SNMP service. Here's an example config from the /usr/local/nagios/etc/servicedependencies.cfg file:

Code:

define servicedependency {
       dependent_host_name                      gb-doc-svb-0060
       dependent_service_description            atd,Cron,Disk Monitor,SSH,VAS,VAS GP
       host_name                                gb-doc-svb-0060
       service_description                      SNMP
       inherits_parent                          1
       execution_failure_criteria               w
       notification_failure_criteria            w,u,c
}

Seems pretty straightforward: atd, Cron, Disk Monitor, SSH, VAS, and VAS GP all depend on the SNMP service, since all of those services use SNMP to perform the check. I expected Nagios to trigger an active check of the parent ("host_name/service_description") whenever one of the dependent ("dependent_host_name/dependent_service_description") checks failed.

What I'm actually seeing is that the dependent service is evaluated against the last recorded state of the parent service. So the dependent service notifications go through even though the parent service really is unresponsive; Nagios just hasn't run the parent's check yet.

To run through a chain of events using a timed example (all services are on 5 minute timers):

Code:

00:00 - Cron OK
00:10 - atd OK
00:15 - SNMP OK
00:20 - Disk Monitor OK
00:30 - SSH OK
00:40 - VAS OK
00:50 - VAS GP OK
04:30 - SNMP service crashes
05:00 - Cron Critical - Notification Sent
05:10 - atd Critical - Notification Sent
05:15 - SNMP Critical - Notification Sent
05:20 - Disk Monitor Critical - Notification Suppressed
05:30 - SSH Critical - Notification Suppressed
05:40 - VAS Critical - Notification Suppressed
05:50 - VAS GP Critical - Notification Suppressed
Am I missing some configuration item that will make the service dependencies behave the way I want? Or is this not something Nagios can do at this time?
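
The chain of events above can be reproduced with a small model. This is only a sketch of the behaviour I'm describing, not Nagios code: it assumes the dependency logic consults the parent's last recorded state and never forces a fresh parent check, and that every check fails once SNMP is down. The names `SCHEDULE`, `FAILURE_CRITERIA` and `simulate` are made up for illustration.

```python
# Hypothetical model of the timeline above; this is not how Nagios is implemented.
OK, CRITICAL = "OK", "CRITICAL"
PARENT = "SNMP"

# Mirrors notification_failure_criteria w,u,c from the servicedependency definition.
FAILURE_CRITERIA = {"WARNING", "UNKNOWN", "CRITICAL"}

# (seconds since start, service) for the second round of checks.
# SNMP crashed at 04:30, so every check in this round fails.
SCHEDULE = [
    (300, "Cron"),
    (310, "atd"),
    (315, "SNMP"),
    (320, "Disk Monitor"),
    (330, "SSH"),
    (340, "VAS"),
    (350, "VAS GP"),
]


def simulate(schedule, parent=PARENT):
    """Return (time, service, action) tuples using only the parent's last recorded state."""
    last_known_parent_state = OK  # recorded by the 00:15 check, before the crash
    log = []
    for t, service in schedule:
        if service == parent:
            # The parent's own check finally runs and records the failure.
            last_known_parent_state = CRITICAL
            log.append((t, service, "notification sent"))
        elif last_known_parent_state in FAILURE_CRITERIA:
            log.append((t, service, "notification suppressed"))
        else:
            # Parent still looks OK on record, so the dependent alert goes out.
            log.append((t, service, "notification sent"))
    return log


for t, service, action in simulate(SCHEDULE):
    print(f"{t // 60:02d}:{t % 60:02d} - {service} Critical - {action}")
```

Cron and atd alert because their checks run before the 05:15 SNMP check; everything scheduled after it is suppressed.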
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Service dependency problems

Post by mguthrie »

I'm going to have to refer this one up a level among our tech team. In the meantime I'll post the link to our Nagios Core documentation, which gets into more details with manual configs and dependencies, and the Core Config Manager is still based on Nagios Core. That might have what you're looking for.

http://nagios.sourceforge.net/docs/3_0/ ... ncies.html

General Nagios Core Documentation:
http://nagios.sourceforge.net/docs/3_0/toc.html
mmestnik
Posts: 972
Joined: Mon Feb 15, 2010 2:23 pm

Re: Service dependency problems

Post by mmestnik »

This could be related to the Nagios Core FAQ entry about DOWN vs. UNREACHABLE. However, AFAIK that only applies to hosts, so it's not usable for services.

A while back I was talking to a few users, looking for a way to get Nagios Core to retroactively check parent hosts/services for use as a replacement init, though this proved impossible with the current state of things.

What I did was use hosts instead of services whenever I needed one failure to disable other alerts. For the life of me I cannot remember what service dependencies were for; I think they only affected the web interface.
TSCAdmin
Posts: 155
Joined: Wed Apr 14, 2010 3:00 pm
Location: India

Re: Service dependency problems

Post by TSCAdmin »

According to a one-line footnote on the dependencies documentation page, the "soft_state_dependencies" option may be what I'm looking for. I'll give it a try and see what happens.
Note: *One important thing to note is that by default, Nagios will use the most current hard state of the service(s) that is/are being depended upon when it does the dependency checks. If you want Nagios to use the most current state of the services (regardless of whether it's a soft or hard state), enable the soft_state_dependencies option.
It looks like this impacts both services and hosts. It would be nice if it were separated.
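
To make the footnote concrete, here is a minimal sketch of the rule as I read it. It is a toy model, not Nagios internals: the `ServiceState` class and both function names are hypothetical, and only the hard/soft distinction comes from the documentation quoted above.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Toy record of a parent service's state (hypothetical, for illustration only)."""
    current: str      # result of the most recent check
    state_type: str   # "SOFT" while retries are pending, otherwise "HARD"
    last_hard: str    # most recent state that reached HARD


def state_for_dependency(parent, soft_state_dependencies=False):
    """Pick the parent state that the dependency logic would compare against."""
    if soft_state_dependencies or parent.state_type == "HARD":
        return parent.current
    # Default behaviour per the footnote: ignore soft states, use the last hard one.
    return parent.last_hard


def notification_suppressed(parent, failure_criteria, soft_state_dependencies=False):
    """True if the parent's effective state matches the failure criteria."""
    return state_for_dependency(parent, soft_state_dependencies) in failure_criteria


# SNMP has failed one check but has not yet reached a HARD CRITICAL state.
snmp = ServiceState(current="CRITICAL", state_type="SOFT", last_hard="OK")
criteria = {"WARNING", "UNKNOWN", "CRITICAL"}  # notification_failure_criteria w,u,c

print(notification_suppressed(snmp, criteria))                                # False
print(notification_suppressed(snmp, criteria, soft_state_dependencies=True))  # True
```

If this reading is right, the option only helps once the parent has failed at least one check of its own; it would not change anything for dependent checks that run before the parent's next check.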
mmestnik
Posts: 972
Joined: Mon Feb 15, 2010 2:23 pm

Re: Service dependency problems

Post by mmestnik »

It would be even better to be able to specify this on a per-relationship basis. This area definitely needs improvement. IMHO Nagios Core should be deployable for some of the tasks that software like upstart handles. That is, it should be able to monitor WiFi link status, "reconnect", and then reload all the services, both at boot and at run-time.

It seemed awkward to me to set up Nagios Core for monitoring but still use another system (init) during boot.