External Commands Logic.

Posted: Tue Apr 30, 2013 1:22 pm
by samuel
I have two External Commands that can enable and disable hostgroups/services notifications.
I made a misc command in nagios that can use these external commands.

Code:

/usr/local/nagios/libexec/eventhandlers/handle-master-host-event $HOSTSTATE$ $HOSTSTATETYPE$
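For readers following along, here is a minimal sketch of what an enable/disable helper built on external commands might look like. The poster's actual scripts are not shown in the thread, so everything here is an assumption: the command-file path is the usual default for a source install, the hostgroup name facility2 is hypothetical, and the function names are made up for illustration. The command names themselves are standard Nagios Core external commands.

```shell
#!/bin/sh
# Sketch only: submit hostgroup notification commands to Nagios.
# The default path below is an assumption; adjust it for your install.
NAGIOS_CMD_FILE="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# submit_command COMMAND_NAME HOSTGROUP
# Appends one line in the "[timestamp] COMMAND;argument" format.
submit_command() {
	printf '[%s] %s;%s\n' "$(date +%s)" "$1" "$2" >> "$NAGIOS_CMD_FILE"
}

# set_facility_notifications enable|disable HOSTGROUP
# Sets host and service notifications explicitly (no toggling).
set_facility_notifications() {
	case "$1" in
	enable)
		submit_command ENABLE_HOSTGROUP_HOST_NOTIFICATIONS "$2"
		submit_command ENABLE_HOSTGROUP_SVC_NOTIFICATIONS "$2"
		;;
	disable)
		submit_command DISABLE_HOSTGROUP_HOST_NOTIFICATIONS "$2"
		submit_command DISABLE_HOSTGROUP_SVC_NOTIFICATIONS "$2"
		;;
	*)
		echo "usage: set_facility_notifications enable|disable HOSTGROUP" >&2
		return 1
		;;
	esac
}
```

Calling set_facility_notifications disable facility2 would queue two commands, one for host notifications and one for service notifications, for every member of that hostgroup.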
There are two nagios servers in two different facilities.
They both monitor each other, but each alerts only on its own facility and the other's router.
Here is a map.

Code:

Nagios1----Router1-----Router2----Nagios2

My goal with these external commands is to enable or disable notifications depending on the state of the routers and Nagios servers.

From the viewpoint of the Nagios server in Facility 1:

Code:

If
     Nagios1: Up ---- Router1: Up ----- Router2: Up---- Nagios2: Down
then 
     Nagios will enable notifications on Facility 2

Code:

If
     Nagios1: Up ---- Router1: Up ----- Router2: Down---- Nagios2: Down
then
     Nagios will disable notifications on Facility 2

From the viewpoint of the Nagios server in Facility 2:

Code:

If
     Nagios2: Up ---- Router2: Up ----- Router1: Up---- Nagios1: Down
then 
     Nagios will enable notifications on Facility 1

Code:

If
     Nagios2: Up ---- Router2: Up ----- Router1: Down---- Nagios1: Down
then
     Nagios will disable notifications on Facility 1

My plan for setting this up is, on the Nagios1 side, to define a service on Router2 that pings Nagios2.
If the service is up, then disable alerts.
If the service is down/unknown, then enable alerts.
Then set up the same thing on Nagios2.

So, from Nagios1's perspective, what would happen when Router2 goes down as well as Nagios2? Would its host event handler overrule the service's event handler?

What version of Nagios XI are you using? Nagios XI 2012R1.8
Linux Distribution and version? centos 6.3
32 or 64bit? 32bit
VMware Image or Manual Install of XI? VMware Image
Are there special configurations on your system, i.e., is Gnome installed? Are you using a proxy? Are you using SSL? none

Re: External Commands Logic.

Posted: Tue Apr 30, 2013 1:48 pm
by abrist
If you have parent-child relationships set up, nagios2 would be considered unreachable in your scenario, and any further service checks on the host would cease, because the parent of nagios2 is the router. In that case, the event handler for the host would probably be run (unless you have logic set up for "unreachable hosts").

Without those relationships, there is a good chance both handlers would run. I would make sure that you are not using binary (toggle) logic for the handlers as this would cause the notifications to be turned on, and then off again. The script's logic should work off of the hoststate/servicestate macros, so that you can account for when both are in a failed state.
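One way to read the advice above is to have a single handler set notifications explicitly from the current states, rather than flipping them. Below is a minimal sketch from Nagios1's point of view, assuming the service event handler is also passed $HOSTSTATE$ (host macros are available to service event handlers). The function name decide_action and the "none" outcome for states the thread does not cover (such as WARNING) are assumptions made for illustration.

```shell
#!/bin/sh
# decide_action HOSTSTATE SERVICESTATE
# Prints "enable", "disable", or "none" for the remote facility.
decide_action() {
	host_state=$1
	service_state=$2

	# If the router itself is down or unreachable, the remote facility
	# cannot be monitored from here, so the host state wins.
	case "$host_state" in
	DOWN|UNREACHABLE)
		echo disable
		return 0
		;;
	esac

	# Router is up: the ping service tells us whether the other
	# Nagios server is alive.
	case "$service_state" in
	CRITICAL|UNKNOWN) echo enable ;;
	OK) echo disable ;;
	*) echo none ;;
	esac
}
```

Because the result depends only on the two states passed in, running the handler twice, or in either order, yields the same answer. That is the property a toggle-style script lacks.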

Re: External Commands Logic.

Posted: Wed May 01, 2013 1:50 pm
by samuel
I have parent-child relationships.
They are the dashes between the routers and Nagios servers in the map.
The logic for the service will be set up to enable notifications when it is down or unknown, and disable notifications when it is up.
The host will be set up to disable notifications when down, and to do nothing when up.
The scripts will run off the hoststate/hoststatetype and servicestate/servicestatetype macros.
How would hoststate/servicestate account for when both are in a failed state?

Re: External Commands Logic.

Posted: Wed May 01, 2013 2:37 pm
by abrist
Service checks are stopped when a host is in a down/unreachable state.

Re: External Commands Logic.

Posted: Wed May 01, 2013 4:53 pm
by samuel
I set up this situation, and if the host goes down first and then the service, it enables notifications.

Re: External Commands Logic.

Posted: Thu May 02, 2013 9:48 am
by abrist
Does your custom script enable notifications on the services of a 'down' host? This is not nagios' default behavior.

Re: External Commands Logic.

Posted: Mon May 06, 2013 1:51 pm
by samuel
Here is my code for the host event handler:

Code:

#!/bin/sh

# Only take action on hard host states...
case "$2" in
HARD)
	case "$1" in
	DOWN)
		# The router has gone down!
		# We don't want a bunch of notifications,
		# so disable notifications...
		/usr/local/nagios/libexec/eventhandlers/disable_facility_notifications
		;;
	UP)
		# The router is up; do nothing.
		;;
	esac
	;;
esac

exit 0

Here is my code for the service event handler:

Code:

#!/bin/sh

# Only take action on hard service states...
case "$2" in
HARD)
	case "$1" in
	CRITICAL)
		# The master service has gone down,
		# so enable notifications...
		/usr/local/nagios/libexec/eventhandlers/enable_facility_notifications
		;;
	UNKNOWN)
		# The master service is UNKNOWN,
		# so enable notifications...
		/usr/local/nagios/libexec/eventhandlers/enable_facility_notifications
		;;
	OK)
		# The service has recovered!
		# Disable notifications...
		/usr/local/nagios/libexec/eventhandlers/disable_facility_notifications
		;;
	esac
	;;
esac

exit 0

Re: External Commands Logic.

Posted: Mon May 06, 2013 2:34 pm
by abrist
The scripts seem fine. In your tests, what was the behavior of the event handler when router 2 went down? Were there any hints of a race condition?

Re: External Commands Logic.

Posted: Mon May 06, 2013 3:28 pm
by samuel
I have been testing this system on a small scale. I set up a host and put a ping service on it.
Then placed the eventhandlers on it.
I noticed that if the host goes down first and then the service, it enables alerts.
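The ordering effect described here can be reproduced without Nagios by a small dry run. This is only a sketch: the two functions mirror the case logic of the host and service handlers posted earlier in the thread, but instead of calling the real enable/disable scripts they record the last action in a variable. The function and variable names are hypothetical.

```shell
#!/bin/sh
# Dry run of the two posted handlers. Each one records the action it
# would have taken instead of calling the real scripts.
last_action=none

host_handler() {      # args: HOSTSTATE HOSTSTATETYPE
	[ "$2" = HARD ] || return 0
	case "$1" in
	DOWN) last_action=disable ;;
	esac
}

service_handler() {   # args: SERVICESTATE SERVICESTATETYPE
	[ "$2" = HARD ] || return 0
	case "$1" in
	CRITICAL|UNKNOWN) last_action=enable ;;
	OK) last_action=disable ;;
	esac
}

# Host goes down first, then the service reaches a hard CRITICAL state.
host_handler DOWN HARD
service_handler CRITICAL HARD
echo "host first, then service: $last_action"

# Same two events in the opposite order.
last_action=none
service_handler CRITICAL HARD
host_handler DOWN HARD
echo "service first, then host: $last_action"
```

With the handlers written this way, whichever one runs last decides the outcome, so the same pair of failures can leave notifications either enabled or disabled.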

Re: External Commands Logic.

Posted: Mon May 06, 2013 4:48 pm
by abrist
samuel wrote: I noticed that if the host goes down first and then the service, it enables alerts.
Was this not your intention? Could you please clarify the issue?