Event handler VS number of runs

dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Event handler VS number of runs

Post by dlukinski »

Hello XI support

Just like support for downtime (my other ticket), another key piece of Event Handling functionality would be a limit on the number of times a script runs: the runs are counted and the handler stops after that, alerting that things are REALLY wrong, since trying to restart some service or daemon yet again may do more harm than good.

Is there functionality like this that we can configure?

Thank you
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Event handler VS number of runs

Post by bwallace »

Are you looking to just count how many times an event handler script runs, or are you saying that a particular event handler is running too often? If the latter: event handlers are called whenever a state change occurs. This includes both HARD and SOFT state types, and the OK, WARNING, CRITICAL, and UNKNOWN states.
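
To make that concrete, here is a minimal sketch (not an official Nagios script) of a handler that checks the state and state type before acting. It assumes the command definition passes $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ as the three arguments; the function name decide_action and the choice of the third SOFT attempt are illustrative assumptions, not anything Nagios requires.

```shell
#!/bin/bash
# Sketch only: decide what an event handler should do for one state change.
# Assumed arguments (set in the command definition):
#   $1 = $SERVICESTATE$      (OK, WARNING, CRITICAL, UNKNOWN)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$    (current check attempt number)

decide_action() {
    local state state_type attempt
    state=$1
    state_type=$2
    attempt=$3

    case "$state" in
        CRITICAL)
            # Act when the state goes HARD, or on the third SOFT attempt.
            if [ "$state_type" = "HARD" ]; then
                echo restart
            elif [ "$state_type" = "SOFT" ] && [ "$attempt" -eq 3 ]; then
                echo restart
            else
                echo wait
            fi
            ;;
        OK)
            echo recovered
            ;;
        *)
            # WARNING and UNKNOWN: do nothing in this sketch.
            echo ignore
            ;;
    esac
}

# Only run when called with the three arguments described above.
if [ "$#" -ge 3 ]; then
    decide_action "$1" "$2" "$3"
fi
```

The function only prints a decision (restart, wait, recovered or ignore); the actual restart command would go where the script acts on that word.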

https://assets.nagios.com/downloads/nag ... ios-XI.pdf
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event handler VS number of runs

Post by rkennedy »

dlukinski wrote:
key functionality for Event Handling would be number of time script runs (counted and stopped after that, alerting this things are REALLY wrong and trying to restart some service or daemon may do more harm after all).
Event handlers can keep a count, but you'd need to build that into the script you assign as the handler. I don't think this is something for the Nagios side so much as for the script side.

For example, say you have test.sh as a global event handler. You could use the macros Nagios allows you to pass to an event handler to create a file called /tmp/$HOSTADDRESS$.txt and add 1 to the number in it every time the handler runs. You could also keep the count in a database, keyed on the same macros. The sky is the limit, honestly.
Former Nagios Employee
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Event handler VS number of runs

Post by dlukinski »

Not sure if I understood correctly.

What I mean is this: say we monitor a host for service A. Service A went down and was restarted 10 times, OR it kept being restarted for 15 minutes, and it is still down.
At that point we want to stop restarting it and raise an alert instead. How could this be done with Nagios?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event handler VS number of runs

Post by rkennedy »

This would need logic built in to your event handler, as Nagios does not have something like this built in.
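
As a rough idea of what that logic could look like, here is a minimal sketch that allows a restart only while neither limit from your example (10 restarts, or 15 minutes since the first attempt) has been reached. Everything named here is hypothetical: the restart_allowed function, the STATE_DIR, MAX_RESTARTS and WINDOW_SECS variables, and the state file layout. It also assumes the key you pass in (for example host.service) contains no slashes.

```shell
#!/bin/bash
# Sketch only: cap automatic restarts at MAX_RESTARTS attempts or WINDOW_SECS
# seconds since the first attempt, whichever limit is hit first.
# All names and defaults below are illustrative assumptions.
STATE_DIR=${STATE_DIR:-/tmp}
MAX_RESTARTS=${MAX_RESTARTS:-10}
WINDOW_SECS=${WINDOW_SECS:-900}   # 15 minutes

# restart_allowed KEY [NOW]
# Returns 0 and records the attempt if another restart is allowed,
# returns 1 (and records nothing) once either limit is reached.
restart_allowed() {
    local file now first count
    file="$STATE_DIR/restart-limit.$1"
    now=$2
    [ -n "$now" ] || now=$(date +%s)
    first=""
    count=""

    # The state file holds "<epoch of first attempt> <attempts so far>".
    if [ -r "$file" ]; then
        read -r first count < "$file"
    fi
    first=${first:-$now}
    count=${count:-0}

    if [ "$count" -ge "$MAX_RESTARTS" ] || [ $((now - first)) -ge "$WINDOW_SECS" ]; then
        return 1
    fi

    echo "$first $((count + 1))" > "$file"
    return 0
}
```

The handler would call something like `restart_allowed "$HOST.$SERVICE"` before each restart and, when it returns 1, skip the restart and send an alert instead. The state file has to be removed once the service recovers, which this sketch leaves out.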
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Event handler VS number of runs

Post by dlukinski »

Do you mean built into the restart script?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event handler VS number of runs

Post by rkennedy »

Yes, it would need to be built into your event handler scripts. As I mentioned above, you could use the macros passed to the handler to build a per-host or per-service counter of some sort.
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Event handler VS number of runs

Post by dlukinski »

Thank you. Please help with sample scripts (if you've got any); in other monitoring systems this would be more or less standard functionality.

Otherwise, please close this thread.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event handler VS number of runs

Post by rkennedy »

Sure, a rough example is something like this, where you would pass $HOSTNAME$ and $SERVICEDISPLAYNAME$ to your event handler, and it then creates a file to use as a 'count'. This would take a lot more development to get working as you need, but this is the basis:

Code: Select all

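#!/bin/bash
# Arguments: $1 = $HOSTNAME$, $2 = $SERVICEDISPLAYNAME$ (empty for a host)
HOSTNAME=$1
SERVICEDISPLAYNAME=$2
FILENAME="$HOSTNAME.$SERVICEDISPLAYNAME"

# Read the previous count; start from 0 if the file does not exist yet
COUNT=$(cat "/tmp/$FILENAME" 2>/dev/null)
COUNTTOT=$(( ${COUNT:-0} + 1 ))

echo "Current runs is $COUNTTOT"
echo "$COUNTTOT" > "/tmp/$FILENAME"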
If it's for a host, the file will be created as hostname. (with a trailing dot), and if it's for a service, it will be hostname.servicename. Every time the script runs it adds 1 to the count it reads from the flat file. You would still need to add logic to check what the state is, whether the count should reset, and so on.
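
As one possible way to add that state check, here is a minimal sketch that keeps the same flat-file counter but resets it when the service or host recovers. It assumes the state ($SERVICESTATE$ or $HOSTSTATE$) is passed as an extra third argument; the update_count function name is made up for illustration.

```shell
#!/bin/bash
# Sketch only: same flat-file counter as above, but reset on recovery.
# Assumed arguments: $1 = $HOSTNAME$, $2 = $SERVICEDISPLAYNAME$,
# $3 = $SERVICESTATE$ (or $HOSTSTATE$ for a host handler).

# update_count FILE STATE -> prints the counter value after the update
update_count() {
    local file state count
    file=$1
    state=$2

    case "$state" in
        OK|UP)
            # Recovered: forget earlier runs so the next outage starts at 1.
            rm -f "$file"
            echo 0
            return 0
            ;;
    esac

    # Problem state: read the old count (0 if there is none) and add 1.
    count=""
    if [ -r "$file" ]; then
        count=$(cat "$file")
    fi
    count=$(( ${count:-0} + 1 ))
    echo "$count" > "$file"
    echo "$count"
}

# Only run when called with the three arguments described above.
if [ "$#" -ge 3 ]; then
    update_count "/tmp/$1.$2" "$3"
fi
```

The printed value can then be compared against whatever limit you choose before the handler decides to restart anything.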
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Event handler VS number of runs

Post by dlukinski »

Thank you, please close this thread