Event handler not working

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Event handler not working

Post by abrist »

BanditBBS wrote:You know, maybe I'm being an idiot and the script isn't really working properly.
It looks like it is working, though, at least locally as user 'nagios'. Have you checked the remote system's /var/log/messages for any errors or potential hints? It would be nice to know whether NRPE is ever accessed by the event handler. Try echoing to a temp file again from the clean_wlan_arc.sh script to make sure that it also runs from the event handler.
BanditBBS wrote:Also, can you explain the difference between the 3 examples you gave?
The first is correct, while the other two will try to apply the -w and -c to check_nrpe itself, not the command to be run on the remote system.
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
BanditBBS
Posts: 2474
Joined: Tue May 31, 2011 12:57 pm
Location: Scio, OH

Re: Event handler not working

Post by BanditBBS »

Wow, I LOL'd at "The first is correct" so thank you!

When I did that echo test, it was through the event handler. I did a passive check result of WARNING to kick it off.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github

Re: Event handler not working

Post by abrist »

So the echo runs from the event handler. Try adding a different echo to the clean_wlan_arc.sh script called by the NRPE script. You may want to change all your echoes to redirect ( >> ) to the /tmp file, as you will not see any of the echo output when run as an event handler anyway.
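To make that concrete, here is a minimal sketch of a logging helper. The names log, run_logged, and LOG_FILE are hypothetical and not part of the scripts in this thread. It also sends stderr to the same file, since error messages from a command are otherwise lost when the script runs as an event handler.

```shell
#!/bin/sh
# Assumed log location; the thread uses /tmp/event_test.
LOG_FILE="${LOG_FILE:-/tmp/event_test}"

# Append a timestamped message to the log file.
log() {
    echo "$(date) $*" >> "$LOG_FILE"
}

# Run a command, append its stdout and stderr to the log,
# then record and return its exit status.
run_logged() {
    rc=0
    "$@" >> "$LOG_FILE" 2>&1 || rc=$?
    log "exit status $rc: $*"
    return "$rc"
}
```

For example, `run_logged /usr/local/nagios/libexec/clean_wlan_arc.sh` would leave any error text from the script in the log next to its exit status.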

Re: Event handler not working

Post by BanditBBS »

abrist wrote:So the echo runs from the event handler. Try adding a different echo to the clean_wlan_arc.sh script called by the NRPE script. You may want to change all your echoes to redirect ( >> ) to the /tmp file, as you will not see any of the echo output when run as an event handler anyway.
My one file looks like this now:

Code:

#!/bin/sh
#
# Delete WLAN Archives older than 10 days
# Modified 02/26/2013 LBW
#

echo $(date) "Searching and removing older files" >> /tmp/event_test
/usr/bin/find /var/bu/wlanadmin/*.gpg -mtime +5 -exec sudo /bin/rm {} \;
/usr/bin/find /var/bu/wlanadmin/*.cfg -mtime +5 -exec sudo /bin/rm {} \;

exit
and my event_test file has this:

Code:

Thu Nov 14 12:59:20 EST 2013 WARNING SOFT 1
Thu Nov 14 12:59:20 EST 2013 Searching and removing older files
But the files it should have deleted are still there. The same thing happens when I run it manually from the Nagios server using this:

Code:

[nagios@svwdcnagios02 libexec]$ ./check_nrpe -H svwdcnetmg02 -t 30 -c clean_drive -a 'WARNING SOFT 1'
NRPE: Unable to read output
Not sure why I am getting the "Unable to read output" message, but it is running the scripts, as the event_test file does get appended to.

EDIT: it has to be a permissions issue or something with the rm command in my script. I'm just lost.
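A note on the "NRPE: Unable to read output" message: as far as I know, check_nrpe reports this when the remote command prints nothing to stdout, which would fit here because the echo in clean_wlan_arc.sh is redirected to the /tmp file. A minimal sketch, assuming a hypothetical helper named run_and_report, that always prints one status line and passes the exit status through:

```shell
#!/bin/sh
# Run a command quietly and always print exactly one status line on
# stdout, so the caller (for example check_nrpe) has something to read.
run_and_report() {
    rc=0
    "$@" > /dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "OK: '$1' completed"
    else
        echo "WARNING: '$1' exited with status $rc"
    fi
    return "$rc"
}
```

This does not explain why the files are not deleted; it only makes the manual check_nrpe run show whether the remote command succeeded.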

Re: Event handler not working

Post by abrist »

Interesting. Let's add a zero to the exit and a status:

Code:

echo "$1 $2 $3" > /tmp/event_test
/usr/local/nagios/libexec/clean_wlan_arc.sh
rc=$?   # save the script's exit status before echo overwrites $?
echo "completed"
exit $rc

Code:

echo $(date) "Searching and removing older files" >> /tmp/event_test
/usr/bin/find /var/bu/wlanadmin/*.gpg -mtime +5 -exec sudo /bin/rm {} \;
/usr/bin/find /var/bu/wlanadmin/*.cfg -mtime +5 -exec sudo /bin/rm {} \;
exit 0
I suspect it is permission related, but you have tested it. I presume you have a sudoers entry for /bin/rm?

Re: Event handler not working

Post by BanditBBS »

Code:

NAGIOSXI ALL = NOPASSWD:/bin/rm
And yeah, as you said, if I run the script as nagios on that server, it works fine.

I made the suggested changes and still get the same result.

Here is the full clean_drive script for reference:

Code:

#!/bin/sh
#
# What state is the service in?

case "$1" in

OK)
        # The service just came back up, so don't do anything...
        echo -n "Nothing to do"
        ;;

WARNING)
        # We don't really care about warning states, since the service is probably still running...
        echo $(date) "$1 $2 $3" >> /tmp/event_test
        /usr/local/nagios/libexec/clean_wlan_arc.sh
        ;;

UNKNOWN)
        # We don't know what might be causing an unknown error, so don't do anything...
        ;;

CRITICAL)
        # Aha!  The process appears to have a problem - perhaps we should kill the process...
        # Is this a "soft" or a "hard" state?
        echo $(date) "$1 $2 $3" >> /tmp/event_test
        /usr/local/nagios/libexec/clean_wlan_arc.sh
        ;;

esac
exit $?

Re: Event handler not working

Post by abrist »

Did you add the "exit 0" to the clean_wlan_arc.sh script? The exit $? should be exiting with the previous exit code from the clean_wlan_arc.sh script. If you did, but you are still not getting a proper exit code from the event handler when run manually, then the clean_wlan_arc.sh script is dying before the exit.
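One thing to watch with exit $?: it reports the status of the most recent command, so any command that runs between the script call and the exit (an echo, for instance) replaces the value. A small sketch with hypothetical stand-in functions, not the real scripts, showing the difference:

```shell
#!/bin/sh
# Stand-in for a script that fails with status 3 (illustration only).
failing_step() {
    return 3
}

# Pitfall: $? here is the status of echo, not of failing_step.
status_after_echo() {
    failing_step
    echo "completed" > /dev/null
    return $?
}

# Capture the status immediately, then return the saved value.
status_captured() {
    failing_step
    rc=$?
    echo "completed" > /dev/null
    return "$rc"
}
```

Running `status_after_echo; echo $?` prints 0, while `status_captured; echo $?` prints 3.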

Re: Event handler not working

Post by BanditBBS »

abrist wrote:Did you add the "exit 0" to the clean_wlan_arc.sh script? The exit $? should be exiting with the previous exit code from the clean_wlan_arc.sh script. If you did, but you are still not getting a proper exit code from the event handler when run manually, then the clean_wlan_arc.sh script is dying before the exit.
Umm, yep. So that means it is dying. Now the question is why, when it works, as I proved in an earlier post that showed me su to nagios and then run the script.

Re: Event handler not working

Post by abrist »

No idea. This may be best troubleshot through a remote session at this point. Have you looked at the system messages for any pam/sudo permission denied type errors?
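If the log is large, filtering it first may help. A rough sketch, assuming a hypothetical helper named recent_auth_lines; note that on some distributions sudo and PAM messages go to /var/log/secure or /var/log/auth.log rather than /var/log/messages:

```shell
#!/bin/sh
# Print the most recent sudo- or PAM-related lines from a log file.
# Usage: recent_auth_lines FILE [COUNT]   (COUNT defaults to 20)
recent_auth_lines() {
    grep -iE 'sudo|pam' "$1" | tail -n "${2:-20}"
}
```

For example, `recent_auth_lines /var/log/messages 50` run right after triggering the event handler should show any sudo refusal near the end of the output.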

Re: Event handler not working

Post by BanditBBS »

abrist wrote:No idea. This may be best troubleshot through a remote session at this point. Have you looked at the system messages for any pam/sudo permission denied type errors?
I've been avoiding that as that log is HUGE and constantly receiving crap. Let me do that now and see if I can spot anything.