
Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 1:04 pm
by JoeGeorge
Here they are...

under /home/nagios:
-rw-r--r-- 1 nagios nagcmd 681 Jan 6 08:34 .procmailrc

The .procmailrc above works: it triggers the script below correctly.

-rwxr-xr-x 1 nagios nagios 6834 Jan 6 08:41 /usr/local/nagios/etc/scripts/nagios-ack-by-email-dev.pl
(I use the dev copy so I can change it if I have to. I keep a copy of the real Perl script in a safe place.)

And this:
prw-rw---- 1 nagios nagcmd 0 Jan 6 08:41 /usr/local/nagios/var/rw/nagios.cmd




Now that you mention ownership, does it look OK/correct?

thank you

Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 1:28 pm
by sreinhardt
That's who owns it, and yes, it looks correct, but I was referring to the user that actually executes that script. Is it run via procmail, nagios, or another user?
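One way to check this is to run a small test as each candidate user (for example through `sudo -u nagios`). The following is only a sketch: the function name `check_cmd_pipe` is made up for illustration, and the path in the example call is the one from the listing earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: report which user is running and whether that
# user can write to the given named pipe.
check_cmd_pipe() {
    local pipe="$1"
    if [ ! -p "$pipe" ]; then
        echo "$(id -un): $pipe is not a named pipe"
        return 1
    fi
    if [ ! -w "$pipe" ]; then
        echo "$(id -un): no write access to $pipe"
        return 1
    fi
    echo "$(id -un): can write to $pipe"
}

# Example call, using the path from this thread:
# check_cmd_pipe /usr/local/nagios/var/rw/nagios.cmd
```

Calling the same function from inside the mail-handling script would show the user and access rights that apply when procmail starts it.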

Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 1:38 pm
by JoeGeorge
Thank you Spencer...

I hope that I can make it clear. Bear with me.
The user who executes it, I am not sure. I believe it's "nagios".
Reason I say that is, here is what happens:

1. An email alert goes out to several recipients (including me). As this is a test, only I reply.
2. After I receive the email, I send a reply (without adding any words or characters to the email content).
3. The server receives my reply and passes it to ".procmailrc", which processes it using the Perl script listed there.
4. Supposedly, Nagios then acknowledges the alert and shows it on the Nagios monitoring web console.


Does that answer your question?
If you have any debugging tips, please do let me know.

Edit:
Just confirmed that "nagios" is the user executing the script:
nagios 43558 1 0 10:43 ? 00:00:00 /usr/bin/perl -w /usr/local/nagios/etc/scripts/nagios-ack-by-email-dev.pl Re: ** PROBLEM Service Alert: prod ETS nagios/fs root is CRITICAL **

===========


thank you

Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 4:25 pm
by sreinhardt
Well, that's good to see, and I verified that your output to the cmd pipe should be correct. Can you tail the nagios.cmd file while this script runs? Also, maybe I missed it, but have you tailed the nagios log or /var/log/messages while this is running?

Code:

tail -f /usr/local/nagios/var/rw/nagios.cmd
tail -f /usr/local/nagios/var/nagios.log
tail -f /var/log/messages
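If the logs stay quiet, it can also help to take the mail path out of the picture and write one command to the pipe by hand. A rough sketch, assuming the usual `[timestamp] COMMAND;arguments` external command format; `build_ack` is a made-up helper name, and `myhost` in the commented example is a placeholder.

```shell
#!/bin/bash
# Build one ACKNOWLEDGE_SVC_PROBLEM line in external command format:
# [epoch] ACKNOWLEDGE_SVC_PROBLEM;host;service;sticky;notify;persistent;author;comment
build_ack() {
    local now="$1" host="$2" service="$3" author="$4" comment="$5"
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;1;1;%s;%s\n' \
        "$now" "$host" "$service" "$author" "$comment"
}

# To send it (placeholder host; pick a service that is currently in a problem state):
# build_ack "$(date +%s)" myhost "fs root" nagiosadmin "manual test" \
#     > /usr/local/nagios/var/rw/nagios.cmd
```

If a hand-written command is accepted but the mailed reply is not, the problem is more likely in the script's parsing or output than in the pipe itself.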

Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 5:00 pm
by JoeGeorge
Here they go, after I reply to an email alert and wait for the script to kick in...


[nagios@cdcxvr1269 rw]$ tail -f nagios.cmd
(EMPTY, nothing)

tail -f /usr/local/nagios

=================== nagios.log ============
[1389044635] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;1;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044695] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;2;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044755] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;3;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;HARD;4;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: user1;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: user2;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: nagiosadmin;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389045175] SERVICE ALERT: ftp-etsprd;Check FTP Service;WARNING;SOFT;1;FTP WARNING - 1.232 second response time on port 21 [220 Service ready for new user.]
[1389045235] SERVICE ALERT: ftp-etsprd;Check FTP Service;OK;SOFT;2;FTP OK - 0.007 second response time on port 21 [220 Service ready for new user.]





tail -f /var/log/messages


==================== messages.log
[1389044635] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;1;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044695] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;2;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044755] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;SOFT;3;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE ALERT: cdcxvr1269;fs root;CRITICAL;HARD;4;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: user1;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: user2;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389044815] SERVICE NOTIFICATION: nagiosadmin;cdcxvr1269;fs root;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 7505 MB (78% inode=87%):
[1389045175] SERVICE ALERT: ftp-etsprd;Check FTP Service;WARNING;SOFT;1;FTP WARNING - 1.232 second response time on port 21 [220 Service ready for new user.]
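
Neither excerpt contains an `EXTERNAL COMMAND:` entry, which is what Nagios writes to nagios.log for each command it reads from the pipe, assuming `log_external_commands` is enabled in nagios.cfg. A quick way to scan a whole log for such entries is sketched below; `show_external_commands` is a made-up name.

```shell
#!/bin/bash
# Print only the external-command entries from a Nagios log file.
# Prints nothing (and returns non-zero) when there are none.
show_external_commands() {
    grep 'EXTERNAL COMMAND:' "$1"
}

# Example, using the log path from this thread:
# show_external_commands /usr/local/nagios/var/nagios.log | tail -5
```

No output right after a reply has been processed would suggest the command never reached Nagios in a form it could read.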

Re: Nagios 3.5 - auto acknowledge almost works... almost

Posted: Mon Jan 06, 2014 7:38 pm
by JoeGeorge
SOLVED:

I had to abandon the Perl script.

Big CREDIT to jwalton from here - http://www.opsview.com/forum/opsview-co ... tification

I used his two scripts.
They worked right away.

To Scott and Spencer from this forum, I truly appreciate your help.
In the end, there was something wrong with the Perl script, and fixing it is way beyond me.


Cheers

=== in case you want them right away, here they are, jwalton's scripts (modify to fit your environment) ===

.procmailrc
=============
LOGFILE=/home/opsview/.procmail.log
MAILDIR=/home/opsview/
VERBOSE=YES
:0
* ^SUBJECT: .*PROBLEM.*
|/home/opsview/myscriptname.sh
=======================================
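
To see which subject lines that recipe would hand to the script, its condition can be approximated with grep, since procmail conditions are egrep-style and case-insensitive unless the D flag is set. This is only a sketch: `would_match` is a made-up name, and the RECOVERY subject is an invented example.

```shell
#!/bin/bash
# Approximate the recipe's condition with a case-insensitive extended regex.
would_match() {
    printf '%s\n' "$1" | grep -qiE '^SUBJECT: .*PROBLEM.*'
}

would_match 'Subject: Re: ** PROBLEM Service Alert: prod ETS nagios/fs root is CRITICAL **' \
    && echo "forwarded to script"
would_match 'Subject: Re: ** RECOVERY Service Alert **' \
    || echo "left in mailbox"
```

The first call prints "forwarded to script" and the second prints "left in mailbox", so replies to recovery mails would not reach the script.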


Verbose is handy for debugging. In short, I have a local mailbox for the From: account on my alert emails. I forward emails with PROBLEM in the subject to my script.

Here is myscriptname.sh. This is not the most elegant way of handling it, but it seems to be working well enough for my specific use case. I'm using Nagios external commands since I could never figure out how to get the acknowledgements to work from opsview_rest.



The following script gives me 4 options for handling alerts.

A simple reply with no required additional text acknowledges the service/host issue that is being replied to
Replying with: "maint", schedules downtime for 2 hours which can be changed by editing the date command for $END
Replying with: "notify" and "nonotify" respectively enable/disable notifications for the monitoring server.
You may need to update the location of nagios.cmd in CMDF below, and make sure the user this runs under has write access to nagios.cmd.
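
The four options can be sketched as a single decision function. This is an illustration rather than the script itself: `classify_reply` and the labels it prints are made up. Note that the keywords are matched anywhere in the text, so a word like "maintenance" in quoted text would also count as "maint".

```shell
#!/bin/bash
# Classify a reply body according to the four options described above.
# "nonotify" is tested before "notify" because "notify" is a substring of it.
classify_reply() {
    local body="$1"
    if printf '%s\n' "$body" | grep -qi 'nonotify'; then
        echo "disable-notifications"
    elif printf '%s\n' "$body" | grep -qi 'notify'; then
        echo "enable-notifications"
    elif printf '%s\n' "$body" | grep -qi 'maint'; then
        echo "downtime"
    else
        echo "acknowledge"
    fi
}

classify_reply "maint"
classify_reply "nonotify"
classify_reply ""
```

The three calls print "downtime", "disable-notifications" and "acknowledge" in that order.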

#!/bin/bash
# Reads one alert reply on stdin (piped in by procmail) and writes the
# matching Nagios external command to the command pipe.
START=$(/bin/date +%s)
END=$(/bin/date +%s --date="2 hours")
CMDF='/usr/local/nagios/var/rw/nagios.cmd'
MAIL=$(/bin/cat /dev/stdin)
# Use only the first Host:/From:/Service: line, with leading whitespace stripped.
HOST=$(/bin/echo "$MAIL" | grep Host: | awk -F "Host:" '{print $2}' | sed -e 's/^[[:space:]]*//' | head -1)
ACKBY=$(/bin/echo "$MAIL" | grep From: | awk -F ":" '{print $2}' | sed -e 's/^[[:space:]]*//' | head -1)
USER="system"
SERVICE=$(/bin/echo "$MAIL" | grep Service: | awk -F "Service:" '{print $2}' | sed -e 's/^[[:space:]]*//' | head -1)
MSG="Acknowledged via email by $ACKBY"
# Count the lines that mention each keyword (case-insensitive).
MAINT=$(/bin/echo "$MAIL" | grep -ci maint)
NOTIFY=$(/bin/echo "$MAIL" | grep -ci notify)
NONOTIFY=$(/bin/echo "$MAIL" | grep -ci nonotify)

# External commands take the form "[epoch] COMMAND;args", with a space
# after "]". $START is used for the timestamp because the literal "%lu"
# is only expanded when the line is produced with printf, not echo.

## Check notify ("nonotify" first, since "notify" also matches it)
if [ "${NONOTIFY}" -ge 1 ]; then
    /bin/echo "[$START] DISABLE_NOTIFICATIONS" > "$CMDF"
    exit 0
elif [ "${NOTIFY}" -ge 1 ]; then
    /bin/echo "[$START] ENABLE_NOTIFICATIONS" > "$CMDF"
    exit 0
fi

# HOST must be quoted here: an unquoted empty value makes [ -n ] succeed.
if [ -n "${HOST}" ]; then
    if [ -n "${SERVICE}" ]; then
        if [ "${MAINT}" -ge 1 ]; then
            /bin/echo "[$START] SCHEDULE_SVC_DOWNTIME;$HOST;$SERVICE;$START;$END;0;0;7200;$USER;$MSG" > "$CMDF"
        else
            /bin/echo "[$START] ACKNOWLEDGE_SVC_PROBLEM;$HOST;$SERVICE;1;1;1;$USER;$MSG" > "$CMDF"
        fi
    else
        if [ "${MAINT}" -ge 1 ]; then
            /bin/echo "[$START] SCHEDULE_HOST_DOWNTIME;$HOST;$START;$END;0;0;7200;$USER;$MSG" > "$CMDF"
        else
            /bin/echo "[$START] ACKNOWLEDGE_HOST_PROBLEM;$HOST;1;1;1;$USER;$MSG" > "$CMDF"
        fi
    fi
fi
================================================
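
Before pointing the script at the real pipe, the field extraction can be tried on a sample reply. A minimal sketch under stated assumptions: `extract_field` is a made-up name, and the sample mail (including the address) is invented, modeled on the alerts in this thread.

```shell
#!/bin/bash
# Read a mail on stdin, take the first line containing the label, and
# print what follows the label with leading whitespace removed.
extract_field() {
    local label="$1"
    grep "$label" | awk -F "$label" '{print $2}' | sed -e 's/^[[:space:]]*//' | head -1
}

# Invented sample reply with the original alert quoted below the headers.
sample='From: joe@example.com
Subject: Re: ** PROBLEM Service Alert **

> Host: cdcxvr1269
> Service: fs root
> State: CRITICAL'

printf '%s\n' "$sample" | extract_field "Host:"
printf '%s\n' "$sample" | extract_field "Service:"
```

This prints "cdcxvr1269" and then "fs root". If your notification template labels these lines differently, the labels passed to grep and awk need to change to match.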