Debugging Host/Service notifications

Posted: Wed Jul 15, 2020 6:16 am
by acheesem
Hi,

I am currently trying to debug host and service notifications. They are showing up in the notification log and escalating to the right people, but the resulting alert is not being received (we are using Opsgenie).

I've enabled debug logging to see the command in the 'Final output' line. When I run it manually it creates an alert as expected, which leads me to believe there is some permission or environment issue when Nagios attempts to run the command.

I am trying to work out if there is a log or a way to record the notifications as they are sent and their output, so I can identify what is going wrong.
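One way to capture what a notification command does when Nagios runs it is to route it through a small logging wrapper. The following is only a minimal sketch: the function name log_and_run and the log path in NOTIFY_LOG are assumptions made for illustration, not part of Nagios XI or the Opsgenie integration.

```shell
#!/bin/sh
# Hypothetical wrapper: log_and_run is an illustrative name, not a Nagios or Opsgenie tool.
# NOTIFY_LOG is an assumed location; it must be writable by the user that runs notifications.
NOTIFY_LOG="${NOTIFY_LOG:-/tmp/notify-debug.log}"

log_and_run() {
    # Record when, as which user, and with what PATH the command was started.
    {
        printf '%s user=%s path=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -un)" "$PATH"
        printf 'cmd: %s\n' "$*"
    } >> "$NOTIFY_LOG"

    # Run the real command, appending its stdout and stderr to the log.
    "$@" >> "$NOTIFY_LOG" 2>&1
    rc=$?

    # Record the exit code and hand it back to the caller.
    printf 'exit=%s\n' "$rc" >> "$NOTIFY_LOG"
    return "$rc"
}
```

Saved as a script with a final line of log_and_run "$@", it could be placed in front of the existing notification command. If the log file stays empty after a notification fires, that suggests the command was never executed at all, rather than executed and failing.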

I have also noticed that the Notification Log lists the dispatcher as 'Custom: ' with no value after it, which seems odd.

Any assistance would be appreciated.

cheers
--Aaron


System Details:
Nagios XI 5.7.2
Mod_Gearman 2 workers
CentOS Linux release 7.7.1908 64bit

Re: Debugging Host/Service notifications

Posted: Wed Jul 15, 2020 4:52 pm
by benjaminsmith
Hi Aaron,

When you are testing the command manually, make sure you are logged in as the nagios user ( su - nagios ). If it's not working, then you'll need to update the permissions on the notification handler for opsgenie. They should be as follows:

Code:

chown apache:nagios
chmod 775

Otherwise, if that's not it, it would be helpful to review the configurations in the system profile. Thanks, Benjamin

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, then reply to this post to bring it up in the queue.
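A manual test under su - nagios runs with a full login environment, which may not match what the monitoring process passes to a notification command. As a rough sketch, the command can also be tried with a stripped-down environment. The helper name run_clean and the PATH value are assumptions for illustration; the real environment of the Nagios daemon or a Mod-Gearman worker may differ.

```shell
#!/bin/sh
# Hypothetical helper: run_clean is an illustrative name.
# The PATH below is an assumption, not the value Nagios actually uses.
run_clean() {
    # env -i starts from an empty environment, so only PATH is defined
    # when the command string in $1 is executed.
    env -i PATH=/usr/bin:/bin /bin/sh -c "$1"
}
```

For example, run_clean 'env' shows how little is defined in that environment. A command that works in a login shell but fails here probably depends on a variable that only the login shell sets.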

Re: Debugging Host/Service notifications

Posted: Wed Jul 15, 2020 5:12 pm
by acheesem
Yup tested as nagios user, seems to run ok still.

I'll send through the profile.

thanks!

Re: Debugging Host/Service notifications

Posted: Thu Jul 16, 2020 3:41 am
by acheesem
More information: even if I change the command to a simple shell script that just logs a message, it doesn't log anything (or run at all).

However, if I SSH onto another server and run a command, that works fine. So I feel like something environment-related is preventing my shell scripts from running.

When I set debug_level to -1 and debug_verbosity to 2, I can see the final output, but I can't see the command being run or its output.

I see this in my logs surrounding my output

Code:

[1594870348.072368] [2048.1] [pid=10444]   Done.  Final output: '/home/opsgenie/oec/opsgenie-nagiosxi/s2o -entityType=service -t="PROBLEM" -ldt="Thu Jul 16 15:32:28 NZST 2020" -hn="x-files.aut.ac.nz" -hdn="x-files.aut.ac.nz" -hal="x-files" -haddr="156.62.1.5" -hs="UP" -hsi="0" -lhs="UP" -lhsi="0" -hst="HARD" -ha="1" -mha="2" -hei="28902684" -lhei="28902647" -hpi="0" -lhpi="13975568" -hl="0.443" -het="0.015" -hd="0d 23h 33m 4s" -hds="84784" -hdt="0" -hpc="0.00" -hgn="VIEWGROUP-ISPROD" -hgns="VIEWGROUP-ISPROD,VIEWGROUP-ICT-PROD,VIEWGROUP-CYBER-PROD,VIEWGROUP-CYBER,SPONG-CYB-MON,OS-Linux-Slackware-15.0,OS-Linux-Slackware,OS-Linux,NAGIOS-IPD-SYNC" -lhc="1594870287" -lhsc="1594785564" -lhu="1594870287" -lhd="1594785564" -lhur="0" -ho="OK - 156.62.1.5 rta 1.020ms lost 0%" -lho="" -hpd="rta=1.020ms;3000.000;5000.000;0; pl=0%;80;100;0;100 rtmax=1.082ms;;;; rtmin=0.883ms;;;;" -s="Forced Failure" -sdn="Forced Failure" -ss="CRITICAL" -ssi="2" -lss="OK" -lssi="0" -sst="HARD" -sa="1" -msa="1" -siv="0" -sei="28908084" -lsei="28908074" -spi="13978104" -lspi="13978065" -sl="129.695" -set="0.090" -sd="0d 0h 0m 3s" -sds="3" -sdt="0" -spc="23.42" -sgn="$SERVICEGROUPNAME$" -sgns="" -lsch="1594870345" -lssc="1594870345" -lsok="1594870345" -lsw="0" -lsu="1594785683" -lsc="1594870230" -so="FAIL CRITICAL - Forced Failure!" -lso="" -snu="" -spd=""'
[1594870348.072373] [2048.1] [pid=10444] **** END MACRO PROCESSING *************
[1594870348.072442] [001.0] [pid=10444] process_macros_r()
[1594870348.072447] [2048.1] [pid=10444] **** BEGIN MACRO PROCESSING ***********
[1594870348.072451] [2048.1] [pid=10444] Processing: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;$SERVICESTATE$;notify-service-by-opsgenie-cyb2;$SERVICEOUTPUT$
[1594870348.072454] [2048.2] [pid=10444]   Processing part: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;'
[1594870348.072459] [2048.2] [pid=10444]   Not currently in macro.  Running output (67): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;'
[1594870348.072462] [2048.2] [pid=10444]   Processing part: 'SERVICESTATE'
[1594870348.072466] [2048.2] [pid=10444]   macros[4] (SERVICESTATE) match.
[1594870348.072469] [2048.2] [pid=10444]   Processed 'SERVICESTATE', Free: 0
[1594870348.072473] [2048.2] [pid=10444]   Processed 'SERVICESTATE', Free: 0,  Cleaning options: 0
[1594870348.072478] [2048.2] [pid=10444]   Uncleaned macro.  Running output (75): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL'
[1594870348.072481] [2048.2] [pid=10444]   Just finished macro.  Running output (75): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL'
[1594870348.072485] [2048.2] [pid=10444]   Processing part: ';notify-service-by-opsgenie-cyb2;'
[1594870348.072489] [2048.2] [pid=10444]   Not currently in macro.  Running output (108): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;'
[1594870348.072492] [2048.2] [pid=10444]   Processing part: 'SERVICEOUTPUT'
[1594870348.072501] [2048.2] [pid=10444]   macros[17] (SERVICEOUTPUT) match.
[1594870348.072504] [2048.2] [pid=10444]   Processed 'SERVICEOUTPUT', Free: 0
[1594870348.072508] [2048.2] [pid=10444]   Processed 'SERVICEOUTPUT', Free: 0,  Cleaning options: 0
[1594870348.072513] [2048.2] [pid=10444]   Uncleaned macro.  Running output (139): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!'
[1594870348.072517] [2048.2] [pid=10444]   Just finished macro.  Running output (139): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!'
[1594870348.072520] [2048.2] [pid=10444]   Processing part: '
[1594870348.072524] [2048.2] [pid=10444]   Not currently in macro.  Running output (140): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!
[1594870348.072527] [2048.1] [pid=10444]   Done.  Final output: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!
[1594870348.072531] [2048.1] [pid=10444] **** END MACRO PROCESSING *************
[1594870348.072544] [064.1] [pid=10444] Making callbacks (type 2)...
[1594870348.072840] [064.2] [pid=10444] Callback #1 (type 2) return code = 0
[1594870348.072846] [001.0] [pid=10444] clear_volatile_macros_r()
[1594870348.072851] [064.2] [pid=10444] Callback #1 (type 21) return code = 206
[1594870348.072856] [064.1] [pid=10444] Making callbacks (type 21)...
[1594870348.072859] [001.0] [pid=10444] clear_volatile_macros_r()
[1594870348.072864] [001.0] [pid=10444] get_raw_command_line_r()
Assuming those callbacks are the command being run, I am surprised I can't see any output or anything ;/
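
With debug_level at -1 the debug file is dominated by macro-processing lines, so it can help to pull out only the fully expanded commands. This is a small sketch based on the log format shown above; the function name final_outputs is hypothetical, and the debug file path has to be supplied by the caller (it is whatever debug_file points to in nagios.cfg).

```shell
#!/bin/sh
# Hypothetical helper: final_outputs is an illustrative name.
# $1: path to the Nagios debug file
# $2: optional process id, as it appears in the [pid=...] field
final_outputs() {
    if [ -n "${2:-}" ]; then
        # Keep only lines from one process, then only the expanded results.
        grep -F "[pid=$2]" "$1" | grep -F "Final output:"
    else
        grep -F "Final output:" "$1"
    fi
}
```

This only shows what Nagios intended to run. It does not capture the command's exit code or output, which are not present in the excerpt above.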

Re: Debugging Host/Service notifications

Posted: Thu Jul 16, 2020 4:45 am
by acheesem
OK, we finally bit the bullet and restarted the main Nagios XI server, and this has resolved our issues. All I can guess is that some updates made something not work quite right.

The pain was not seeing any errors or even details for the notifications when they were run, so I am still lost as to how to see those.

But all in all, my issue has been resolved, as I have the notifications working now.

You can close this ticket.


thanks
--Aaron

Re: Debugging Host/Service notifications

Posted: Thu Jul 16, 2020 7:59 am
by scottwilkerson
Glad to hear it is resolved!

Locking thread