Hi,
I am currently trying to debug host and service notifications. They are showing up in the notification log and escalating to the right people, but the resulting alert is not being received (we are using Opsgenie).
I've enabled debug logging to see the command in the 'Final output' line. When I run it manually it creates an alert as expected, which leads me to believe there is some permission or environment issue when Nagios attempts to run the command.
I am trying to work out if there is a log or a way to record the notifications as they are sent, along with their output, so I can identify what is going wrong.
I have noticed that the Notification Log lists the dispatcher as 'Custom: ' with no value after it, which seems odd.
Any assistance would be appreciated.
cheers
--Aaron
System Details:
Nagios 5.7.2
Mod_Gearman 2 workers
CentOS Linux release 7.7.1908 64bit
Debugging Host/Service notifications
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Debugging Host/Service notifications
Hi Aaron,
When you are testing the command manually, make sure you are logged in as the nagios user ( su - nagios ). If it's not working, then you'll need to update the permissions on the notification handler for opsgenie. They should be as follows:
Code: Select all
chown apache:nagios
chmod 775
Otherwise, if that's not it, it would be helpful to review the configurations in the system profile.
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.
Thanks, Benjamin
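As a rough way to check the ownership and mode mentioned above before changing anything, a small bash helper along these lines could be run after su - nagios. This is only a sketch: the function name describe_handler is made up for illustration, and it assumes GNU stat as shipped on CentOS.

```shell
#!/bin/bash
# Hypothetical helper (not part of Nagios XI or the Opsgenie integration):
# print owner:group and octal mode of a handler, and whether the current
# user is allowed to execute it.
describe_handler() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    # GNU stat format: %U owner name, %G group name, %a octal permissions
    stat -c '%U:%G %a' "$path"
    if [ -x "$path" ]; then
        echo "executable by $(id -un)"
    else
        echo "NOT executable by $(id -un)"
    fi
}
```

For example, describe_handler /home/opsgenie/oec/opsgenie-nagiosxi/s2o (the path that appears in the debug log later in this thread) would show whether the file matches apache:nagios and 775. Every parent directory also has to be traversable by the nagios user, which this sketch does not check.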
Re: Debugging Host/Service notifications
benjaminsmith wrote: When you are testing the command manually, make sure you are logged in as the nagios user ( su - nagios ).
Yup, I tested as the nagios user and it still seems to run OK.
I'll send through the profile.
thanks!
Re: Debugging Host/Service notifications
More information: even if I change the notification command to a simple shell script that just logs a message, it doesn't log anything (or run at all).
However, if I ssh onto another server and run a command, that works fine. So I feel like it's something environment-related that is preventing my shell scripts from running.
When I set debug_level to -1 and debug_verbosity to 2, I can see the final output, but I can't see the command being run or any output from running it.
Assuming the callbacks at the end of the log below are the command being run, I am surprised I can't see any output or anything ;/
I see this in my logs surrounding my output:
Code: Select all
[1594870348.072368] [2048.1] [pid=10444] Done. Final output: '/home/opsgenie/oec/opsgenie-nagiosxi/s2o -entityType=service -t="PROBLEM" -ldt="Thu Jul 16 15:32:28 NZST 2020" -hn="x-files.aut.ac.nz" -hdn="x-files.aut.ac.nz" -hal="x-files" -haddr="156.62.1.5" -hs="UP" -hsi="0" -lhs="UP" -lhsi="0" -hst="HARD" -ha="1" -mha="2" -hei="28902684" -lhei="28902647" -hpi="0" -lhpi="13975568" -hl="0.443" -het="0.015" -hd="0d 23h 33m 4s" -hds="84784" -hdt="0" -hpc="0.00" -hgn="VIEWGROUP-ISPROD" -hgns="VIEWGROUP-ISPROD,VIEWGROUP-ICT-PROD,VIEWGROUP-CYBER-PROD,VIEWGROUP-CYBER,SPONG-CYB-MON,OS-Linux-Slackware-15.0,OS-Linux-Slackware,OS-Linux,NAGIOS-IPD-SYNC" -lhc="1594870287" -lhsc="1594785564" -lhu="1594870287" -lhd="1594785564" -lhur="0" -ho="OK - 156.62.1.5 rta 1.020ms lost 0%" -lho="" -hpd="rta=1.020ms;3000.000;5000.000;0; pl=0%;80;100;0;100 rtmax=1.082ms;;;; rtmin=0.883ms;;;;" -s="Forced Failure" -sdn="Forced Failure" -ss="CRITICAL" -ssi="2" -lss="OK" -lssi="0" -sst="HARD" -sa="1" -msa="1" -siv="0" -sei="28908084" -lsei="28908074" -spi="13978104" -lspi="13978065" -sl="129.695" -set="0.090" -sd="0d 0h 0m 3s" -sds="3" -sdt="0" -spc="23.42" -sgn="$SERVICEGROUPNAME$" -sgns="" -lsch="1594870345" -lssc="1594870345" -lsok="1594870345" -lsw="0" -lsu="1594785683" -lsc="1594870230" -so="FAIL CRITICAL - Forced Failure!" -lso="" -snu="" -spd=""'
[1594870348.072373] [2048.1] [pid=10444] **** END MACRO PROCESSING *************
[1594870348.072442] [001.0] [pid=10444] process_macros_r()
[1594870348.072447] [2048.1] [pid=10444] **** BEGIN MACRO PROCESSING ***********
[1594870348.072451] [2048.1] [pid=10444] Processing: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;$SERVICESTATE$;notify-service-by-opsgenie-cyb2;$SERVICEOUTPUT$
[1594870348.072454] [2048.2] [pid=10444] Processing part: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;'
[1594870348.072459] [2048.2] [pid=10444] Not currently in macro. Running output (67): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;'
[1594870348.072462] [2048.2] [pid=10444] Processing part: 'SERVICESTATE'
[1594870348.072466] [2048.2] [pid=10444] macros[4] (SERVICESTATE) match.
[1594870348.072469] [2048.2] [pid=10444] Processed 'SERVICESTATE', Free: 0
[1594870348.072473] [2048.2] [pid=10444] Processed 'SERVICESTATE', Free: 0, Cleaning options: 0
[1594870348.072478] [2048.2] [pid=10444] Uncleaned macro. Running output (75): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL'
[1594870348.072481] [2048.2] [pid=10444] Just finished macro. Running output (75): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL'
[1594870348.072485] [2048.2] [pid=10444] Processing part: ';notify-service-by-opsgenie-cyb2;'
[1594870348.072489] [2048.2] [pid=10444] Not currently in macro. Running output (108): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;'
[1594870348.072492] [2048.2] [pid=10444] Processing part: 'SERVICEOUTPUT'
[1594870348.072501] [2048.2] [pid=10444] macros[17] (SERVICEOUTPUT) match.
[1594870348.072504] [2048.2] [pid=10444] Processed 'SERVICEOUTPUT', Free: 0
[1594870348.072508] [2048.2] [pid=10444] Processed 'SERVICEOUTPUT', Free: 0, Cleaning options: 0
[1594870348.072513] [2048.2] [pid=10444] Uncleaned macro. Running output (139): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!'
[1594870348.072517] [2048.2] [pid=10444] Just finished macro. Running output (139): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!'
[1594870348.072520] [2048.2] [pid=10444] Processing part: '
[1594870348.072524] [2048.2] [pid=10444] Not currently in macro. Running output (140): 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!
[1594870348.072527] [2048.1] [pid=10444] Done. Final output: 'SERVICE NOTIFICATION: itpager-cyb;x-files.aut.ac.nz;Forced Failure;CRITICAL;notify-service-by-opsgenie-cyb2;FAIL CRITICAL - Forced Failure!
[1594870348.072531] [2048.1] [pid=10444] **** END MACRO PROCESSING *************
[1594870348.072544] [064.1] [pid=10444] Making callbacks (type 2)...
[1594870348.072840] [064.2] [pid=10444] Callback #1 (type 2) return code = 0
[1594870348.072846] [001.0] [pid=10444] clear_volatile_macros_r()
[1594870348.072851] [064.2] [pid=10444] Callback #1 (type 21) return code = 206
[1594870348.072856] [064.1] [pid=10444] Making callbacks (type 21)...
[1594870348.072859] [001.0] [pid=10444] clear_volatile_macros_r()
[1594870348.072864] [001.0] [pid=10444] get_raw_command_line_r()
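A log like the one above is easier to read when narrowed to one debug category. The sketch below assumes that the second bracketed field is level.verbosity and that the numbers follow the usual Nagios Core debug_level bit values (to my understanding 32 is notifications, 64 is the event broker and 2048 is macro processing). The function name filter_debug_level is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print only the debug-log lines whose level matches.
# Expected line shape (taken from the log above):
#   [timestamp] [LEVEL.VERBOSITY] [pid=N] message
filter_debug_level() {
    local level="$1" file="$2"
    awk -v lvl="$level" '
        {
            field = $2                     # e.g. "[064.1]"
            gsub(/^\[|\]$/, "", field)     # strip the brackets -> "064.1"
            split(field, parts, ".")       # parts[1] is the level, e.g. "064"
            if (parts[1] + 0 == lvl + 0) print
        }
    ' "$file"
}
```

Called as filter_debug_level 32 followed by the path to the debug file, it would show only notification-level lines, if any were written. If the bit values assumed here are right, the "Making callbacks" lines are tagged 064 and so would be event broker callbacks rather than the notification command itself.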
Re: Debugging Host/Service notifications
OK, we finally bit the bullet and restarted the main Nagios XI server, and this has resolved our issues. All I can guess is that some updates left something not working quite right.
The pain was not seeing any errors or even details for the notifications when they were run, so I am still lost as to how to see those.
But all in all my issue has been resolved, as I have got the notifications working now.
you can close this ticket.
thanks
--Aaron
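On the open question of how to see what a notification command does when it runs, one possible approach is to wrap the command so that the user, PATH, arguments, output and exit code are written to a file. The following is a minimal bash sketch under stated assumptions: log_and_run and the log file path are made up for illustration and are not part of Nagios XI or the Opsgenie integration.

```shell
#!/bin/bash
# Hypothetical wrapper function: record who ran a command, with which PATH
# and arguments, then append its stdout/stderr and exit code to a log file.
log_and_run() {
    local logfile="$1"
    shift
    {
        printf '=== %s user=%s cwd=%s\n' "$(date '+%F %T')" "$(id -un)" "$PWD"
        printf 'PATH=%s\n' "$PATH"
        printf 'command:'
        printf ' %q' "$@"      # shell-quoted, so it can be pasted back into a shell
        printf '\n'
    } >> "$logfile"

    local rc=0
    # Run the real command, capturing both output streams in the log
    "$@" >> "$logfile" 2>&1 || rc=$?
    printf 'exit code: %s\n' "$rc" >> "$logfile"
    return "$rc"
}
```

A script built around this could be placed in front of the real handler in the notification command, for instance log_and_run /tmp/notify-debug.log followed by the s2o command line, where /tmp/notify-debug.log is an assumed location the nagios user can write to. If nothing is appended at all when a notification fires, that would suggest the command is never being executed rather than failing.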
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Debugging Host/Service notifications
acheesem wrote: Ok, we finally bit the bucket and restarted the main nagiosxi server. And this has resolved our issues.
Glad to hear it is resolved!
Locking thread