New notification command not working
The command works when I run it from the nagios user account but does not work when Nagios runs it. I see it fired off in the notifications log, but I am not getting the messages.
Here is the command:
/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | aws sns publish --topic-arn arn:aws:sns:us-west-2:024374954588:OnshoreitInternal --subject "Onshore IT Host Alert $HOSTNAME$ is $HOSTSTATE$" --message "$NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$"
If I run this from the command line under the nagios ID, it works. I have this tied to a contact (sns) AND a regular XI user (dflick), but it does not work on either.
Re: New notification command not working
Knowing the error messages that you're getting would help.
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
Re: New notification command not working
Are you using SMTP or Sendmail? Have you checked in mail logs for clues?
Code:
tail -100 /var/log/maillog
tail -100 /usr/local/nagiosxi/tmp/phpmailer.log
Re: New notification command not working
The notification is an SNS call to AWS, so no mail server is used. I installed the AWS CLI on the server, and I can run the command string below from the command line as the nagios user and it works. I never see an error pop up and I am not sure where to look for one. Here is the command I have attached to a user as the notification command:
/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | aws sns publish --topic-arn arn:aws:sns:us-west-2:024374954588:OnshoreitInternal --subject "Onshore IT Service Alert $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" --message "$NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"
I attached a screenshot showing that the command was sent but I have no idea where to find the logs to see why it did not go out.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: New notification command not working
This is out of scope for what Nagios XI includes, but have you verified that you can run the command from the CLI as the nagios user?
For example, if you replace all the macros, can you run this command?
Code:
su nagios -c '/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | aws sns publish --topic-arn arn:aws:sns:us-west-2:024374954588:OnshoreitInternal --subject "Onshore IT Service Alert $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" --message "$NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"'
We are not familiar with the AWS SNS service at all, but is that how you call it, by piping data AND setting a subject and message?
Or would you normally call this to add an item to the queue?
Code:
aws sns publish --topic-arn arn:aws:sns:us-west-2:024374954588:OnshoreitInternal --subject "Onshore IT Service Alert $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" --message "$NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"
Re: New notification command not working
I used the following guide to set it up.
https://chickencode.dreamwidth.org/2094.html
Both methods seem to work from the cli under the nagios ID. I am hoping you can give me a direction to look so I can figure out why it is being called but not working.
When you say "replace all of the macros" are the %HOSTNAME% variables or macros? I thought it was doing substitution only which is why I used them in the subject line and elsewhere.
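On the earlier question about piping data and also setting a subject and message: as far as I know, aws sns publish takes its body from the --message argument and does not read standard input, so the printf output piped into it is probably discarded. Below is a minimal sketch of passing the detailed body as --message instead. The function names, the AWS_CLI and TOPIC_ARN variables, and the /usr/bin/aws default are assumptions for illustration; they are not taken from the guide or from Nagios XI.

```shell
#!/bin/sh
# Sketch only: AWS_CLI, TOPIC_ARN and both function names are hypothetical.

# Build the multi-line body.
# Arguments: type, host, state, address, check output, date/time.
build_host_message() {
    printf '***** Nagios *****\n\nNotification Type: %s\nHost: %s\nState: %s\nAddress: %s\nInfo: %s\n\nDate/Time: %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# Publish the body as --message rather than piping it on stdin.
# The CLI path defaults to /usr/bin/aws, which is an assumed location.
publish_host_alert() {
    "${AWS_CLI:-/usr/bin/aws}" sns publish \
        --topic-arn "$TOPIC_ARN" \
        --subject "Host Alert $2 is $3" \
        --message "$(build_host_message "$@")"
}
```

In a Nagios command definition, the six arguments would be the macros, for example "$NOTIFICATIONTYPE$" "$HOSTNAME$" "$HOSTSTATE$" "$HOSTADDRESS$" "$HOSTOUTPUT$" "$LONGDATETIME$". Nagios substitutes their values into the command line before the shell ever sees it.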
scottwilkerson
Re: New notification command not working
Did you see this in the guide?
By default the user that runs the nagios service has its shell set to /bin/nologin but you can get around that by issuing as root
Code:
su - nagios -s /bin/bash
dfmco wrote: When you say "replace all of the macros" are the %HOSTNAME% variables or macros?
All the items like $SERVICESTATE$
What is the command you ran from the CLI that worked?
Are there errors in the nagios.log for your new command?
Code:
grep notify-host-by-sns /usr/local/nagios/var/nagios.log
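The entries that grep returns from nagios.log start with a Unix epoch timestamp in square brackets, which is awkward to line up with the time an alert should have arrived. Here is a small sketch for turning those into readable UTC times. It assumes GNU date (for the -d @epoch form), and the function name is made up for this example.

```shell
#!/bin/sh
# Sketch only: humanize_nagios_log is a hypothetical helper; requires GNU date.
# Reads nagios.log lines on stdin and rewrites a leading [epoch] as UTC time.
humanize_nagios_log() {
    while IFS= read -r line; do
        case $line in
            \[[0-9]*\]\ *)
                epoch=${line#\[}        # drop the leading "["
                epoch=${epoch%%\]*}     # keep everything before the first "]"
                rest=${line#*\] }       # text after the "] " separator
                case $epoch in
                    *[!0-9]*)
                        # Not a pure number: pass the line through untouched.
                        printf '%s\n' "$line"
                        ;;
                    *)
                        printf '[%s] %s\n' \
                            "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest"
                        ;;
                esac
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

For example: grep notify-host-by-sns /usr/local/nagios/var/nagios.log | humanize_nagios_log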
Re: New notification command not working
I apologize but I did not follow everything in your reply.
I did configure su - nagios -s /bin/bash. Is that an issue going forward?
So $SERVICESTATE$ is a placeholder for information, correct? I believe so as it is working now and it appears that $SERVICESTATE$ was replaced by the correct information which is why I assumed it was a variable replacement.
I ran the command from the CLI, but I was in the home directory, which is why it worked.
Here is the log, but it gives no detail as to why it failed (the command was not found due to a bad path):
[1527261210] HOST NOTIFICATION: dflick;CAECD CTECC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.52.4: rta nan, lost 100%
[1527261210] HOST NOTIFICATION: sns;CAECD CTECC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.52.4: rta nan, lost 100%
[1527261272] HOST NOTIFICATION: dflick;CAECD BUC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.53.100: rta nan, lost 100%
[1527261272] HOST NOTIFICATION: sns;CAECD BUC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.53.100: rta nan, lost 100%
[1527264813] HOST NOTIFICATION: dflick;CAECD CTECC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.52.4: rta nan, lost 100%
[1527264813] HOST NOTIFICATION: sns;CAECD CTECC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.52.4: rta nan, lost 100%
[1527264887] HOST NOTIFICATION: dflick;CAECD BUC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.53.100: rta nan, lost 100%
[1527264887] HOST NOTIFICATION: sns;CAECD BUC Nagios;DOWN;notify-host-by-sns;CRITICAL - 10.102.53.100: rta nan, lost 100%
[1527266583] HOST NOTIFICATION: dflick;CAECD CTECC Nagios;UP;notify-host-by-sns;OK - 10.102.52.4: rta 33.684ms, lost 0%
[1527266583] HOST NOTIFICATION: sns;CAECD CTECC Nagios;UP;notify-host-by-sns;OK - 10.102.52.4: rta 33.684ms, lost 0%
[1527266662] HOST NOTIFICATION: dflick;CAECD BUC Nagios;UP;notify-host-by-sns;OK - 10.102.53.100: rta 29.481ms, lost 0%
[1527266662] HOST NOTIFICATION: sns;CAECD BUC Nagios;UP;notify-host-by-sns;OK - 10.102.53.100: rta 29.481ms, lost 0%
[1527280427] HOST NOTIFICATION: dflick;www.onshoreit.net;DOWN;notify-host-by-sns;CRITICAL - 172.30.30.80: Host unreachable @ 172.30.30.140. rta nan, lost 100%
[1527280427] HOST NOTIFICATION: sns;www.onshoreit.net;DOWN;notify-host-by-sns;CRITICAL - 172.30.30.80: Host unreachable @ 172.30.30.140. rta nan, lost 100%
[1527280471] HOST NOTIFICATION: dflick;www.onshoreit.net;UP;notify-host-by-sns;OK - 172.30.30.80: rta 480.248ms, lost 40%
[1527280471] HOST NOTIFICATION: sns;www.onshoreit.net;UP;notify-host-by-sns;OK - 172.30.30.80: rta 480.248ms, lost 40%
I am not sure how I could have found that information in the logs, though. How could I have troubleshot this better?
Glad it is working now, but curious about how I could have done better.
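One way to get that detail, sketched under assumptions: the nagios.log entries above record that the notification was dispatched, not what the command printed. A wrapper that appends the command's own output and exit code to a file can surface errors such as "not found". The wrapper name and the log path in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: log_run and the log file location are hypothetical.
# Usage: log_run LOGFILE COMMAND [ARGS...]
log_run() {
    log_file=$1
    shift
    rc=0
    {
        printf '[%s] running: %s\n' "$(date +%s)" "$*"
        # Run the command; remember its exit status instead of aborting.
        "$@" || rc=$?
        printf '[%s] exit code: %s\n' "$(date +%s)" "$rc"
    } >>"$log_file" 2>&1    # stdout and stderr both go to the log
    return "$rc"
}
```

For example, log_run /tmp/notify-sns.log aws sns publish ... would leave the shell's error message and an exit code of 127 in the file if aws cannot be found. The log file has to be writable by the nagios user.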
scottwilkerson
Re: New notification command not working
dfmco wrote: I did configure su - nagios -s /bin/bash. Is that an issue going forward?
No, this shouldn't be an issue.
dfmco wrote: So $SERVICESTATE$ is a placeholder for information, correct?
That's correct. I think the whole problem you were having is that when Nagios runs a command it isn't in a directory, so it needs full paths to everything, which, based on your other thread, you discovered.
dfmco wrote: Not sure how I could have found that information in logs though. How could I have troubleshot better for this? Glad it is working now but curious on how I could have done better.
I think you did everything you could have. If you weren't in the nagios home directory when you tested the command, you would have found the issue right away.
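A rough way to repeat that test without depending on the nagios home directory or login environment is sketched below. It runs a command line from / with an emptied environment and a minimal PATH. The function name and the PATH value are assumptions, and the real environment Nagios runs commands in may differ.

```shell
#!/bin/sh
# Sketch only: run_like_daemon is a hypothetical helper, and the PATH below
# is an assumed minimal value, not what Nagios necessarily uses.
# Usage: run_like_daemon 'command line as it appears in the command definition'
run_like_daemon() {
    (
        cd / || exit 1
        # env -i clears every variable (including HOME), so anything that
        # relied on the login environment fails here the way it might
        # under a service.
        env -i PATH=/usr/bin:/bin /bin/sh -c "$1"
    )
}
```

For example, run_like_daemon 'aws --version' returns 127 (command not found) if aws is only reachable through a per-user PATH entry, while the same call with the full path to the binary should succeed. Because HOME is cleared as well, tools that read per-user configuration may also behave differently than in an interactive session.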
Re: New notification command not working
Thanks for the info. Please close the case as resolved. 