Nnnnoooooooo, Google Chrome crashed as I was hitting submit on this the first time. That sucks. Here it goes again! This time I'm writing this in OneNote and will paste it in.
Background information: I'm a very experienced Windows sysadmin and network admin. I like to dip my toes in Linux from time to time, but I am not nearly as experienced in it as I am in the other aspects of IT. I've set up a RHEL server on AWS to serve as our external monitoring and alert system, so that we still get alerted when our major network services (email, internet, etc.) are down and thus cannot send us alerts themselves. I've configured the latest (as of last week) Nagios Core with Pushover alerts using https://github.com/jedda/OSX-Monitoring ... ushover.sh
Here's what's happening: The PING service check reaches a HARD failure, but no alerts are getting sent out. I've tried with email alerts and Pushover alerts, and neither works. Neither even shows up in the logs, which makes me think that I've got something misconfigured in the hosts.cfg file or contacts.cfg file.
nagios.log :: I changed the address in hosts.cfg from a reachable IP to an unreachable IP, restarted the nagios service, and then clicked "Re-schedule the next check of this service". This is the only way I've found to simulate a failure. Please let me know if there's an easier way to simulate failures!
[1441166679] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;thisisatestthingy;PING;1441166675
[1441166684] SERVICE ALERT: thisisatestthingy;PING;CRITICAL;SOFT;1;CRITICAL - Network Unreachable (<unreachable_IP>)
[1441166749] SERVICE ALERT: thisisatestthingy;PING;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
[1441166799] SERVICE ALERT: thisisatestthingy;PING;CRITICAL;SOFT;3;CRITICAL - Network Unreachable (<unreachable_IP>)
[1441166859] SERVICE ALERT: thisisatestthingy;PING;CRITICAL;HARD;4;CRITICAL - Network Unreachable (<unreachable_IP>)
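For reading log lines like these, a small Python sketch can convert the epoch timestamps and pull out the state type and attempt number. The field order (host;service;state;state type;attempt;output) is taken from the lines above; the function name parse_alert is made up for illustration and is not part of Nagios.

```python
import re
from datetime import datetime, timezone

# Matches lines of the form:
# [epoch] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]*);([^;]*);([^;]*);([^;]*);(\d+);(.*)$"
)

def parse_alert(line):
    """Return the fields of a SERVICE ALERT line as a dict, or None if no match."""
    m = ALERT.match(line.strip())
    if m is None:
        return None
    ts, host, service, state, state_type, attempt, output = m.groups()
    return {
        "time": datetime.fromtimestamp(int(ts), tz=timezone.utc),
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,  # SOFT or HARD
        "attempt": int(attempt),
        "output": output,
    }

if __name__ == "__main__":
    sample = ("[1441166859] SERVICE ALERT: thisisatestthingy;PING;CRITICAL;"
              "HARD;4;CRITICAL - Network Unreachable (<unreachable_IP>)")
    alert = parse_alert(sample)
    print(alert["time"].isoformat(), alert["state_type"], alert["attempt"])
```

Run against the four lines above, this shows the service going HARD on attempt 4, which matches max_check_attempts 4 in the service definition. A SERVICE NOTIFICATION line would normally be expected after that point, and none appears.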
nagios.log :: Sending a custom notification for the same service does go out:
[1441208968] EXTERNAL COMMAND: SEND_CUSTOM_SVC_NOTIFICATION;thisisatestthingy;PING;0;nagiosadmin;jkl
[1441208968] SERVICE NOTIFICATION: admin1;thisisatestthingy;PING;CUSTOM (CRITICAL);notify-service-by-pushover;CRITICAL - Network Unreachable (<unreachable_IP>);nagiosadmin;jkl
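On the earlier question of an easier way to simulate a failure: one option that may be worth trying is to submit a passive check result through the external command file instead of editing the host address. This is only a sketch. The command file path is an assumed default for a source install (check command_file in nagios.cfg), the function names are hypothetical, and the service has to accept passive checks for this to have any effect.

```python
import time

# ASSUMPTION: default location for a source install; see command_file in nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

def passive_result(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows plugin conventions: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
    """
    ts = int(time.time() if now is None else now)
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}\n")

def submit(line, command_file=COMMAND_FILE):
    """Write one command to the Nagios command pipe (needs write permission)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)

if __name__ == "__main__":
    # Only prints the command; call submit(...) on the Nagios server to send it.
    print(passive_result("thisisatestthingy", "PING", 2, "simulated failure"), end="")
```

The next scheduled active check will overwrite the simulated state, and whether the result is treated as SOFT or HARD may depend on the configuration, so treat this as a starting point for testing rather than a drop-in replacement for a real outage.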
hosts.cfg
define host{
use my-template
address <unreachable_IP>
host_name thisisatestthingy
contact_groups admins
}
define service{
use local-service ; Name of service template to use
host_name thisisatestthingy
service_description PING
check_command check_ping!100.0,20%!500.0,60%
notifications_enabled 1
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
flap_detection_enabled 0
flap_detection_options o,w,c,u
}
templates.cfg
define host{
name my-template ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
notification_options d,r
contact_groups admins
max_check_attempts 99999
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
contacts.cfg :: I have tried with email and also with Pushover. Currently Pushover is set as the active alert type.
define contact{
contact_name admin1
alias admin1 pushover notifications
service_notification_period 24x7
host_notification_period 24x7
host_notification_options d,u,r,f,s,n
service_notification_options w,u,c,r,f,s,n
service_notification_commands notify-service-by-pushover
; service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-pushover
; host_notification_commands notify-host-by-email
_pushover_userkey <pushover_user_key>
_pushover_appkey <pushover_app_key>
_pushover_device admin1-iphone
email <admin_email>
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members admin1
}
commands.cfg :: This file also contains the default email notification commands, and I've verified that the server can send emails just fine. Yes, I know that I saved the Pushover script with dashes instead of underscores, but I highly doubt that's the issue; again, the alerts go out when you tell it to send a custom notification.
define command{
command_name notify-host-by-pushover
command_line /usr/local/nagios/plugins/notify-by-pushover.sh -u "$_CONTACTPUSHOVER_USERKEY$" -a "$_CONTACTPUSHOVER_APPKEY$" -s "spacealarm" -t "$HOSTNAME$ is $HOSTSTATE$" -m "Status: $HOSTOUTPUT$"
}
define command{
command_name notify-service-by-pushover
command_line /usr/local/nagios/plugins/notify-by-pushover.sh -u "$_CONTACTPUSHOVER_USERKEY$" -a "$_CONTACTPUSHOVER_APPKEY$" -s "spacealarm" -t "$SERVICEDESC$ on $HOSTNAME$ is $SERVICESTATE$" -m "Status: $SERVICEOUTPUT$"
}
Thank you for taking the time to read this lengthy post. Hopefully I've provided enough information for this question to be answered without needing anything additional.
Sincerely,
A Windows admin working out of his element
Issue solved. I needed to remove the "n" flag from service_notification_options and host_notification_options in the contact definition.
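For context, the Nagios object documentation describes n as "none": a contact with n in its notification options does not receive notifications of that type, which matches what happened here even though other flags were listed next to it. A rough Python sketch for spotting this in a contacts file follows. The parsing is deliberately naive, and the function name and the trimmed sample text are assumptions for illustration.

```python
import re

OPTION_KEYS = ("host_notification_options", "service_notification_options")

# Trimmed copy of the contact definition from the original post.
SAMPLE = """define contact{
contact_name admin1
host_notification_options d,u,r,f,s,n
service_notification_options w,u,c,r,f,s,n
}
define contactgroup{
contactgroup_name admins
members admin1
}"""

def contacts_with_none_flag(cfg_text):
    """Return (contact_name, directive, flags) for each options line containing 'n'."""
    findings = []
    # Only 'define contact{...}' blocks; 'define contactgroup{' does not match.
    for block in re.findall(r"define\s+contact\s*\{(.*?)\}", cfg_text, re.DOTALL):
        directives = {}
        for raw in block.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop ';' comments
            parts = line.split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        for key in OPTION_KEYS:
            flags = [f.strip() for f in directives.get(key, "").split(",") if f.strip()]
            if "n" in flags:
                findings.append((directives.get("contact_name", "?"), key, flags))
    return findings

if __name__ == "__main__":
    for name, key, flags in contacts_with_none_flag(SAMPLE):
        print(f"{name}: {key} contains 'n' alongside {','.join(flags)}")
```

With the n removed, the two lines would read host_notification_options d,u,r,f,s and service_notification_options w,u,c,r,f,s.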