No email notification when device goes off with no warning

Posted: Tue Mar 08, 2022 9:55 am
by childebrecht
Good morning. I work for a small ISP and we use Nagios to monitor numerous devices. Everything works correctly except for some remote cable modems that we monitor with the commands below.
# 'check_inverter_status' command definition
define command{
command_name check_inverter_status
command_line $USER1$/check_snmp -H $HOSTADDRESS$ -C Motorola -o .1.3.6.1.4.1.5591.1.4.2.1.24.1 -w @5:5 -c @2:2 -l "Inverter Status"
}

# 'check_batt_string' command definition
define command{
command_name check_batt_string
command_line $USER1$/check_snmp -H $HOSTADDRESS$ -C Motorola -o .1.3.6.1.4.1.5591.1.4.2.1.28.1 -w $ARG1$ -c $ARG2$ -l "Battery Voltage" -u "volts"
}
If the inverter status or battery string changes, we receive an email alert as expected. However, if a device goes offline immediately, without entering a warning state first, we do not receive an alert.

define service{
use generic-service ; Name of service template to use
hostgroup_name ps-south-side,ps-north-side,ps-jaco,ps-wash-co,ps-mason-co
service_description inverter status
check_command check_inverter_status
contact_groups admins,pagers,art,pushover ; Notifications get sent out to everyone in these contact groups
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_interval 1 ; Re-check the service every minute until its final/hard state is determined
notification_interval 60 ; Re-notify about service problems every hour
notification_period 24x7
}

define service{
use generic-service ; Name of service template to use
hostgroup_name ps-south-side,ps-north-side,ps-jaco,ps-wash-co,ps-mason-co
service_description battery strings
check_command check_batt_string!@3501:3799!@0:3500
contact_groups admins,pagers,art,pushover ; Notifications get sent out to everyone in these contact groups
check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_interval 1 ; Re-check the service every minute until its final/hard state is determined
notification_interval 60 ; Re-notify about service problems every hour
notification_options u,c,r ; Send notifications about unknown, critical, and recovery events (no warning notifications)
notification_period 24x7
}
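
For anyone reading the thresholds above: as far as I understand the Nagios plugin range syntax, a leading @ means "alert when the value is inside the range (endpoints included)", and without it the alert fires when the value is outside the range. Here is a minimal Python sketch of that rule. The function names (parse_range, alerts, state) are my own and hypothetical; this is not the check_snmp source, only an illustration of how a returned value would be classified.

```python
import math


def parse_range(spec):
    """Parse a plugin threshold range such as '10', '10:', '~:10' or '@5:5'.

    Returns (inside, low, high), where inside is True for '@' ranges.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
    else:
        # A bare number N is shorthand for the range 0:N.
        low_text, high_text = "0", spec
    if low_text == "~":
        low = -math.inf          # '~' means negative infinity
    else:
        low = float(low_text) if low_text else 0.0
    high = float(high_text) if high_text else math.inf
    return inside, low, high


def alerts(spec, value):
    """Return True if value triggers an alert for this range."""
    inside, low, high = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range


def state(value, warn, crit):
    """Classify a returned value; critical takes precedence over warning."""
    if alerts(crit, value):
        return "CRITICAL"
    if alerts(warn, value):
        return "WARNING"
    return "OK"


# Inverter status thresholds from check_inverter_status: -w @5:5 -c @2:2
print(state(5, "@5:5", "@2:2"))
# Battery thresholds from the service: check_batt_string!@3501:3799!@0:3500
print(state(3600, "@3501:3799", "@0:3500"))
```

Note that this only describes how a value that was actually returned gets classified. It says nothing about what happens when the device returns no value at all, which is the case we are asking about.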


define host{
use generic-ps-tp ; Inherit default values from a template
host_name XXXXXX ; The name we're giving to this switch
alias XXXXXXX ; A longer name associated with the switch
address hostname ; IP address of the switch
hostgroups ps-wash-co ; Host groups this switch is associated with
}

Is there another command we should use to detect when the devices that use these services go offline immediately?

Thank you for any assistance you can give.

Chris