Setting up notifications for switches

Posted: Fri Dec 06, 2019 6:04 pm
by Alan
Hello,

I have added a few switches to Nagios and am monitoring uptime and ping on them. It is working, but I have a question: when I turn off a switch, Nagios does send me an email, but where is that configured?

I have servers set up and have added my contacts to contacts.cfg; in ncpa.cfg I then reference them with contacts and contact_groups. I don't see any reference to the switches in the contacts.cfg file. In the switch.cfg file I have a HOST GROUP DEFINITIONS section set up as:

Code: Select all

define hostgroup{
        hostgroup_name  switches                ; The name of the hostgroup
        alias           Network_Switches        ; Long name of the group
        }
In contacts.cfg I have it set up so that I get phone calls if certain servers go down and emails for warnings on disk space, etc. I am trying to set this up for the switches so that, if they go down, the same contact group I use for phone calls is notified. When I add the switches host group name to the contacts file under the group for calling, I get this error:

Code: Select all

Error: Could not find any contact matching 'switches' (config file '/usr/local/nagios/etc/objects/contacts.cfg', starting on line 114)
Error: Failed to expand contacts for contactgroup 'calls' (config file '/usr/local/nagios/etc/objects/contacts.cfg', starting at line 114)
   Error processing object config files!
I am not sure how to set up the contacts for the switches.

Re: Setting up notifications for switches

Posted: Fri Dec 06, 2019 6:23 pm
by Alan
Hello, I ended up finding where it was pulling from: the generic-switch template in templates.cfg.

Re: Setting up notifications for switches

Posted: Mon Dec 09, 2019 1:26 pm
by benjaminsmith
Hello Alan,

Can you post the host configuration object for the switches?

You can set the contacts or contact_groups there. You can also modify the template, but then every host using generic-switch would get those same contact settings.
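
As a minimal sketch of the host-level approach: the error in the first post arises because the members directive of a contact group expects contact names, while 'switches' is a host group. The sketch below assumes the calls contact group from the first post already exists in contacts.cfg; the alias text and the member name alan are assumptions for illustration, not taken from the actual file.

```cfg
# contacts.cfg: members must be contact names, never host group names
define contactgroup{
        contactgroup_name       calls
        alias                   On-call phone notifications     ; hypothetical alias
        members                 alan                            ; assumed contact name; 'switches' does not belong here
        }

# switch.cfg: point the host at the contact groups instead
define host{
        use             generic-switch          ; Inherit default values from a template
        host_name       SW-Accessories
        alias           SW-Accessories
        address         172.17.250.3
        hostgroups      switches
        contact_groups  admins,calls            ; A value set here replaces the one inherited from generic-switch
        }
```

Running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg afterwards should confirm that every contact and contact group name resolves.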

Re: Setting up notifications for switches

Posted: Wed Dec 11, 2019 11:50 am
by Alan
Here is the contents of the switch.cfg file.

Code: Select all

# Define the switch that we'll be monitoring

define host{
        use             generic-switch          ; Inherit default values from a template
        host_name       SW-Accessories         ; The name we're giving to this switch
        alias           SW-Accessories  ; A longer name associated with the switch
        address         172.17.250.3           ; IP address of the switch
        hostgroups      switches                ; Host groups this switch is associated with
        }

define host{     
        use             generic-switch
        host_name       SW-Data_Center
        alias           SW-Data_Center
        address         172.17.250.4
        hostgroups      switches
        }
		
define host{     
        use             generic-switch
        host_name       SW-Core_1
        alias           SW-Core_1
        address         172.17.250.1
        hostgroups      switches
        }
		
define host{     
        use             generic-switch
        host_name       SW-Core_2
        alias           SW-Core_2
        address         172.17.250.2
        hostgroups      switches
        }

		


# Create a new hostgroup for switches

define hostgroup{
        hostgroup_name  switches                ; The name of the hostgroup
        alias           Network_Switches        ; Long name of the group
        }

		


# Create a service to PING to switch

define service{
        use                     generic-service ; Inherit values from a template
        host_name               SW-Accessories ; The name of the host the service is associated with
        service_description     PING            ; The service description
        check_command           check_ping!200.0,20%!600.0,60%  ; The command used to monitor the service
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until its final/hard state is determined
        }

define service{
        use                     generic-service
        host_name               SW-Data_Center
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   5
        retry_check_interval    1
        }
		
define service{
        use                     generic-service
        host_name               SW-Core_1
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   5
        retry_check_interval    1
        }
		
define service{
        use                     generic-service
        host_name               SW-Core_2
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   5
        retry_check_interval    1
        }
		


# Monitor uptime via SNMP

define service {
    use                 generic-service ; Inherit values from a template
    host_name           SW-Accessories
    service_description Uptime
    check_command       check_snmp!-C public -o .1.3.6.1.2.1.1.3.0
}

define service {
    use                 generic-service ; Inherit values from a template
    host_name           SW-Data_Center
    service_description Uptime
    check_command       check_snmp!-C public -o .1.3.6.1.2.1.1.3.0
}

define service {
    use                 generic-service ; Inherit values from a template
    host_name           SW-Core_1
    service_description Uptime
    check_command       check_snmp!-C public -o .1.3.6.1.2.1.1.3.0
}

define service {
    use                 generic-service ; Inherit values from a template
    host_name           SW-Core_2
    service_description Uptime
    check_command       check_snmp!-C public -o .1.3.6.1.2.1.1.3.0
}
Would I just add either a contact or a contact group to the host definition under "Define the switch that we'll be monitoring", like below? Will Nagios then look up the groups or contacts in my contacts.cfg file, where I have already set everyone up?

Code: Select all

# Define the switch that we'll be monitoring

define host{
        use             generic-switch          ; Inherit default values from a template
        host_name       SW-Accessories         ; The name we're giving to this switch
        alias           SW-Accessories  ; A longer name associated with the switch
        address         172.17.250.3           ; IP address of the switch
        hostgroups      switches                ; Host groups this switch is associated with
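        contact_groups  admins                  ; Contact groups to notify for this switch
        contacts        alan                    ; Individual contacts (the directive is "contacts", not "contact")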
        }

Re: Setting up notifications for switches

Posted: Wed Dec 11, 2019 1:29 pm
by benjaminsmith
Hi Alan,

Can you post the objects.cache file so I can see the complete configurations created from the templates? I'm not able to find the contact related to the error below.
Error: Could not find any contact matching 'switches' (config file '/usr/local/nagios/etc/objects/contacts.cfg', starting on line 114)
Also, please run the following command to check your configurations again and post any error messages.

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg -v
Here's the full path to the objects.cache file. Thanks.

Code: Select all

/usr/local/nagios/var/objects.cache

Re: Setting up notifications for switches

Posted: Wed Dec 11, 2019 2:55 pm
by Alan
I don't get any errors when I run:

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg -v
I have not tried adding contact_groups or contacts to the switch.cfg file. Do you want me to add them and then run the following?

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg -v
This is the output of that:

Code: Select all

Running pre-flight check on configuration data...

Checking objects...
        Checked 74 services.
        Checked 29 hosts.
        Checked 4 host groups.
        Checked 0 service groups.
        Checked 7 contacts.
        Checked 5 contact groups.
        Checked 25 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 29 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0
I do get these warnings above it, though, and I am not sure what they are:

Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.

I have added the objects.cache file
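
Going only by the text of the warnings above, the two deprecated directive names map onto check_interval and retry_interval. A sketch of one of the PING services from switch.cfg with the names swapped and the values left unchanged (this is a reading of the warning messages, not a tested change):

```cfg
define service{
        use                     generic-service ; Inherit values from a template
        host_name               SW-Accessories
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        check_interval          5               ; Replaces the deprecated normal_check_interval
        retry_interval          1               ; Replaces the deprecated retry_check_interval
        }
```

The same rename would apply to every service definition that still uses the old names, since the pre-flight check prints the warnings once for each occurrence.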

Re: Setting up notifications for switches

Posted: Wed Dec 11, 2019 5:47 pm
by benjaminsmith
Hello Alan,

The configurations pass the check, and I reviewed the configurations in the objects.cache and they appear to be correct.

Code: Select all


define host {
	host_name	SW-Accessories
	alias	SW-Accessories
	address	<ip address>
	check_period	24x7
	check_command	check-host-alive
	contact_groups	admins
	notification_period	24x7
	initial_state	o
	importance	0
	check_interval	5.000000
	retry_interval	1.000000
	max_check_attempts	10
	active_checks_enabled	1
	passive_checks_enabled	1
	obsess	1
	event_handler_enabled	1
	low_flap_threshold	0.000000
	high_flap_threshold	0.000000
	flap_detection_enabled	1
	flap_detection_options	a
	freshness_threshold	0
	check_freshness	0
	notification_options	r,d
	notifications_enabled	1
	notification_interval	30.000000
	first_notification_delay	0.000000
	stalking_options	n
	process_perf_data	1
	retain_status_information	1
	retain_nonstatus_information	1
	}
Here's the default nagiosadmin contact that is a member of the admins group.

Code: Select all

define contact {
	contact_name	nagiosadmin
	alias	Nagios Admin
	service_notification_period	24x7
	host_notification_period	24x7
	service_notification_options	r,w,u,c,f,s
	host_notification_options	r,d,u,f,s
	service_notification_commands	notify-service-by-email
	host_notification_commands	notify-host-by-email
	email	[email protected]
	minimum_importance	0
	host_notifications_enabled	1
	service_notifications_enabled	1
	can_submit_commands	1
	retain_status_information	1
	retain_nonstatus_information	1
	}
At this point, the next steps to troubleshoot are:

1. Check the nagios log to make sure an alert was generated and a notification was sent by Nagios when the device was powered off.

Code: Select all

/usr/local/nagios/var/nagios.log
2. Check the maillog on the server to see if the notification was successfully sent/delivered.

Code: Select all

/var/log/maillog

Re: Setting up notifications for switches

Posted: Thu Dec 12, 2019 10:43 am
by Alan
I am getting email alerts, so that is all working. I was just trying to figure out where that was configured, and I found it in the templates.cfg file. Now I am wondering how to add groups to be contacted. I pay for a service that, when it receives an email, calls whoever is on call and reads the email to them. I needed to find where I could add that group, and I think that if I add it to the templates file it should work.

Then you mentioned above:

"Can you post the host configuration object for the switches?

You can set the contacts or contact_groups there. You can also modify the template, but then every host using generic-switch would get those same contact settings."

So I am just wondering which is the better way to set the contacts for this. Should I just comment out contact_groups in the generic-switch template in the templates file and then put the contact groups I need on each define host in the switch.cfg file? Sorry if I am creating confusion on anything.
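
One possible middle ground, sketched on the assumption that Nagios Core's additive inheritance of string values applies to this setup: leave contact_groups in the generic-switch template as it is, and add the calls group only on the hosts that need it by prefixing the value with a plus sign. The choice of SW-Core_1 is purely illustrative, and the other generic-switch directives are omitted from the template for brevity.

```cfg
# templates.cfg: the template keeps its default contact group
define host{
        name            generic-switch
        use             generic-host
        contact_groups  admins
        register        0               ; Template only, not a real host
        }

# switch.cfg: this host notifies admins (inherited) plus calls (added locally)
define host{
        use             generic-switch
        host_name       SW-Core_1
        alias           SW-Core_1
        address         172.17.250.1
        hostgroups      switches
        contact_groups  +calls          ; "+" appends to the inherited value instead of replacing it
        }
```

Without the plus sign, a contact_groups line on the host replaces the template's value outright, so the groups would have to be listed in full on each host.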

Re: Setting up notifications for switches

Posted: Thu Dec 12, 2019 1:32 pm
by Alan
One other thing I am trying to set up is how fast I am notified after a switch goes down. I have changed check_interval, retry_interval, and max_check_attempts to all kinds of different values to try to get Nagios to notify me within 30 seconds of a switch going down. Here is where I have it now:

Code: Select all

define host{
	name			    generic-switch	; The name of this host template
	use			    generic-host	; Inherit default values from the generic-host template
	check_period		    24x7		    ; By default, switches are monitored round the clock
	check_interval		    0.3				; Check every 0.3 unit intervals (18 seconds at the default interval_length of 60)
	retry_interval		    0.3				; Schedule host check retries at the same 18-second interval
	max_check_attempts	    1				; Treat the first failed check as a hard DOWN state (no retries)
	check_command		    check-host-alive	; Default command to check if routers are "alive"
	notification_period	    24x7			; Send notifications at any time
	notification_interval	    30				; Resend notifications every 30 minutes
	notification_options	    d,r,u				; Only send notifications for specific host states
	contact_groups		    admins, calls		; Notifications get sent to the admins and calls contact groups
	register		    0				; DONT REGISTER THIS - ITS JUST A TEMPLATE
	}
This seems to take about 45 seconds to notify me. Does it hurt anything to have these values this low?

Re: Setting up notifications for switches

Posted: Thu Dec 12, 2019 2:51 pm
by benjaminsmith
Hi Alan,

Good to know the notifications are working for you. Generally, we wouldn't recommend going below 1 minute, as network congestion, timeouts, or other lags may lead to false positives (you'll receive notifications when nothing is wrong).

Also, I tested fractional units on my Nagios XI server, and the CCM will just round them to an integer value. If you want to go below one minute, you should change the interval length in the main configuration file (/usr/local/nagios/etc/nagios.cfg).
Format: interval_length=<seconds>
Example: interval_length=60
This is the number of seconds per "unit interval" used for timing in the scheduling queue, re-notifications, etc. "Unit intervals" are used in the object configuration file to determine how often to run a service check, how often to re-notify a contact, etc.

Important: The default value for this is set to 60, which means that a "unit value" of 1 in the object configuration file will mean 60 seconds (1 minute). I have not really tested other values for this variable, so proceed at your own risk if you decide to do so!
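
To make the unit arithmetic concrete, here is a rough sketch with the default interval_length. The numbers in the comments are plain multiplication (units x interval_length); the interval values are illustrative rather than a recommendation, and the other generic-switch directives are omitted.

```cfg
# /usr/local/nagios/etc/nagios.cfg (main configuration file)
# Default: one unit interval is 60 seconds.
interval_length=60

# templates.cfg (object configuration): timing values are in unit intervals
define host{
        name                    generic-switch
        use                     generic-host
        check_interval          1       ; 1 x 60 s = 60 s between checks (0.3 would be 0.3 x 60 s = 18 s)
        retry_interval          1       ; 1 x 60 s = 60 s between retries
        notification_interval   30      ; 30 x 60 s = 30 minutes between re-notifications
        register                0       ; Template only, not a real host
        }
```

Because interval_length is a global setting, lowering it would rescale every interval in the object configuration, including service checks and re-notification intervals, so each of those values would need to be revisited.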