I am trying to figure out the best way to configure my alerting workflow. For example, on an individual host I would like 24x7 alerts for a ping check, so that I am alerted at any hour if the server is down. However, I would like the Disk Usage checks for C: and D: to alert only during work hours. This is easy enough to do on one host, but I have a lot of servers and want to create a template or group where I can set these settings once and apply them to the hosts. My question is: do I have to set this up for each individual service, or is there a way to create a service group that will do this? That is how it should work in my head. However, under Service Group Management I only see options for service group name, members (host services), and service group members. I can go in and click on the services from each host to accomplish this, but that is not really scalable.
I am using NagiosXI 2011 R2.3.
Any help would be appreciated.
alert workflow
Re: alert workflow
This is essentially what templates are for. You can define these configuration directives in your template, and then leave them blank in the service definition, and the service will inherit whatever values are in the template.
You can also create a single service definition, apply it to a list of hosts, or even apply it to a service group.
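As a rough illustration of the two ideas above (inheriting from a template, and one service definition applied to several hosts), the underlying Nagios object configuration might look something like the fragment below. The template name, host names, and check command are hypothetical, the `workhours` timeperiod is assumed to exist, and other required directives (check intervals, contacts, and so on) are omitted for brevity.

```
# Hypothetical template. "register 0" marks it as a template only,
# so Nagios never schedules it as a real service.
define service {
    name                    example_disk_template   ; hypothetical template name
    notification_period     workhours               ; assumes a timeperiod named "workhours"
    register                0
}

# One service definition applied to a list of hosts.
# notification_period is left out, so it is inherited from the template.
define service {
    use                     example_disk_template
    host_name               server1,server2         ; hypothetical host names
    service_description     Drive C: Disk Usage
    check_command           check_example_disk      ; hypothetical command name
}
```

In the CCM the same thing is done through the web forms; the fragment only shows what leaving a field blank at the service level means in practice.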
Re: alert workflow
I guess that is where my issue lies. I am not exactly sure how they need to be nested. Do you create services and apply them to servicegroups and then to hosts? Or servicegroups to services and then to hostgroups? I am trying to get that workflow figured out.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: alert workflow
Take the servicegroups and hostgroups out of the equation for a minute.
If you want, for example, these settings to apply to all of the C: and D: drive services, look at them in
Configure -> CCM -> Services -> Modify
You will see that they already have the following template applied:
xiwizard_windowsserver_nsclient_service
Now, anything that is applied to this template will be applied to all services that use the template (unless the specific service overrides the value).
So, if you go to
Configure -> CCM -> templates -> xiwizard_windowsserver_nsclient_service -> Modify
and make changes there to the Notification period, they will be applied to all of the services that use this template and do not have a different value specified at the service level.
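To make the override rule concrete, here is a sketch of how the result might look in the written-out configuration after the template's notification period is changed. It is a simplified fragment, not the actual generated file: the host name `example.net` is a placeholder, `workhours` and `24x7` are assumed timeperiod names, and all other directives of the real template are left out.

```
# The wizard template with the notification period changed (other directives omitted).
define service {
    name                    xiwizard_windowsserver_nsclient_service
    notification_period     workhours   ; assumed timeperiod name
    register                0
}

# No notification_period here, so this service inherits "workhours".
define service {
    use                     xiwizard_windowsserver_nsclient_service
    host_name               example.net             ; placeholder host
    service_description     Drive C: Disk Usage
}

# A value set at the service level overrides the template value.
define service {
    use                     xiwizard_windowsserver_nsclient_service
    host_name               example.net             ; placeholder host
    service_description     Memory Usage
    notification_period     24x7        ; assumed timeperiod name
}
```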
Re: alert workflow
That is exactly the answer I was looking for. I have seen those templates but was unaware that they would work that way. That is exactly how I need to manage my notification times. I was obviously way overthinking it. So I assume that I could then build escalations off of that template?
Re: alert workflow
As soon as I modify that template, I get this:
Warning: Duplicate definition found for service 'Drive C: Disk Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/########.net.cfg', starting on line 31)
Warning: Duplicate definition found for service 'CPU Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/#######.net.cfg', starting on line 14)
Warning: Duplicate definition found for service 'Uptime' on host ####.net' (config file '/usr/local/nagios/etc/services/#####.net.cfg', starting on line 109)
Warning: Duplicate definition found for service 'Memory Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/####.net.cfg', starting on line 62)
Warning: Duplicate definition found for service 'Drive D: Disk Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/####.net.cfg', starting on line 46)
It also makes Nagios freak out and start sending TONS of alerts.
Re: alert workflow
I think I figured it out. I do not need to add any hosts when altering this template, just the check and alert times.
Re: alert workflow
Alright. I have figured that whole thing out. But unfortunately, there are a lot of services that use this template. Some of those services I want to monitor 24x7. How do I break those out without having to do it by hand? Do I just create a new template and add it?
-
scottwilkerson
Re: alert workflow
krdj wrote: Do I just create a new template and add it?
Yep. If you create a new template, you can make only the specific changes you want and then add it to the services that should use the restricted hours.
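One way this could look in the object configuration is sketched below. The template name `example_workhours_service` and the host name are hypothetical, and `workhours` is an assumed timeperiod name. When a service lists more than one template in `use`, Nagios takes the value from the first template that defines it, so the small template is listed first.

```
# Hypothetical template that changes only the notification period.
define service {
    name                    example_workhours_service   ; hypothetical template name
    notification_period     workhours                   ; assumed timeperiod name
    register                0
}

# The new template is listed first so its notification_period wins;
# everything else still comes from the wizard template.
define service {
    use                     example_workhours_service,xiwizard_windowsserver_nsclient_service
    host_name               example.net                 ; placeholder host
    service_description     Drive D: Disk Usage
}
```

Services that should stay on 24x7 simply keep using only the original template.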
Re: alert workflow
I understand that. Here is my confusion.
C: D: monitor WORKHOURS
Uptime monitor WORKHOURS
Active Directory Domain Services monitor 24x7
Ping monitor 24x7
There are more but you see what the end goal is. The problem is they all use the xiwizard_windowsserver_nsclient_service. I understand that I can create a secondary template and apply it to the different services but this is where I lose it. How is this scalable? I would have to go into each service I want to only monitor WORKHOURS and add the secondary template?
I apologize if I am completely missing it here; the lightbulb has just not come on yet.