alert workflow

krdj
Posts: 7
Joined: Wed May 09, 2012 12:42 pm

alert workflow

Post by krdj »

I am trying to figure out the best way to configure my alerting workflow. For example, on an individual host I would like to have 24x7 alerts for a ping check, so that I am alerted at any hour if the server is down. However, I would like the Disk Usage checks for the C: and D: drives to alert only during work hours. This is all easy enough to do, except that I have a lot of servers and want to create a template or group where I can set these settings once and apply them to the hosts.

My question is: do I have to set this up for each individual service, or is there a way to create a service group that will do this? That is how it should work in my head. However, under Service Group Management, I only see options for service group name, members (host services), and service group members. I can go in and click on the services from each host to accomplish this, but that is not really scalable.

I am using Nagios XI 2011 R2.3.

Any help would be appreciated.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: alert workflow

Post by mguthrie »

This is essentially what templates are for. You can define these configuration directives in your template, and then leave them blank in the service definition, and the service will inherit whatever values are in the template.

You can also create a single service definition and apply it to a list of hosts, or even to a service group.
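
As a rough sketch of that inheritance in plain Nagios object configuration (roughly what the CCM writes out for you): the template name, host group name, check command, contact group, and the numeric values are all hypothetical, and the `24x7` and `workhours` time periods are assumed to exist on your system. The service is attached to many hosts at once through `hostgroup_name`, which is one way to apply a single definition widely.

```
# Hypothetical template. register 0 marks it as a template only,
# so Nagios never schedules it as a real service.
define service {
    name                    workhours-notify-service   ; hypothetical template name
    check_period            24x7                       ; assumed time period
    notification_period     workhours                  ; assumed time period
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    notification_interval   60
    contact_groups          admins                     ; hypothetical contact group
    register                0
}

# One service definition applied to every host in a host group.
# notification_period is left out here, so it is inherited from the template.
define service {
    use                     workhours-notify-service
    hostgroup_name          windows-servers            ; hypothetical host group
    service_description     Drive C: Disk Usage
    check_command           check_drive_c              ; hypothetical command
}
```

Any directive set directly in the service definition would take precedence over the value from the template.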
krdj

Re: alert workflow

Post by krdj »

I guess that is where my issue lies. I am not exactly sure how they need to be nested. Do you create services and apply them to servicegroups and then to hosts? Or servicegroups to services and then to hostgroups? I am trying to get that workflow figured out.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: alert workflow

Post by scottwilkerson »

Take the servicegroups and hostgroups out of the equation for a minute.

If you want these settings to apply, for example, to all of the C: and D: drive services, look at them in
Configure -> CCM -> Services -> Modify

You will see that they already have the following template applied:

xiwizard_windowsserver_nsclient_service

Now, anything that is set in this template will be applied to all services that use the template (unless the specific service overrides the value).

So, if you go to
Configure -> CCM -> Templates -> xiwizard_windowsserver_nsclient_service -> Modify

and change the Notification period there, the change will be applied to all of the services that use this template and do not have a different value specified at the service level.
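
In configuration-file terms, the override rule looks roughly like the fragment below. This is only a sketch: the host name is hypothetical, `workhours` and `24x7` are assumed time period names, and the real template in XI carries many more directives than the one shown, so these definitions are not complete on their own.

```
# Only the directive under discussion is shown; the rest of the
# real template is omitted from this fragment.
define service {
    name                    xiwizard_windowsserver_nsclient_service
    notification_period     workhours      ; assumed time period name
    register                0
}

# No notification_period here, so this service inherits workhours.
define service {
    use                     xiwizard_windowsserver_nsclient_service
    host_name               winserver01    ; hypothetical host
    service_description     Drive C: Disk Usage
}

# A value set at the service level wins over the template.
define service {
    use                     xiwizard_windowsserver_nsclient_service
    host_name               winserver01    ; hypothetical host
    service_description     Ping
    notification_period     24x7           ; assumed time period name
}
```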
krdj

Re: alert workflow

Post by krdj »

That is exactly the answer I was looking for. I have seen those templates but was unaware that they would work that way. That is exactly how I need to manage my notification times. I was obviously way overthinking it. So I would assume that I could then build escalations off of that template?
krdj

Re: alert workflow

Post by krdj »

As soon as I modify that template, I get this:

Warning: Duplicate definition found for service 'Drive C: Disk Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/########.net.cfg', starting on line 31)
Warning: Duplicate definition found for service 'CPU Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/#######.net.cfg', starting on line 14)
Warning: Duplicate definition found for service 'Uptime' on host ####.net' (config file '/usr/local/nagios/etc/services/#####.net.cfg', starting on line 109)
Warning: Duplicate definition found for service 'Memory Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/####.net.cfg', starting on line 62)
Warning: Duplicate definition found for service 'Drive D: Disk Usage' on host ####.net' (config file '/usr/local/nagios/etc/services/####.net.cfg', starting on line 46)


It also makes Nagios freak out and start sending TONS of alerts.
krdj

Re: alert workflow

Post by krdj »

I think I figured it out. I do not need to add any hosts when altering this template, just the check and alert times.
krdj

Re: alert workflow

Post by krdj »

Alright. I have figured that whole thing out. But unfortunately, there are a lot of services that use this template, and some of those services I want to monitor 24x7. How do I break those out without having to do it by hand? Do I just create a new template and add it?
scottwilkerson

Re: alert workflow

Post by scottwilkerson »

krdj wrote: Do I just create a new template and add it?
Yep. If you create a new template, you can make only the specific changes you want, then add it to the services that should use the restricted hours.
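
A minimal sketch of that approach, assuming the wizard template is left at its 24x7 setting: the secondary template name and the host name are hypothetical, and `workhours` is an assumed time period name. When a service lists several templates in `use`, Nagios takes each directive from the first template that defines it, so the new template goes first.

```
# Hypothetical secondary template that holds only the restricted hours.
define service {
    name                    windows_workhours_service   ; hypothetical name
    notification_period     workhours                   ; assumed time period
    register                0
}

# The new template is listed first so its notification_period wins.
# Everything else still comes from the wizard template.
define service {
    use                     windows_workhours_service,xiwizard_windowsserver_nsclient_service
    host_name               winserver01                 ; hypothetical host
    service_description     Drive D: Disk Usage
}
```

Services that should stay on 24x7 keep using only the wizard template and need no change.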
krdj

Re: alert workflow

Post by krdj »

I understand that. Here is my confusion.

C: D: monitor WORKHOURS
Uptime monitor WORKHOURS

Active Directory Domain Services monitor 24x7
Ping monitor 24x7

There are more, but you see what the end goal is. The problem is that they all use the xiwizard_windowsserver_nsclient_service template. I understand that I can create a secondary template and apply it to the different services, but this is where I lose it. How is this scalable? Would I have to go into each service that I want to monitor only during WORKHOURS and add the secondary template?

I apologize if I am completely missing it here; the lightbulb has just not come on yet.