Understanding notifications
Posted: Sat Jul 11, 2015 10:02 pm
I am relatively new to Nagios, but I have a working system to learn from. It monitors more than 30,000 devices. The person in charge of the Nagios system has departed, and until a replacement is found I am attempting to fill the gaps. The version is Nagios® Core™ Version 4.0.8.
All I want is that when one particular device goes down hard, a notification is sent to one particular person.
I have defined the person in the contact.cfg file.
And I have found the service that triggers the event, called devPing (I think).
I have also found the host.
define host {
# Site: ViaWest DataCenter - CIS Cage
use critical-router-XDS2 ; From: Host: XDS2-RTR-01
host_name XDS2-RTR-01
alias XDS2-DataSideCIS (XDS2-RTR-01)
display_name XDS2-DataSideCIS (XDS2-RTR-01)
address 10.X.X.X
;_device_id 4411
_community_string XXXXXXXXXX
_site_code XDS2
;_type Router.Ent
notes_url http://ns-cacti-1.ad.abcs.net/tools/vie ... DS2-RTR-01
_site_alias XDS2-DataSideCIS
parents XDS1-RTR-01, XDS2-RTR-02
hostgroups +Connection.Other, DataCenter.DataSideCIS, gLink.10_X_X_X-XDS2-XDS2-v, HW.Foundry.NetIron MLXe-8, Router.Data, Router.Ent, Site.XDS1, Site.XDS2
}
What I cannot find is the thing I believe triggers the event: the check_command? We do have host groups, and when this device goes down the admins do get an email, so I know that part works. We just have so many .cfg files, and the directory structure is so large (over 1,600,000 service checks a day), that I am getting mired down.
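As far as I understand object inheritance, a host that has no check_command of its own picks it up from the template named on its "use" line, here critical-router-XDS2. Below is a rough sketch of what I expect to find in that template and the one line I think I would add to the host. The template contents, the check-host-alive command, and the contact name jdoe are all assumptions for illustration, not what is actually in our files.

```
# ASSUMPTION: hypothetical contents of the template; the real definition
# is somewhere in our .cfg tree and may look quite different.
define host {
    name                    critical-router-XDS2   ; template name from the host's "use" line
    check_command           check-host-alive       ; assumed; stock sample host check
    contact_groups          admins                 ; assumed; would explain the email the admins get
    notification_options    d,u,r                  ; assumed; notify on DOWN, UNREACHABLE, RECOVERY
    register                0                      ; marks this as a template, not a real host
}

# The existing host definition, showing only the relevant lines.
define host {
    use                     critical-router-XDS2
    host_name               XDS2-RTR-01
    contacts                +jdoe                  ; hypothetical contact_name from contact.cfg
}
```

The leading "+" on contacts is meant to append to whatever the template already provides instead of replacing it. If that is right, the admins would keep getting their email and the one extra person would be notified for this host only.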