That config doesn't match up with the logs - the logs show a notification to an 'itnetworksupport' contact. There isn't one of those in the conf.d zip (it has an itnetworksupport1, but not an itnetworksupport).
Looking at the F drive service definition as provided in that zip, it has no contacts or contact groups defined and includes the generic-service-dba template. The generic-service-dba template has only a contact of itdba, no contact groups, and includes the generic-service-template template. The generic-service-template template has no contact groups or contacts defined. Based on the config in the latest zip, only the itdba contact should be notified for an F drive alert. The G drive service definition in that zip has the same setup.
Code:
melanogaster:nagios-tmp-support millisa$ cat *.cfg |grep itnetworksupport
contact_name itnetworksupport1
members itnetworksupport1
I suspect the configs were edited between the time of the log entries showing the notifications and the time the conf files were zipped up, and that the edits went beyond removing the email addresses you want to keep private. That makes it difficult to explain from this last example why you saw a notification to both contacts. If I had to guess, itnetworksupport1 was called itnetworksupport when the log entries were written, and it was a member of the 'admins' contactgroup. The 'admins' contact_groups line was likely not commented out in the generic-service-template template at the time you got those notifications. This is supported by the file modtime of generic-service_nagios.cfg, 12:47pm local, which is after the notifications shown in the logs. (I understand needing to edit the real emails in contacts_nagios2.cfg, so I would expect that file's modtime to be after your log entries.) In short: if you see the double notification again and want an explanation of why, avoid changing the config files for the service definition and templates, apart from editing the email addresses in the contacts config.
I suspect this is the config you effectively had at the time the log entries came in:
Code:
define contact{
contact_name itdba
alias IT DBA
use contact-template
email xxx2
}
define contact{
contact_name itnetworksupport
alias IT Network Support
use contact-template
email xxx3
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members itnetworksupport
}
define service{
use generic-service-dba
host_name 6800SRETDB
service_description F Drive
check_command check_snmp_disk!public!F!80%!90%
}
define service{
name generic-service-dba
use generic-service-template
contacts itdba
register 0
}
define service{
name generic-service-template ; The 'name' of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_interval 20 ; Re-send notifications every 20 time units while the problem persists
is_volatile 0
check_period 24x7
normal_check_interval 5
retry_check_interval 1
max_check_attempts 4
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
That config, with just that one contact_groups line uncommented and itnetworksupport without the '1', would give you exactly the notification you saw when the F drive went critical. The F drive check includes the generic-service-dba template, which causes itdba to get notified. That template in turn includes the generic-service-template template, which notifies itnetworksupport if the contact_groups line for admins is uncommented. I can't say with 100% certainty that the above was your config, but it seems likely based on what the logs showed and on comparing this conf zip with the earlier one.
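To make that chain easier to follow, here is a rough Python sketch of how I read the inheritance. It is only a simplified model, not how Nagios itself is implemented: it assumes single-parent 'use' chains, ignores additive '+' values, and the names OBJECTS, CONTACTGROUPS, resolve and notified are made up for the illustration.

```python
# Simplified model of template inheritance for the two contact directives.
# Assumptions: single-parent 'use' chains only, no additive '+' values.
# All names below are hypothetical; the values mirror the guessed config.
OBJECTS = {
    "F Drive": {"use": "generic-service-dba"},
    "generic-service-dba": {"use": "generic-service-template", "contacts": "itdba"},
    "generic-service-template": {"contact_groups": "admins"},
}
CONTACTGROUPS = {"admins": ["itnetworksupport"]}


def resolve(name, directive):
    """Walk up the 'use' chain and return the first value found for directive."""
    while name is not None:
        obj = OBJECTS[name]
        if directive in obj:
            return obj[directive]
        name = obj.get("use")
    return None


def notified(service):
    """Union of inherited contacts and the members of inherited contact groups."""
    people = set()
    contacts = resolve(service, "contacts")
    if contacts:
        people.update(c.strip() for c in contacts.split(","))
    groups = resolve(service, "contact_groups")
    if groups:
        for group in groups.split(","):
            people.update(CONTACTGROUPS[group.strip()])
    return sorted(people)


print(notified("F Drive"))  # ['itdba', 'itnetworksupport']
```

The point is that contacts and contact_groups are looked up separately, so a contact picked up from one template and a contact group picked up from another both end up on the final service.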
At this point, I think you are running into an inheritance issue and are ending up with a service definition that has both a contact and a contact_group defined once all the chained templates have been included. Try this: explicitly define or undefine your contacts and contact_groups in the final service definition. For example, if you want the itdba user to be the only one who receives an alert for the F drive on 6800SRETDB, use this config:
Code:
define service{
use generic-service-dba
host_name 6800SRETDB
service_description F Drive
check_command check_snmp_disk!public!F!80%!90%
contacts itdba
contact_groups null
}
That makes sure that any contact_groups defined in the template chain leading up to that service check are explicitly undefined for the F drive check. The reverse works too: if you want a specific service check to notify only the contact groups you define for it, and not any templated contact picked up from the chain (say, so that only erp-admins and no one else gets notified), do this:
Code:
define service{
use generic-service-dba
host_name 6800SRETDB
service_description F Drive
check_command check_snmp_disk!public!F!80%!90%
contacts null
contact_groups erp-admins
}
If there are any contacts defined in the generic-service-dba template or further up the chain, that 'contacts null' setting explicitly overrules them. This should at least assure you that the double notifications are not caused by some bug, but by something defined higher in the chain.
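As a rough illustration of what 'null' does in the two examples above, here is a small Python sketch. It is a toy model under the same assumptions as before (a single chain, most specific definition first, no additive '+' values), and the function and variable names are hypothetical.

```python
def effective(chain, directive):
    """Return the value of directive for a chain of definitions.

    chain is a list of dicts, most specific definition first.
    A value of 'null' cancels whatever would have been inherited.
    """
    for obj in chain:
        if directive in obj:
            value = obj[directive]
            return None if value == "null" else value
    return None


# Templates as guessed earlier in this post (hypothetical variable names).
template = {"contact_groups": "admins"}
dba = {"contacts": "itdba"}

# First example: itdba only, inherited contact_groups cancelled.
only_dba = [{"contacts": "itdba", "contact_groups": "null"}, dba, template]
# Second example: erp-admins only, inherited contacts cancelled.
only_erp = [{"contacts": "null", "contact_groups": "erp-admins"}, dba, template]

for label, chain in (("itdba only:", only_dba), ("erp-admins only:", only_erp)):
    print(label, effective(chain, "contacts"), effective(chain, "contact_groups"))
```

In both cases the lookup stops at the final service definition, so nothing further up the chain can add a contact or contact group.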
Though I don't suggest it for your actual configuration, if you are running into an inheritance issue, copying your service definition, followed by any templates it includes in the order they are included, one beneath the other will likely help you spot which contacts and contact_groups are taking precedence in your template chains.
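If doing that by hand gets tedious, something like the following Python sketch could list the chain for you. It is a minimal sketch, not a real Nagios config parser: it assumes single-parent 'use' lines and only 'define service' blocks, the CFG text is a cut-down copy of the definitions above, and parse and chain_report are hypothetical helper names.

```python
import re

# Cut-down copy of the definitions discussed above (assumed, for illustration).
CFG = """
define service{
use generic-service-dba
host_name 6800SRETDB
service_description F Drive
}
define service{
name generic-service-dba
use generic-service-template
contacts itdba
register 0
}
define service{
name generic-service-template
contact_groups admins
register 0
}
"""


def parse(text):
    """Return a list of dicts, one per 'define service{...}' block."""
    objects = []
    for body in re.findall(r"define\s+service\s*\{(.*?)\}", text, re.S):
        obj = {}
        for line in body.strip().splitlines():
            line = line.split(";", 1)[0].strip()  # drop trailing ';' comments
            if line:
                parts = line.split(None, 1)
                obj[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append(obj)
    return objects


def chain_report(objects, description):
    """List the contact directives of a service and each template it uses."""
    templates = {o["name"]: o for o in objects if "name" in o}
    obj = next(o for o in objects if o.get("service_description") == description)
    lines = []
    while obj is not None:
        label = obj.get("service_description") or obj.get("name")
        for key in ("contacts", "contact_groups"):
            lines.append(f"{label}: {key} = {obj.get(key, '(not set)')}")
        obj = templates.get(obj.get("use"))  # follow the single 'use' parent
    return lines


print("\n".join(chain_report(parse(CFG), "F Drive")))
```

Reading the output from top to bottom, the first line where each directive is set is the one that wins in this simplified model.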
Edit: Corrected a couple typos.