OK, updated Nagios to the latest version. Also set the service "max checks" to 6 while leaving the "max checks" for hosts at 5.
I'm still getting a string of service notifications before I get my "host down" notification. My impression was that the moment the host goes into "recheck mode" (it's down but has not yet fired a notification), the services on that host would stop checking altogether until the host goes back to an "up" state.
What am I missing?
scottwilkerson wrote:
rkane wrote:Good to know - appreciate it!
scottwilkerson wrote:There were a few bugs in versions prior to 5.5.3 that could be related to this as well, causing additional unexpected notifications.
Absolutely - is there a good way to do that other than screenshots?
Guessing I can pull the config files off the machine...where do the service config / templates live?
EDIT: found the service config and host config files, here they are:
###############################################################################
#
# Services configuration file
#
# Created by: Nagios CCM 2.7.0
# Date: 2018-09-19 11:21:27
# Version: Nagios Core 4.x
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios CCM will overwrite all manual settings during the next update if you
# would like to edit files manually, place them in the 'static' directory or
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################
define service {
    service_description    CPU Usage
    use                    xiwizard_ncpa_service
    hostgroup_name         Servers_Windows
    check_command          check_xi_ncpa!-t 'nagiosXI' -P 5693 -M cpu/percent -w 50 -c 80 -q 'aggregate=avg'!!!!!!!
    _xiwizard              ncpa
    register               1
}
###############################################################################
#
# Services configuration file
#
# END OF FILE
#
###############################################################################
###############################################################################
#
# Hosts configuration file
#
# Created by: Nagios CCM 2.7.0
# Date: 2018-09-19 11:21:27
# Version: Nagios Core 4.x
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios CCM will overwrite all manual settings during the next update if you
# would like to edit files manually, place them in the 'static' directory or
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################
define host {
    host_name          CONF-11106
    use                xiwizard_windowsserver_host_CONF
    address            CONF-11106
    parents            uts-core-a
    icon_image         win_server.png
    statusmap_image    win_server.png
    _xiwizard          windowsdesktop
    register           1
}
###############################################################################
#
# Hosts configuration file
#
# END OF FILE
#
###############################################################################
Both of these point at a set of templates, so finding those to post is likely key.
scottwilkerson wrote:Can you share the config for one of the services that is sending notifications while the host is down?
You can also view service templates by going to CCM > Templates > Service Templates and clicking the View Config button on the right-hand side.
@rkane, I'd like to take a look at your service template -> xiwizard_ncpa_service and the host template -> xiwizard_windowsserver_host_CONF.
Or you can just upload the /usr/local/nagios/var/objects.cache file.
If the host was still in the soft state, the "host_down_disable_service_checks" option would not work.
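For anyone who would rather pull the template names out of the config text than eyeball it, here is a minimal sketch of a parser for `define ... { }` blocks. The function name `parse_define_blocks` is made up for illustration, and the sample is just the host definition posted earlier in this thread. It assumes one directive per line with the value separated by whitespace, which matches the files shown above but has not been checked against every object file format.

```python
import re


def parse_define_blocks(text: str):
    """Return a list of (object_type, {directive: value}) tuples,
    one per `define <type> { ... }` block found in the text."""
    blocks = []
    current_type, current = None, None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        opener = re.match(r"define\s+(\w+)\s*\{", line)
        if opener:
            current_type, current = opener.group(1), {}
            continue
        if line == "}" and current is not None:
            blocks.append((current_type, current))
            current_type, current = None, None
            continue
        if current is not None:
            # Directive name is the first token; the rest is the value.
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return blocks


# Sample copied from the host definition posted in this thread.
sample_host = """
define host {
host_name CONF-11106
use xiwizard_windowsserver_host_CONF
address CONF-11106
parents uts-core-a
register 1
}
"""

blocks = parse_define_blocks(sample_host)
for object_type, directives in blocks:
    print(object_type, directives.get("host_name"), "uses", directives.get("use"))
```

Pointing the same function at the template files should make it easier to look up `max_check_attempts` and `retry_interval` for the two templates named above.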
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: the soft state - if the host goes soft down, it rechecks 5 times at 1 minute each and then goes hard down, and the services recheck 6 times at 1 minute each. How would the service be 'hard down' (i.e., time to notify) two full minutes before the host?
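To make the arithmetic in that question concrete, here is a rough sketch of the expected timing. It assumes the first failed check counts as attempt 1, that every retry happens exactly one retry interval after the previous one, and that the host and its services fail at the same moment. The function name `minutes_until_hard` is made up for illustration, and real scheduling will not be this tidy.

```python
def minutes_until_hard(max_check_attempts: int, retry_interval_min: float) -> float:
    """Minutes from the first failed check (attempt 1) to the hard state.

    Assumption: the first failure already counts as attempt 1, so only
    max_check_attempts - 1 retries are needed before the state goes hard.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval_min


# Values from this thread: hosts at 5 attempts, services at 6, 1 minute retries.
host_hard = minutes_until_hard(max_check_attempts=5, retry_interval_min=1)
service_hard = minutes_until_hard(max_check_attempts=6, retry_interval_min=1)

print(f"host goes hard after {host_hard} min")
print(f"service goes hard after {service_hard} min")
print(f"host should lead the service by {service_hard - host_hard} min")
```

Under those assumptions the host should reach its hard state before the services do, not after, which is why the observed ordering looks wrong.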
@rkane, based on the notification settings, the host should be in a hard state 1 minute before its services. And because of the following option in nagios.cfg, service checks should not send any notifications:
host_down_disable_service_checks=1
Have you restarted nagios after adding that option?
Can you run the state history and notification reports for a) the host that went down and b) one of its services that kept alerting?
That way we could actually make sure that the host was indeed in a hard state when Nagios sent out service notifications.
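If it helps to confirm what the option is actually set to before restarting, here is a small sketch that scans nagios.cfg-style text for a given key. The helper name `find_cfg_values` is hypothetical, and the sample text is an illustrative excerpt, not the poster's real file. It returns every assignment it finds, since which one wins when a key appears more than once is not something this sketch tries to decide.

```python
def find_cfg_values(cfg_text: str, name: str) -> list:
    """Return every value assigned to `name` in key=value config text,
    in file order, ignoring blank lines and # comments."""
    values = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            values.append(value.strip())
    return values


# Hypothetical excerpt for illustration only.
sample_cfg = """
# some other setting
log_file=/usr/local/nagios/var/nagios.log
host_down_disable_service_checks=1
"""

found = find_cfg_values(sample_cfg, "host_down_disable_service_checks")
print(found if found else "option not set explicitly")
```

To run it against a live system, read the file contents first, for example with `open(path).read()`. The path is likely /usr/local/nagios/etc/nagios.cfg on a default install, but that is an assumption worth checking on the box.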
Agreed on the notifications and how they should work. That's why it's baffling to me that the services are notifying over two minutes BEFORE the host. I did not change or add that disable option; I assume that's an out-of-the-box setting? Will restart Nagios anyway to be safe.
Report set attached - notice the services are sending alerts first and are alerting on Attempt 1/5 rather than 5/5?