I've run into another problem with Nagios Core 4.1.1 installed on CentOS 7.
Some hosts are simply not checked on schedule. After enabling full debug logging I found some strange log entries (they repeat at every scheduled check time).
I have around 90 Cisco switches that need to be checked by ping. Most of them are checked normally, but around 10 are not.
If I run the check manually with the force checkbox ticked, the host is checked fine, and afterwards it goes stale again :/
The parent, Host_Main, is available both at the scheduled time and when I force a check of the host manually.
No dependencies are defined for Host_1; it has just the one parent.
Any suggestions on how to fix this?
Added: after quickly inspecting the full logs, I found that this issue is not limited to this host; it is Nagios-wide.
Configs and log below:
** Host Check Event ==> Host: 'Host_1', Options: 8, Latency: 0.000000 sec
[1445865630.705614] [001.0] [pid=18801] run_scheduled_host_check()
[1445865630.705618] [016.0] [pid=18801] Attempting to run scheduled check of host 'Host_1': check options=8, latency=0.000000
[1445865630.705663] [001.0] [pid=18801] run_async_host_check(Host_1 ...)
[1445865630.705670] [016.0] [pid=18801] ** Running async check of host 'Host_1'...
[1445865630.705674] [016.0] [pid=18801] Host 'Host_1' passed first hurdle (caching/execution)
[1445865630.705677] [001.0] [pid=18801] check_host_check_viability()
[1445865630.705680] [001.0] [pid=18801] check_time_against_period()
[1445865630.705701] [001.0] [pid=18801] _get_matching_timerange()
[1445865630.705709] [001.0] [pid=18801] check_host_dependencies()
[1445865630.705719] [001.0] [pid=18801] check_host_dependencies()
[1445865630.705725] [016.0] [pid=18801] Host check dependencies failed
[1445865630.705728] [016.0] [pid=18801] Host check isn't viable at this point.
[1445865630.705730] [016.1] [pid=18801] Unable to run scheduled host check at this time
[1445865630.705732] [001.0] [pid=18801] get_next_valid_time()
[1445865630.705741] [001.0] [pid=18801] _get_matching_timerange()
[1445865630.705753] [016.1] [pid=18801] Rescheduled next host check for Mon Oct 26 16:30:30 2015
[1445865630.705757] [064.1] [pid=18801] Making callbacks (type 12)...
[1445865630.705759] [001.0] [pid=18801] schedule_host_check()
[1445865630.705766] [016.0] [pid=18801] Scheduling a non-forced, active check of host Host_1' @ Mon Oct 26 16:30:30 2015
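To get a rough idea of how many hosts are affected, something like the following Python sketch could be run over the debug log. It assumes the log lines look exactly like the excerpt above (an "Attempting to run scheduled check of host '...'" line followed later, from the same pid, by "Host check dependencies failed"). The function name and the log path in the usage comment are only illustrative, not anything Nagios provides.

```python
import re
from collections import Counter

# Patterns are based only on the log excerpt above; other Nagios
# versions or debug levels may word these lines differently.
ATTEMPT_RE = re.compile(
    r"\[pid=(\d+)\] Attempting to run scheduled check of host '([^']+)'"
)
FAILED_RE = re.compile(r"\[pid=(\d+)\] Host check dependencies failed")


def hosts_blocked_by_dependencies(lines):
    """Count, per host, how often a scheduled check was skipped
    because the dependency check failed."""
    current_host = {}   # pid -> host named in the latest "Attempting" line
    blocked = Counter()
    for line in lines:
        match = ATTEMPT_RE.search(line)
        if match:
            current_host[match.group(1)] = match.group(2)
            continue
        match = FAILED_RE.search(line)
        if match and match.group(1) in current_host:
            blocked[current_host[match.group(1)]] += 1
    return blocked


# Example usage (the path is an assumption, adjust to your debug_file setting):
# with open("/usr/local/nagios/var/nagios.debug") as log:
#     for host, count in hosts_blocked_by_dependencies(log).most_common():
#         print(host, count)
```

Hosts that show up in the result are the ones whose scheduled checks were rejected by the dependency check; hosts that are checked normally should not appear at all.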
The host definition is:
define host{
    name                            generic-switch-cisco
    use                             generic-host,host-pnp
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    check_interval                  10
    retry_check_interval            1
    max_check_attempts              3
    check_command                   check-host-alive-cisco
    check_period                    24x7
    contact_groups                  admins
    notification_interval           30
    notification_period             24x7
    notification_options            d,u,r
    register                        0
}

define host{
    use                             generic-switch-cisco
    host_name                       Host_1
    address                         10.140.208.60
    parents                         Host_Main
    hostgroups                      Group_switches
}

define command{
    command_name                    check-host-alive-cisco
    command_line                    $USER1$/check_ping -H $HOSTADDRESS$ -w 200.0,20% -c 1200.0,30% -p 5
}