Issue with service parents after upgrading to 4.4.0 / 4.4.1
Posted: Tue Jul 03, 2018 3:16 am
Hello,
after upgrading from 4.3.4 to 4.4.0, service parents no longer work correctly. When a service enters an "unknown" or "error" state, rechecking always fails. When I force a recheck via the web interface, the "next scheduled check" time advances, but "last check time" stays at the same value.
So I activated the debug log with the following settings:
Code: Select all
debug_level=16
debug_verbosity=2
and rescheduled the check. These are the corresponding lines I found in the nagios.debug file:
Code: Select all
[1530604260.156238] [016.0] [pid=1139] Scheduling a forced, active check of service 'CPU Load' on host 'backup' @ Tue Jul 3 09:50:58 2018
[1530604260.156268] [016.2] [pid=1139] Found another service check event for this service @ Tue Jul 3 09:51:34 2018
[1530604260.156279] [016.2] [pid=1139] New service check event is forced and occurs before the existing event, so the new event will be used instead.
[1530604260.156291] [016.2] [pid=1139] Scheduling new service check event.
[1530604260.156324] [016.0] [pid=1139] Attempting to run scheduled check of service 'CPU Load' on host 'backup': check options=1, latency=0.000015
[1530604260.156355] [016.2] [pid=1139] Execution parents for this service failed, so it will not be actively checked.
[1530604260.156364] [016.1] [pid=1139] Unable to run scheduled service check at this time
[1530604260.156376] [016.1] [pid=1139] Rescheduled next service check for Tue Jul 3 09:56:00 2018
[1530604260.156385] [016.0] [pid=1139] Scheduling a forced, active check of service 'CPU Load' on host 'backup' @ Tue Jul 3 09:56:00 2018
[1530604260.156392] [016.2] [pid=1139] Scheduling new service check event.
[1530604262.285685] [016.0] [pid=1139] Attempting to run scheduled check of service 'CPU Usage' on host 'backup': check options=0, latency=0.000000
[1530604262.285745] [016.2] [pid=1139] Execution parents for this service failed, so it will not be actively checked.
[1530604262.285791] [016.1] [pid=1139] Unable to run scheduled service check at this time
[1530604262.285802] [016.1] [pid=1139] Rescheduled next service check for Tue Jul 3 09:56:02 2018
[1530604262.285810] [016.0] [pid=1139] Scheduling a non-forced, active check of service 'CPU Usage' on host 'backup' @ Tue Jul 3 09:56:02 2018
[1530604262.285822] [016.2] [pid=1139] Scheduling new service check event.
So it seems that the check isn't executed because the parent check is failing, but the parent service (PING) is in state OK.
This is only one example; I've seen this behaviour on other checks with parents and on a second Nagios instance that I also upgraded to 4.4.0. Upgrading to 4.4.1 didn't solve the problem. The check runs without error when I remove the parents line from the service definition.
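In case it helps anyone check whether their own debug file shows the same pattern, here is a rough Python sketch that counts check attempts followed by the "Execution parents for this service failed" message. It is only an illustration: the function name blocked_checks and the SAMPLE list are made up for this post, and the two patterns are taken directly from the debug lines quoted above, so other Nagios versions may word them differently.

```python
import re
from collections import Counter

# Both patterns are copied from the debug lines quoted in this post.
ATTEMPT = re.compile(
    r"Attempting to run scheduled check of service '(?P<service>.+?)' "
    r"on host '(?P<host>.+?)'"
)
BLOCKED = "Execution parents for this service failed"


def blocked_checks(lines):
    """Count check attempts that were skipped because of execution parents.

    `lines` is any iterable of debug-log lines (a list or an open file).
    Returns a Counter keyed by (host, service).
    """
    counts = Counter()
    current = None  # (host, service) of the most recent check attempt
    for line in lines:
        match = ATTEMPT.search(line)
        if match:
            current = (match.group("host"), match.group("service"))
        elif BLOCKED in line and current is not None:
            counts[current] += 1
            current = None  # count each attempt at most once
    return counts


# Hypothetical sample: two lines taken from the log excerpt above.
SAMPLE = [
    "[1530604260.156324] [016.0] [pid=1139] Attempting to run scheduled check "
    "of service 'CPU Load' on host 'backup': check options=1, latency=0.000015",
    "[1530604260.156355] [016.2] [pid=1139] Execution parents for this service "
    "failed, so it will not be actively checked.",
]

for (host, service), n in sorted(blocked_checks(SAMPLE).items()):
    print(host, service, n, sep="\t")
```

To run it against a real file, pass an open file handle instead of SAMPLE. Since the function only iterates over lines, something like `blocked_checks(open(path))` should work, where `path` points at wherever your debug file is written.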
The service definition as an example:
Code: Select all
define service{
    use                     generic-service,graphed-service
    host_name               backup
    service_description     CPU Load
    parents                 PING
    check_command           check_nrpe!check_load
    servicegroups           system-health
}
kind regards
Stephan