Check never goes to Critical Hard state
Posted: Tue Dec 18, 2018 9:24 am
We have a check that verifies a process is running, and that process was stopped. The check went through the 3 retries but stayed in a critical soft state instead of moving to critical hard. We forced the check into an OK state (changed the check to expect the process to be stopped, rechecked, then changed it back to expect the process to be running). After the changes, the check ran again, went through all the retries, and should have gone to Critical Hard. Instead, the logs showed Critical Soft with the correct number of tries.
I see that this problem has been reported (https://support.nagios.com/forum/viewto ... ft#p268187 and https://github.com/NagiosEnterprises/na ... issues/576).
Are there any updates on fixing this issue? We already fixed the problem with this specific check by deleting and recreating the service, but we will need a lasting fix to prevent this from happening again in our production environment.
Nagios XI 5.5.7 on RHEL 7.6 64-bit VMs.
nagios.log was showing
........;Service status for: ftpsvc;CRITICAL;SOFT;1;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;2;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)
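In case it helps anyone spot the same symptom, here is a rough Python sketch that flags entries like the ones above: a non-OK service that has reached its maximum attempt count but is still reported as SOFT. It assumes the semicolon-separated layout shown in the excerpt and a maximum of 3 attempts (the retry count on this check). The names `find_stuck_soft` and `MAX_CHECK_ATTEMPTS` are made up for illustration and are not part of Nagios.

```python
# Assumption: matches the 3 retries configured for the check in this post.
MAX_CHECK_ATTEMPTS = 3

LABEL = "Service status for: "


def find_stuck_soft(lines, max_attempts=MAX_CHECK_ATTEMPTS):
    """Return (service, state, attempt) for entries that reached
    max_attempts but are still reported as SOFT."""
    stuck = []
    for line in lines:
        # Expected fields: prefix;label;state;state type;attempt;plugin output
        parts = line.rstrip("\n").split(";", 5)
        if len(parts) < 6:
            continue
        _prefix, label, state, state_type, attempt, _output = parts
        if not label.startswith(LABEL) or not attempt.isdigit():
            continue
        if state != "OK" and state_type == "SOFT" and int(attempt) >= max_attempts:
            stuck.append((label[len(LABEL):], state, int(attempt)))
    return stuck


if __name__ == "__main__":
    # Sample lines copied from the excerpt above; the leading prefix is
    # treated as opaque text.
    sample = [
        "........;Service status for: ftpsvc;CRITICAL;SOFT;1;CRITICAL: ftpsvc is stopped (should be running)",
        "........;Service status for: ftpsvc;CRITICAL;SOFT;2;CRITICAL: ftpsvc is stopped (should be running)",
        "........;Service status for: ftpsvc;CRITICAL;SOFT;3;CRITICAL: ftpsvc is stopped (should be running)",
    ]
    print(find_stuck_soft(sample))
```

Run against the sample, only the third line is flagged, since attempts 1 and 2 are legitimately SOFT. To scan a real log, the lines of an open file can be passed in place of `sample`.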