incorrect handler behavior (again)

Posted: Wed Apr 22, 2020 9:30 pm
by facc_se
Hi,
I am having a relapse of an issue that we've experienced before, described in the following post:
https://support.nagios.com/forum/viewto ... 16&t=50067

The symptoms I'm seeing now are exactly the same as those I originally reported in that post, and they involve the same handler.
This was previously fixed for us on XI version 5.5.7.
We are now running 5.6.7 and have been for quite some time. I'm not sure exactly when this issue cropped back up in our environment, but I can confirm that it is happening again. For convenience I will recap the issue below, but please refer to the post above too.

I realize that there is a newer version of XI available at the moment, but before I blindly apply the upgrade, can someone tell me whether this is again a known issue with 5.6.7? Has it been corrected in the latest version of XI?

--summary of issue:
Nagios is incorrectly marking a service recovery as "HARD" which is causing our handler to perform unnecessary actions.

--recent evidence:
Thu Apr 9 05:33:58 EDT 2020
HOSTNAME = Hamilton_CCIS_Host-
SERVICEDESC = HAMILTON_CCISPULL_SERVICE
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xx.xx.xx.xx
CHECK_COMMAND = check_ccis_http!/CCISPullService!15504!!!!!!
SERVICEDISPLAYNAME = http://xx.xx.xx.xx:15504/CCISPullService
URL = http://xx.xx.xx.xx:15504/CCISPullService
SOFT State....no action

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Thu Apr 9 05:35:07 EDT 2020
HOSTNAME = Hamilton_CCIS_Host-
SERVICEDESC = HAMILTON_CCISPULL_SERVICE
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xx.xx.xx.xx
CHECK_COMMAND = check_ccis_http!/CCISPullService!15504!!!!!!
SERVICEDISPLAYNAME = http://xx.xx.xx.xx:15504/CCISPullService
URL = http://xx.xx.xx.xx:15504/CCISPullService
SOFT State....no action


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Thu Apr 9 05:36:16 EDT 2020
HOSTNAME = Hamilton_CCIS_Host-
SERVICEDESC = HAMILTON_CCISPULL_SERVICE
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xx.xx.xx.xx
CHECK_COMMAND = check_ccis_http!/CCISPullService!15504!!!!!!
SERVICEDISPLAYNAME = http://xx.xx.xx.xx:15504/CCISPullService
URL = http://xx.xx.xx.xx:15504/CCISPullService
SOFT State....no action


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Thu Apr 9 05:37:15 EDT 2020
HOSTNAME = Hamilton_CCIS_Host-
SERVICEDESC = HAMILTON_CCISPULL_SERVICE
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xx.xx.xx.xx
CHECK_COMMAND = check_ccis_http!/CCISPullService!15504!!!!!!
SERVICEDISPLAYNAME = http://xx.xx.xx.xx:15504/CCISPullService
URL = http://xx.xx.xx.xx:15504/CCISPullService
SOFT State....no action


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Thu Apr 9 05:38:15 EDT 2020
HOSTNAME = Hamilton_CCIS_Host-
SERVICEDESC = HAMILTON_CCISPULL_SERVICE
SERVICESTATETYPE = HARD
SERVICESTATE = OK
HOST_ADDRESS = xx.xx.xx.xx
CHECK_COMMAND = check_ccis_http!/CCISPullService!15504!!!!!!
SERVICEDISPLAYNAME = http://xx.xx.xx.xx:15504/CCISPullService
URL = http://xx.xx.xx.xx:15504/CCISPullService
***** Enabling CCIS Endpoint HAMILTON_CCISPULL_SERVICE http://xx.xx.xx.xx:15504/CCISPullService in database PRODCCISLR1 CCIS
http://xx.xx.xx.xx:15504/CCISPullService has been set to 1
--SUCCESS
sending email
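
The log above suggests that the handler branches on the state type and state it is given. The handler's real code is not shown in this thread, so what follows is only a minimal sketch of that decision step, assuming the handler is passed the `$SERVICESTATE$` and `$SERVICESTATETYPE$` macros as arguments. The function name `decide_action` and the "disable endpoint" branch for a HARD problem are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the handler's decision step.
# Assumes the event handler command passes $SERVICESTATE$ as the first
# argument and $SERVICESTATETYPE$ as the second.
decide_action() {
    state="$1"       # OK, WARNING, UNKNOWN, or CRITICAL
    state_type="$2"  # SOFT or HARD

    if [ "$state_type" = "SOFT" ]; then
        # Matches the "SOFT State....no action" lines in the log.
        echo "no action"
    elif [ "$state" = "OK" ]; then
        # Matches the "Enabling CCIS Endpoint" step in the log.
        echo "enable endpoint"
    else
        # Assumed counterpart for a HARD problem; not shown in the log.
        echo "disable endpoint"
    fi
}
```

Fed the logged sequence, this sketch takes no action for the four SOFT CRITICAL results and then runs the enable step on the HARD OK result, even though no HARD CRITICAL was ever reported. That is the unnecessary action described in the summary.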

Re: incorrect handler behavior (again)

Posted: Thu Apr 23, 2020 1:01 pm
by cdienger
Please PM me a profile and I can look into this. One can be generated under Admin > System Config > System Profile > Download Profile, or from the command line with:

/usr/local/nagiosxi/scripts/components/getprofile.sh 58285
The profile is then saved to:

/usr/local/nagiosxi/var/components/profile.zip

Re: incorrect handler behavior (again)

Posted: Thu Apr 23, 2020 2:16 pm
by facc_se
cdienger wrote: Please PM me a profile and I can look into this. One can be generated under Admin > System Config > System Profile > Download Profile, or from the command line with:

/usr/local/nagiosxi/scripts/components/getprofile.sh 58285
The profile is then saved to:

/usr/local/nagiosxi/var/components/profile.zip
I have sent it to you. It says it was delivered, but it is also still in my outbox, so please confirm.

Re: incorrect handler behavior (again)

Posted: Thu Apr 23, 2020 4:40 pm
by cdienger
I received it and was able to reproduce the problem. It looks like this is an issue on 5.6.14 as well, so hold off on upgrading for now and I will ping our dev team.

Re: incorrect handler behavior (again)

Posted: Thu Apr 23, 2020 4:50 pm
by cdienger
A bug has been filed: https://github.com/NagiosEnterprises/na ... issues/757

Re: incorrect handler behavior (again)

Posted: Mon May 04, 2020 1:56 pm
by facc_se
cdienger wrote: A bug has been filed: https://github.com/NagiosEnterprises/na ... issues/757
It shows closed on GitHub. When will this go into a release? I just checked and I don't see any new releases.

Re: incorrect handler behavior (again)

Posted: Mon May 04, 2020 4:45 pm
by cdienger
The 5.7 release will update to the Core version that has this fix. The 5.7 release will be out soon.

Re: incorrect handler behavior (again)

Posted: Tue May 05, 2020 7:58 am
by facc_se
We're also experiencing a slightly different problem with the same service template that I use for the service we're looking at in this post. The service template has Max check attempts set to 5. One particular service using this template is being marked as "HARD" down the first time the service becomes unavailable. I have dozens of other services using the same template, and this doesn't occur with them. I've tried deleting this particular service and recreating it, but it still happens. Do I need to open a different support thread for this issue, or do you believe it is probably related to the same bug?

Re: incorrect handler behavior (again)

Posted: Tue May 05, 2020 3:12 pm
by ssax
Is the host of the service in a problem state?

Re: incorrect handler behavior (again)

Posted: Thu May 07, 2020 2:03 pm
by facc_se
ssax wrote: Is the host of the service in a problem state?
The host is in a critical state because we are not allowed to ping this particular host, but we don't care about that; we only care about monitoring a web service, so we have host notifications disabled. I'll try disabling the host checks.
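
The question about the host state would be consistent with how the Nagios Core documentation describes state types: a non-OK service result is treated as HARD right away when the host is DOWN or UNREACHABLE, regardless of Max check attempts. The following is a simplified model of that rule, not Nagios code. The function name `problem_state_type` and its arguments are hypothetical.

```shell
#!/bin/sh
# Simplified, hypothetical model of how a non-OK service result is
# classified as SOFT or HARD. Not taken from Nagios source.
problem_state_type() {
    attempt="$1"        # current check attempt
    max_attempts="$2"   # Max check attempts (5 in the template discussed here)
    host_state="$3"     # UP, DOWN, or UNREACHABLE

    if [ "$host_state" != "UP" ]; then
        # Host already in a problem state: no retries, HARD immediately.
        echo "HARD"
    elif [ "$attempt" -ge "$max_attempts" ]; then
        # Retries exhausted.
        echo "HARD"
    else
        echo "SOFT"
    fi
}
```

Under this model, `problem_state_type 1 5 DOWN` yields HARD on the first failed check, while `problem_state_type 1 5 UP` yields SOFT, which may explain why only the service on the unpingable host behaves differently.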