
handler behavior has changed since upgrading to XI 5.5.3

Posted: Thu Aug 30, 2018 11:33 am
by facc_se
Hi,
Yesterday we upgraded from Nagios XI 5.4 to 5.5.3. Now we are noticing different behavior in one of our event handlers.

The handler is called 'enable_disable_ccis_https' and is tied to a service template. Service checks that use this template check a web service's status (using check_http) and call the handler when the web service goes up or down. The handler's job is to call a local bash script on the Nagios server. The script executes a stored procedure on a remote SQL Server, but ONLY when $SERVICESTATETYPE$ is "HARD". According to the script's log, Nagios is now sending a "HARD" $SERVICESTATETYPE$ upon recovery from a "SOFT" state. The service is set to 5 max retries before the check should be considered "HARD" down. See the excerpts below from our script log (I put xxxx in place of the hostnames for privacy reasons). The first example (from 8/29) is from before the upgrade, when things were working normally; the second example is from after the upgrade (today, 8/30), where things are not working normally. We don't want the bash script to execute our stored procedure after a recovery from a soft down state. Is this a bug? What can we do to fix it? If you need any additional information, let me know.
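
For readers following along, the HARD-only gate described above could look roughly like the sketch below. This is not the poster's actual script: the function name decide_action, the argument order, and the mapping of a HARD CRITICAL to "disable" are assumptions made for illustration. Only the enable-on-HARD-OK behavior is visible in the log excerpts.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a HARD-only event handler gate.
# Assumed call form (not taken from the original post):
#   handler.sh "$SERVICESTATE$" "$SERVICESTATETYPE$" "$SERVICEDESC$"

# Print the action the handler would take for a state and state type.
decide_action() {
    local state="$1" state_type="$2"

    # SOFT results are still being retried, so nothing is done.
    if [ "$state_type" != "HARD" ]; then
        echo "none"
        return 0
    fi

    case "$state" in
        OK)       echo "enable" ;;   # service confirmed up
        CRITICAL) echo "disable" ;;  # service confirmed down (assumed mapping)
        *)        echo "none" ;;     # WARNING / UNKNOWN are left alone here
    esac
}

# Run only when arguments are supplied, so the file can also be sourced.
if [ "$#" -ge 2 ]; then
    action="$(decide_action "$1" "$2")"
    echo "$(date) SERVICEDESC=${3:-unknown} -> action=${action}"
fi
```

With a gate like this, a result reported as HARD OK runs the "enable" branch even if every preceding CRITICAL result was SOFT, which matches the unwanted behavior reported in this post.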

------------pre upgrade

Wed Aug 29 00:12:04 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = https://xxxxxxx.com/production/CCIS3Ser ... ervice.svc
URL = https://xxxxxx.com/production/CCIS3Serv ... ervice.svc
SOFT State....no action

Wed Aug 29 00:13:02 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
URL = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
SOFT State....no action

Wed Aug 29 00:13:51 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = OK

HOST_ADDRESS = xxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = xxxxx.com/production/CCIS3Service/CCIS3Service.svc
URL = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
SOFT State....no action


------------post upgrade

Thu Aug 30 10:20:31 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
SOFT State....no action

Thu Aug 30 10:21:40 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
SOFT State....no action

Thu Aug 30 10:22:42 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = HARD
SERVICESTATE = OK

HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
***** Enabling CCIS Endpoint Pasco-CCIS-Service- https://xxxxx:45500/ClericusCCISService.asmx in database PRODCCISLR1 CCIS
https://xxxxx:45500/ClericusCCISService.asmx has been set to 1
--SUCCESS
sending email
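
The post-upgrade excerpt above shows a HARD OK arriving after only SOFT CRITICAL results. Until a fix is available, one possible guard on the script side is to remember the last HARD state the handler acted on and skip a HARD result that does not change it. The sketch below is only an illustration under assumptions: the function name should_act, the STATE_DIR location, and the one-file-per-service layout are hypothetical, and a service with no recorded state is assumed to be OK. It is not an official workaround.

```shell
#!/usr/bin/env bash
# Hypothetical workaround sketch: act only on a real HARD state change.
# STATE_DIR and the file naming are assumptions for illustration.

STATE_DIR="${STATE_DIR:-/tmp/ccis_handler_state}"

# Echo "act" if the stored procedure should run, otherwise "skip".
# Arguments: service description, $SERVICESTATE$, $SERVICESTATETYPE$
should_act() {
    local service="$1" state="$2" state_type="$3"
    local file="${STATE_DIR}/${service}.last_hard"
    local last="OK"   # assume a service with no record starts out healthy

    # SOFT results never trigger the stored procedure.
    [ "$state_type" = "HARD" ] || { echo "skip"; return 0; }

    # Only OK and CRITICAL are tracked in this sketch.
    case "$state" in
        OK|CRITICAL) ;;
        *) echo "skip"; return 0 ;;
    esac

    mkdir -p "$STATE_DIR"
    [ -f "$file" ] && last="$(cat "$file")"

    if [ "$state" = "$last" ]; then
        # e.g. HARD OK after a SOFT-only outage: no HARD change happened.
        echo "skip"
    else
        printf '%s\n' "$state" > "$file"
        echo "act"
    fi
}
```

In the 8/30 sequence above, the two SOFT CRITICAL results would be skipped, and the following HARD OK would also be skipped because the last recorded HARD state is still OK.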

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Thu Aug 30, 2018 1:14 pm
by lmiltchev
I believe this is a bug that was introduced in Nagios Core 4.4.2. I filed the issue here:

https://github.com/NagiosEnterprises/na ... issues/575

and will be waiting on a response from our developers. Thank you!

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Tue Sep 04, 2018 8:59 am
by facc_se
Any news? We're getting a lot of unnecessary notifications because of this bug.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Tue Sep 04, 2018 11:32 am
by lmiltchev
Our developers are aware of the issue. I don't have an ETA on a fix yet but as soon as we have a solution, it will be posted on GitHub.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Tue Sep 04, 2018 3:37 pm
by gornm565
Please let us know when a new Nagios XI release fixes this issue.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Tue Sep 04, 2018 4:40 pm
by lmiltchev
gornm565 wrote: Please let us know when a new Nagios XI is released that fixes this issue.
Sure, we can do that.

Also, you can keep an eye on the change log here:
https://www.nagios.com/downloads/nagios-xi/change-log/
and here:
https://github.com/NagiosEnterprises/na ... issues/575

I don't know yet how the developers will proceed with this. They may add a patch to the next release of XI or make a new Core release, and include it in XI. In any case, we will update the changelog. Thank you!

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Thu Oct 18, 2018 9:57 am
by facc_se
I noticed yesterday on your change log page that two new releases are out (5.5.4 and 5.5.5). I saw the following note for 5.5.4 and thought it might be related to this issue:

•Fixed issue with Nagios Core notifications during downtime -SW

However, we applied 5.5.5 and the behavior originally described in this thread is still not fixed. Please advise.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Thu Oct 18, 2018 10:14 am
by gornm565
Do you have an updated ETA on a fix for this?
Thanks.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Thu Oct 18, 2018 3:00 pm
by lmiltchev
The issue I posted on GitHub (Issue #575) is not fixed yet. It is different from the change log item you quoted ("Fixed issue with Nagios Core notifications during downtime -SW"), which concerns notifications during downtime.
Unfortunately, I don't have an ETA on the fix at this time.

Re: handler behavior has changed since upgrading to XI 5.5.3

Posted: Mon Nov 05, 2018 9:41 am
by gornm565
Do you know if this was fixed in 5.5.6?