Hi,
yesterday we upgraded from Nagios XI 5.4 to 5.5.3. Now we are noticing different behavior for one of our handlers.
The handler is called 'enable_disable_ccis_https'. It is tied to a service template. The service checks that use this template check a web service's status (using check_http) and call the handler when the web service goes up or down. The handler's job is to call a local bash script on the Nagios server. The bash script executes a remote SQL Server stored procedure, but ONLY for a "HARD" $SERVICESTATETYPE$. According to our bash script's log, Nagios is now sending a "HARD" $SERVICESTATETYPE$ upon recovery from a "SOFT" state. The service is set to 5 max retries before the check should be considered "HARD" down.
See the excerpts below from our script log (I put xxxx in place of the hostnames for privacy reasons). The first example (from 8/29) is from before the upgrade, when things were working normally; the second example is post upgrade (today, 8/30), where things are not working normally. We don't want the bash script to execute our stored procedure after a soft down recovery. Is this a bug? What can we do to fix it? If you need any additional information, let me know.
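For readers following along, the gating logic described above might look roughly like the sketch below. This is not our actual script: the function name `decide_action`, the argument order, and the mapping of HARD CRITICAL to "disable" are assumptions made for illustration (the log excerpts only show the "enable" path).

```shell
#!/bin/bash
# Hypothetical sketch of an event handler gate; not the poster's actual script.
# Assumption: Nagios passes the macros as arguments from the command definition,
# e.g. "$SERVICESTATE$" "$SERVICESTATETYPE$" "$SERVICEDESC$".

# decide_action STATE STATETYPE
# Prints "enable", "disable", or "none".
decide_action() {
    local state="$1" state_type="$2"

    # SOFT states mean retries are still in progress; take no action.
    if [ "$state_type" != "HARD" ]; then
        echo "none"
        return 0
    fi

    case "$state" in
        OK)       echo "enable" ;;   # would call the stored procedure to enable
        CRITICAL) echo "disable" ;;  # assumed counterpart of the enable path
        *)        echo "none" ;;     # WARNING/UNKNOWN ignored in this sketch
    esac
}

# Run only when called with arguments, so the file can also be sourced.
if [ "$#" -ge 2 ]; then
    action=$(decide_action "$1" "$2")
    if [ "$action" = "none" ]; then
        echo "$2 State....no action"
    else
        echo "***** would $action endpoint ${3:-unknown}"
    fi
fi
```

With a gate like this, a recovery that Nagios reports as HARD OK triggers the "enable" path even when only SOFT CRITICAL results preceded it, which matches the post-upgrade log.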
------------pre upgrade
Wed Aug 29 00:12:04 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = https://xxxxxxx.com/production/CCIS3Ser ... ervice.svc
URL = https://xxxxxx.com/production/CCIS3Serv ... ervice.svc
SOFT State....no action
Wed Aug 29 00:13:02 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
URL = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
SOFT State....no action
Wed Aug 29 00:13:51 EDT 2018
HOSTNAME = Duval_CCIS_Host-
SERVICEDESC = Duval_CCIS_Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = OK
HOST_ADDRESS = xxxxx.com
CHECK_COMMAND = check_ccis_https!/production/CCIS3Service/CCIS3Service.svc!443!!!!!!
SERVICEDISPLAYNAME = xxxxx.com/production/CCIS3Service/CCIS3Service.svc
URL = https://xxxxx.com/production/CCIS3Servi ... ervice.svc
SOFT State....no action
------------post upgrade
Thu Aug 30 10:20:31 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
SOFT State....no action
Thu Aug 30 10:21:40 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = SOFT
SERVICESTATE = CRITICAL
HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
SOFT State....no action
Thu Aug 30 10:22:42 EDT 2018
HOSTNAME = Pasco-CCIS-Host
SERVICEDESC = Pasco-CCIS-Service-
SERVICESTATETYPE = HARD
SERVICESTATE = OK
HOST_ADDRESS = xxxxx
CHECK_COMMAND = check_ccis_https!/ClericusCCISService.asmx!45500!!!!!!
SERVICEDISPLAYNAME = https://xxxxx:45500/ClericusCCISService.asmx
URL = https://xxxxx:45500/ClericusCCISService.asmx
***** Enabling CCIS Endpoint Pasco-CCIS-Service- https://xxxxx:45500/ClericusCCISService.asmx in database PRODCCISLR1 CCIS
https://xxxxx:45500/ClericusCCISService.asmx has been set to 1
--SUCCESS
sending email
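One possible stopgap, sketched here under stated assumptions and not confirmed by Nagios support, is for the handler script itself to remember the last HARD state it acted on and to ignore a HARD OK unless a HARD CRITICAL was recorded first. The function name `should_act`, the `STATE_DIR` location, and the file naming are all hypothetical.

```shell
#!/bin/bash
# Hypothetical workaround sketch: track the last state the handler acted on
# and skip a HARD OK that was not preceded by a recorded HARD CRITICAL.
# STATE_DIR and the file naming scheme are assumptions for illustration.
STATE_DIR="${STATE_DIR:-/tmp/ccis_handler_state}"

# should_act HOSTNAME SERVICEDESC STATE STATETYPE
# Prints "enable", "disable", or "none" and updates the recorded state.
should_act() {
    local host="$1" svc="$2" state="$3" state_type="$4"
    local file last=""

    mkdir -p "$STATE_DIR" || return 1

    # One small file per host/service pair; unsafe characters become "_".
    file="$STATE_DIR/$(printf '%s__%s' "$host" "$svc" | tr -c 'A-Za-z0-9_.-' '_')"
    if [ -f "$file" ]; then
        last=$(cat "$file")
    fi

    # SOFT results never change anything.
    if [ "$state_type" != "HARD" ]; then
        echo "none"
        return 0
    fi

    if [ "$state" = "CRITICAL" ] && [ "$last" != "CRITICAL" ]; then
        echo "CRITICAL" > "$file"
        echo "disable"
    elif [ "$state" = "OK" ] && [ "$last" = "CRITICAL" ]; then
        echo "OK" > "$file"
        echo "enable"
    else
        # Covers the post-upgrade case: HARD OK after only SOFT CRITICAL results.
        echo "none"
    fi
}
```

Note that with no recorded state (for example on first run), a HARD OK is ignored, so an endpoint that was disabled before the state file existed would need to be enabled by hand.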
handler behavior has changed since upgrading to XI 5.5.3
Re: handler behavior has changed since upgrading to XI 5.5.3
I believe this is a bug that was introduced in Nagios Core 4.4.2. I filed the issue here:
https://github.com/NagiosEnterprises/na ... issues/575
and will be waiting on a response from our developers. Thank you!
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: handler behavior has changed since upgrading to XI 5.5.3
lmiltchev wrote:
I believe this is a bug that was introduced in Nagios Core 4.4.2. I filed the issue here:
https://github.com/NagiosEnterprises/na ... issues/575
and will be waiting on a response from our developers. Thank you!
Any news? We're getting a lot of unnecessary notifications because of this bug.
Re: handler behavior has changed since upgrading to XI 5.5.3
Our developers are aware of the issue. I don't have an ETA on a fix yet but as soon as we have a solution, it will be posted on GitHub.
Re: handler behavior has changed since upgrading to XI 5.5.3
Please let us know when a new Nagios Xi is released that fixes this issue.
Re: handler behavior has changed since upgrading to XI 5.5.3
"Please let us know when a new Nagios Xi is released that fixes this issue."
Sure, we can do that.
Also, you can keep an eye on the change log here:
https://www.nagios.com/downloads/nagios-xi/change-log/
and here:
https://github.com/NagiosEnterprises/na ... issues/575
I don't know yet how the developers will proceed with this. They may add a patch to the next release of XI or make a new Core release, and include it in XI. In any case, we will update the changelog. Thank you!
Re: handler behavior has changed since upgrading to XI 5.5.3
I noticed yesterday on your change log page that 2 new versions were released (5.5.4 and 5.5.5). I saw the following note for 5.5.4 and thought that it might be related to this issue:
•Fixed issue with Nagios Core notifications during downtime -SW
However, we applied 5.5.5 and the problematic behavior described in this post originally is not fixed. Please advise.
Re: handler behavior has changed since upgrading to XI 5.5.3
lmiltchev wrote:
"Please let us know when a new Nagios Xi is released that fixes this issue."
Sure, we can do that.
Also, you can keep an eye on the change log here:
https://www.nagios.com/downloads/nagios-xi/change-log/
and here:
https://github.com/NagiosEnterprises/na ... issues/575
I don't know yet how the developers will proceed with this. They may add a patch to the next release of XI or make a new Core release, and include it in XI. In any case, we will update the changelog. Thank you!
Do you have an updated ETA on a fix for this?
Thanks.
Re: handler behavior has changed since upgrading to XI 5.5.3
"Fixed issue with Nagios Core notifications during downtime -SW"
The issue I posted on GitHub (Issue #575) is not fixed yet. It is different from the issue with notifications during downtime.
Unfortunately, I don't have an ETA on the fix at this time.
Re: handler behavior has changed since upgrading to XI 5.5.3
lmiltchev wrote:
"Fixed issue with Nagios Core notifications during downtime -SW"
The issue I posted on github (Issue #575) is not fixed yet. It is different than the issue with notifications during downtime.
Unfortunately, I don't have an ETA on the fix at this time.
Do you know if this was fixed in 5.5.6?