We are using the macros SERVICEPROBLEMID and LASTSERVICEPROBLEMID as parameters in our notification action script.
On older versions of Nagios, when a check went from CRITICAL to OK, SERVICEPROBLEMID was 0 and LASTSERVICEPROBLEMID was set to the problem ID associated with the issue.
In version 4.4.6, however, when a service goes from CRITICAL to OK, the values are swapped: our script gets LASTSERVICEPROBLEMID as "0" and SERVICEPROBLEMID set to the problem ID.
I checked the source code. In base/checks.c, around line 1620, last_problem_id is set to current_problem_id and current_problem_id is reset to 0L. But since the notification has already been sent at line 1582, the notification script is called with the wrong values.
I would suggest setting last_problem_id in the function service_state_or_hard_state_type_change instead, as is already done for host state changes.
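To make the ordering issue concrete, here is a minimal standalone sketch. It is not Nagios code: the struct, the function names (svc_ids, seen_macros, notify, clear_problem_id, recover_notify_first, recover_clear_first) and the example ID 42 are made up for illustration. Only the two assignments in clear_problem_id mirror what checks.c does.

```c
/* Hypothetical stand-in for the two relevant fields of the service struct. */
struct svc_ids {
    unsigned long current_problem_id;
    unsigned long last_problem_id;
};

/* What a notification script would receive as macro values. */
struct seen_macros {
    unsigned long serviceproblemid;
    unsigned long lastserviceproblemid;
};

/* Snapshot the IDs at the moment the notification goes out. */
static struct seen_macros notify(const struct svc_ids *s) {
    struct seen_macros m = { s->current_problem_id, s->last_problem_id };
    return m;
}

/* The two assignments performed on recovery to OK. */
static void clear_problem_id(struct svc_ids *s) {
    s->last_problem_id = s->current_problem_id;
    s->current_problem_id = 0L;
}

/* Order as I read it in 4.4.6: notify first, clear afterwards. */
static struct seen_macros recover_notify_first(struct svc_ids s) {
    struct seen_macros m = notify(&s);
    clear_problem_id(&s);
    return m;
}

/* Suggested order: clear first, then notify. */
static struct seen_macros recover_clear_first(struct svc_ids s) {
    clear_problem_id(&s);
    return notify(&s);
}
```

For a service that starts with current_problem_id 42 and last_problem_id 0, recover_notify_first hands the script SERVICEPROBLEMID=42 and LASTSERVICEPROBLEMID=0, while recover_clear_first hands it 0 and 42, which is what the older versions delivered.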
Here's my suggestion:
*** checks.c 2020-04-28 22:48:29.000000000 +0200
--- checks.c.new 2021-10-06 11:15:05.569660662 +0200
***************
*** 892,897 ****
--- 892,904 ----
next_problem_id++;
}
+ /* clear the problem id when transitioning from a problem state to an OK state */
+ if(svc->current_state == STATE_OK) {
+ svc->last_problem_id = svc->current_problem_id;
+ svc->current_problem_id = 0L;
+ }
+
+
svc->state_type = SOFT_STATE;
state_or_type_change = TRUE;
***************
*** 1618,1625 ****
if (svc->current_state == STATE_OK && state_change == TRUE) {
/* Problem state starts regardless of SOFT/HARD status. */
! svc->last_problem_id = svc->current_problem_id;
! svc->current_problem_id = 0L;
/* Reset attempts */
if (hard_state_change == TRUE) {
--- 1625,1633 ----
if (svc->current_state == STATE_OK && state_change == TRUE) {
/* Problem state starts regardless of SOFT/HARD status. */
! /* Already set in service_state_or_hard_state_type_change */
! /* svc->last_problem_id = svc->current_problem_id; */
! /* svc->current_problem_id = 0L; */
/* Reset attempts */
if (hard_state_change == TRUE) {
Andreas