Recovery does not send notification, part 2

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Bitflogger
Posts: 226
Joined: Mon Oct 16, 2017 9:24 am

Post by Bitflogger »

Attachment: doitnumber_problem.png
Hello,

I'm running Nagios XI v5.6.7 on a 64-bit CentOS 7 VM.

I have a previous case titled "Recovery does not send notification".

I opened a GitHub issue on that: https://github.com/NagiosEnterprises/na ... issues/651

It looks like they consider it fixed in Nagios Core 4.4.4, and they closed the issue.

According to the change log for v5.6.7, you are using Nagios Core 4.4.5.

I have a new occurrence of the same problem. I tried submitting a passive "OK" result, but the service stayed in a SOFT state.

Earl
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Recovery does not send notification, part 2

Post by scottwilkerson »

If the service is in a SOFT CRITICAL state and you send an OK, the result is a soft recovery, and soft recoveries do not send notifications.

This can happen in two cases: the service has not reached a HARD state yet, or the HARD CRITICAL was reached early, for example at attempt 1/3, because the host was down.

This is normal and expected behavior.
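
To illustrate the soft/hard distinction, here is a minimal Python sketch of how a service moves between SOFT and HARD states. It is a simplified model, not Nagios Core's actual code: the `ServiceState` class and its `process` method are hypothetical names, and the model ignores host state, flapping, and volatile services.

```python
from dataclasses import dataclass

OK = "OK"
CRITICAL = "CRITICAL"


@dataclass
class ServiceState:
    """Simplified, hypothetical model of a service's SOFT/HARD state."""

    max_check_attempts: int = 3
    state: str = OK
    state_type: str = "HARD"
    attempt: int = 1

    def process(self, result: str) -> str:
        """Apply one check result and return the kind of event it produces."""
        if result == OK:
            if self.state == OK:
                return "ok"
            # Recovery: it only counts as a hard recovery if the problem
            # had already reached a HARD state.
            was_hard = self.state_type == "HARD"
            self.state, self.state_type, self.attempt = OK, "HARD", 1
            return "hard recovery" if was_hard else "soft recovery"

        if self.state == OK:
            self.attempt = 1  # first failing check
        elif self.state_type == "SOFT":
            self.attempt += 1  # retry while the problem is still soft
        else:
            # Already a hard problem: no new state change.
            self.state = result
            return "still hard problem"

        self.state = result
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"
            return "hard problem"
        self.state_type = "SOFT"
        return "soft problem"


svc = ServiceState(max_check_attempts=3)
print([svc.process(r) for r in (CRITICAL, OK)])
# prints ['soft problem', 'soft recovery']
```

In this model only "hard problem" and "hard recovery" would be candidates for a notification, which is why an OK that arrives at attempt 1/3 or 2/3 produces no recovery e-mail.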
Former Nagios employee

Re: Recovery does not send notification, part 2

Post by Bitflogger »

Attachment: doitnumber_problem2.png
Hello,

There was a critical e-mail, but no recovery e-mail.

Earl

Re: Recovery does not send notification, part 2

Post by scottwilkerson »

What happened to the host after 1:24? This is important to understand.

Can you share the host and service definitions along with any underlying templates? (obfuscating any sensitive info)

Re: Recovery does not send notification, part 2

Post by Bitflogger »

Attachment: doitnumber_problem3.png
I'm unsure of the first question, the server shows green for the service when I did the screen prints. You can see the duration in the attached screen print.

###############################################################################
#
# Hosts configuration file
#
# Created by: Nagios CCM 2.7.2
# Date: 2018-11-02 15:37:37
# Version: Nagios Core 4.x
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios CCM will overwrite all manual settings during the next update if you
# would like to edit files manually, place them in the 'static' directory or
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################

define host {
    host_name               x*****.doit.wisc.edu
    use                     windows-server
    address                 x****.doit.wisc.edu
    max_check_attempts      1
    check_interval          5
    retry_interval          1
    active_checks_enabled   1
    contact_groups          se-windows,sm_alerts_monitoring
    notification_period     win_wed_0300
    register                1
}

###############################################################################
#
# Hosts configuration file
#
# END OF FILE
#
###############################################################################

###############################################################################
#
# Services configuration file
#
# Created by: Nagios CCM 3.0.3
# Date: 2019-10-21 17:05:04
# Version: Nagios Core 4.x
#
# --- DO NOT EDIT THIS FILE BY HAND ---
# Nagios CCM will overwrite all manual settings during the next update if you
# would like to edit files manually, place them in the 'static' directory or
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################

define service {
    host_name               x*****.doit.wisc.edu
    service_description     w_vugen_doitnumber
    use                     generic-vugen-service
    servicegroups           _vugen_messages
    check_command           check_nscp_vugen!read_vugen -a s=$_SERVICESCRIPT$!!!!!!!
    check_interval          5
    notification_interval   5760
    notification_period     win_wed_0300
    notification_options    w,c,r,
    contact_groups          93fe6121d79bee3ee5e3ee4f55863056d0530bcded,ovo-ss-vugen
    _alert_description      Check doitnumber web page
    _ci                     (ci:93fe6121d79bee3ee5e3ee4f55863056d0530bcded) (cdwh)
    _script                 doitnumber
    _support_info           https://a***.fina****.doit.wisc.edu
    register                1
}

define host {
    name                    windows-server
    hostgroups              _windows
    check_command           check-host-alive!!!!!!!!
    use                     generic-host
    max_check_attempts      10
    check_interval          5
    retry_interval          1
    active_checks_enabled   1
    check_period            24x7
    contact_groups          se-windows
    notification_period     24x7
    notification_options    d,r,s,
    notifications_enabled   1
    register                0
}

define host {
    name                            generic-host
    check_command                   check-host-alive!!!!!!!!
    max_check_attempts              3
    check_interval                  5
    retry_interval                  1
    active_checks_enabled           1
    check_period                    24x7
    event_handler_enabled           1
    flap_detection_enabled          0
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    notification_interval           120
    notification_period             24x7
    notification_options            d,u,r,s,
    notifications_enabled           1
    register                        0
}

define service {
    name                    generic-vugen-service
    use                     generic-service
    max_check_attempts      3
    retry_interval          5
    process_perf_data       1
    notification_interval   5760
    notification_period     24x7
    notification_options    w,c,r,
    notifications_enabled   1
    register                0
}

define service {
    name                            generic-service
    service_description             generic-service
    is_volatile                     0
    max_check_attempts              3
    check_interval                  5
    retry_interval                  1
    active_checks_enabled           1
    passive_checks_enabled          1
    check_period                    24x7
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    event_handler_enabled           1
    flap_detection_enabled          0
    process_perf_data               0
    retain_status_information       1
    retain_nonstatus_information    1
    notification_interval           60
    notification_period             24x7
    notification_options            w,c,u,r,
    notifications_enabled           1
    register                        0
}

define timeperiod {
    timeperiod_name     win_wed_0300
    alias               Automatic Updates - Wednesday - 03:00
    sunday              00:00-24:00
    monday              00:00-24:00
    tuesday             00:00-24:00
    wednesday           00:00-02:55,03:40-24:00
    thursday            00:00-24:00
    friday              00:00-24:00
    saturday            00:00-24:00
}

define timeperiod {
    timeperiod_name     24x7
    alias               24 Hours A Day,7 Days A Week
    saturday            00:00-24:00
    friday              00:00-24:00
    thursday            00:00-24:00
    wednesday           00:00-24:00
    tuesday             00:00-24:00
    monday              00:00-24:00
    sunday              00:00-24:00
}

Re: Recovery does not send notification, part 2

Post by scottwilkerson »

Bitflogger wrote: I'm unsure of the first question, the server shows green for the service when I did the screen prints. You can see the duration in the attached screen print.

I was referring to the host for this service. If you look at the state history for the host, did the host go down at all?

Re: Recovery does not send notification, part 2

Post by Bitflogger »

Hello,

We are checking a web page, using an intermediate server that makes many similar checks of other web pages.

We had a firewall fail-over and reset last night.

The Nagios XI server has been up 29 days:

uptime
14:21:59 up 29 days, 10:09, 6 users, load average: 1.46, 1.58, 1.64

Earl

Re: Recovery does not send notification, part 2

Post by scottwilkerson »

scottwilkerson wrote: I was referring to the host for this service. If you look at the state history for the host, did the host go down at all?

If you look at the state history for x*****.doit.wisc.edu, did the host go down at all during this time?

Re: Recovery does not send notification, part 2

Post by Bitflogger »

Hello,

I don't see the significance of that, since the state change is on the NagiosXI server.

The x**** server uptime probe reports: uptime: 4w 2d 10:8h, boot: 2019-10-09 08:38:34 (UTC)

Earl

Re: Recovery does not send notification, part 2

Post by scottwilkerson »

Bitflogger wrote: I don't see the significance of that, since the state change is on the NagiosXI server.

The significance is that Nagios doesn't send notifications for services when their host is in a non-UP state.
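
As a rough illustration of that rule, the sketch below models a few of the filters a service notification has to pass. It is a deliberately simplified, hypothetical model: the function names `parse_options` and `should_notify_service` are made up for this example, and real Nagios Core applies more checks than these (notification periods and scheduled downtime, for instance).

```python
def parse_options(raw: str) -> set:
    """Turn a notification_options value such as 'w,c,r,' into a set."""
    return {opt.strip() for opt in raw.split(",") if opt.strip()}


def should_notify_service(host_state: str, state_type: str,
                          event: str, options: set) -> bool:
    """Return True if this simplified model would send the notification.

    event is a notification_options letter: 'w', 'c', 'u' or 'r'.
    """
    if host_state != "UP":
        return False  # host is not UP: service notifications are held back
    if state_type != "HARD":
        return False  # soft states do not notify
    return event in options  # the event must be enabled for the service


opts = parse_options("w,c,r,")
print(should_notify_service("UP", "HARD", "r", opts))    # True
print(should_notify_service("DOWN", "HARD", "r", opts))  # False
print(should_notify_service("UP", "SOFT", "r", opts))    # False
```

Under these assumptions, a recovery that arrives while the host is recorded as DOWN, or while the service is still SOFT, never reaches the contacts even though `r` is listed in `notification_options`.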