NRPE Disk Check sending out Recovery alerts for no reason

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

NRPE Disk Check sending out Recovery alerts for no reason

Post by rferebee »

Hello,

One of our Nagios XI alert recipients has reported that some of their Linux servers are sending out RECOVERY Service Alert notifications for Disk Checks at random.

We cannot find anything in XI that would be causing these alerts to be sent out since the service never goes CRITICAL. There shouldn't be anything for the service to recover from as far as we can tell.

Is there anything else I can look at to figure out what is causing these RECOVERY messages to be sent out?

Thank you.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NRPE Disk Check sending out Recovery alerts for no reason

Post by scottwilkerson »

What version of XI are you running?

There was a bug prior to 5.5.7 that could cause this behavior.
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
rferebee

Re: NRPE Disk Check sending out Recovery alerts for no reason

Post by rferebee »

We just upgraded to 5.5.7 on Tuesday and it occurred again yesterday (Wednesday). We were having the issue prior to the upgrade.
scottwilkerson

Re: NRPE Disk Check sending out Recovery alerts for no reason

Post by scottwilkerson »

rferebee wrote: We just upgraded to 5.5.7 on Tuesday and it occurred again yesterday (Wednesday). We were having the issue prior to the upgrade.
The issue was caused by the notification number not being reset when hosts/services go back into an OK state.

It is possible that you could still get a few recoveries until all of them have cycled.
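
To see which services might still be carrying a stale notification number, something like the sketch below could help. It scans a copy of retention.dat for service entries that are in an OK state but still have a non-zero notification number. The field names (current_state, current_notification_number, host_name, service_description) and the "service { ... }" block layout are assumptions based on the usual Nagios Core retention file format, so check them against your own file first. The helper names are made up for illustration.

```python
import sys
from pathlib import Path


def parse_blocks(text, block_type="service"):
    """Yield one dict per `block_type { ... }` block of key=value lines."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == f"{block_type} {{":
            current = {}
        elif line == "}":
            if current is not None:
                yield current
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value


def stale_ok_services(text):
    """Return (host, service, number) for OK services with a non-zero notification number."""
    stale = []
    for block in parse_blocks(text):
        number = block.get("current_notification_number", "0")
        # current_state=0 is assumed to mean OK
        if block.get("current_state") == "0" and number.isdigit() and int(number) > 0:
            stale.append((block.get("host_name"), block.get("service_description"), int(number)))
    return stale


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to (a copy of) retention.dat as the first argument.
    text = Path(sys.argv[1]).read_text(errors="replace")
    for host, service, number in stale_ok_services(text):
        print(f"{host} / {service}: current_notification_number={number}")
```

Run it against a copy of the file rather than the live one. An empty result would suggest the counters have already cycled.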

The only way around this would be to stop Nagios, remove the retention.dat file, and restart Nagios. This does, however, have the side effect of losing potential flapping data and comment history.
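
As a rough sketch of that stop, remove, restart sequence: the function below moves retention.dat aside to a timestamped backup instead of deleting it outright, so the old state can be inspected afterwards. That is a deliberate variation on the plain removal, not something the fix requires. The systemctl commands and the "nagios" service name are assumptions (older systems may use a different service manager), and reset_retention is a made-up helper name. The file path depends on your install.

```python
import shutil
import subprocess
import time
from pathlib import Path


def reset_retention(retention_path, run=subprocess.run, service="nagios"):
    """Stop the service, move retention.dat aside as a backup, then start it again."""
    retention = Path(retention_path)
    # Assumption: the service is managed by systemd under the name given in `service`.
    run(["systemctl", "stop", service], check=True)
    backup = None
    try:
        if retention.exists():
            stamp = time.strftime("%Y%m%d%H%M%S")
            backup = retention.with_name(f"{retention.name}.{stamp}.bak")
            shutil.move(str(retention), str(backup))
    finally:
        # Always try to bring the service back up, even if the move failed.
        run(["systemctl", "start", service], check=True)
    return backup
```

It would need to run as a user allowed to stop and start the service, with retention_path set to wherever state_retention_file points in your nagios.cfg.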
rferebee

Re: NRPE Disk Check sending out Recovery alerts for no reason

Post by rferebee »

Thank you, you can lock this thread.
scottwilkerson

Re: NRPE Disk Check sending out Recovery alerts for no reason

Post by scottwilkerson »

rferebee wrote: Thank you, you can lock this thread.
Great!

Locking