Consistently inaccurate notifications

rgage_hhsc
Posts: 1
Joined: Fri Aug 13, 2021 1:52 pm

Post by rgage_hhsc

Our organization is using Nagios for monitoring several servers on several criteria -- one in particular has picked up a very confusing pattern.

The alert being triggered is a CPU Usage limit. The cause is a nightly maintenance task that drives CPU usage up for 10-20 minutes.
Nagios' usage graph very accurately portrays the situation:
[Image: Nagios CPU Usage graph]
What Nagios reports in its history is even more detailed, with six events every day in that spike-time:
[Image: Nagios 1-day History]
But I consistently get THE FIRST THREE notifications about this spike every day, and nothing else: WARNING, RECOVERY, WARNING, all within a few minutes of each other; then nothing until the next day when it happens again.

Judging solely by my emails from Nagios, there are a few seconds of recovery time each day amidst a CPU Warning event that has been going on for months. Looking at the graph above, this is obviously a false picture.
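For anyone who wants to cross-check the history against what was actually emailed, here is a minimal sketch that groups log entries for one service by type. It assumes the usual `[epoch] TYPE: field;field;...` layout of nagios.log entries; the host name `db01`, the contact `admin`, the timestamps, and the sample lines are all made up for illustration and are not from our system.

```python
import re
from collections import defaultdict

# Assumed nagios.log entry shape: "[epoch] TYPE: field;field;..."
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")

# Hypothetical sample lines (not real data from our server).
SAMPLE = [
    "[1628900000] SERVICE ALERT: db01;CPU Usage;WARNING;SOFT;1;CPU at 91%",
    "[1628900060] SERVICE ALERT: db01;CPU Usage;WARNING;HARD;3;CPU at 93%",
    "[1628900060] SERVICE NOTIFICATION: admin;db01;CPU Usage;WARNING;notify-service-by-email;CPU at 93%",
    "[1628900120] SERVICE FLAPPING ALERT: db01;CPU Usage;STARTED; Service appears to have started flapping",
    "[1628900180] SERVICE ALERT: web01;CPU Usage;WARNING;HARD;3;CPU at 95%",
]


def summarize(log_lines, host, service):
    """Group log entries for one host/service pair by entry type."""
    found = defaultdict(list)
    for line in log_lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a log entry
        ts, kind, fields = int(m.group(1)), m.group(2), m.group(3).split(";")
        if kind == "SERVICE ALERT" and fields[:2] == [host, service]:
            # assumed fields: host;service;state;state_type;attempt;output
            found["alerts"].append((ts, fields[2], fields[3]))
        elif kind == "SERVICE NOTIFICATION" and fields[1:3] == [host, service]:
            # assumed fields: contact;host;service;state;command;output
            found["notifications"].append((ts, fields[3]))
        elif kind == "SERVICE FLAPPING ALERT" and fields[:2] == [host, service]:
            # assumed fields: host;service;STARTED|STOPPED;message
            found["flapping"].append((ts, fields[2]))
    return dict(found)


if __name__ == "__main__":
    for kind, entries in summarize(SAMPLE, "db01", "CPU Usage").items():
        print(kind, len(entries), entries)
```

Comparing the alert list with the notification list for the spike window would show exactly which state changes produced an email and which did not, and whether any flapping entries were logged in between.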

This is not a critical worry; we know it's just one spike despite what Nagios' emails are telling us … but a bug's a bug, and Nagios can't be fixed unless it's reported. So, consider this reported. :-)

Thanks!
boB