Hello Nagios Support Team,
We are currently experiencing a large number of false positive alerts in our Nagios XI environment and would appreciate your assistance in analyzing and resolving this issue.
Issue Description:
Nagios XI is triggering alerts (WARNING/CRITICAL) for several hosts and services, even though manual checks show they are operating normally.
The problem seems to occur intermittently, and in many cases, the next check returns to an OK state without any changes to the system.
This is resulting in alert fatigue and reducing the reliability of the monitoring system.
Environment Details:
Nagios XI Version: 2024R1.4.1
Operating System: RHEL 9
Check Intervals: Check: 5 min, Retry: 1 min, Max attempts: 5
Notable Observation: Sometimes the plugin execution time is longer than expected (e.g., >10 seconds)
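One way to put numbers on the slow executions is to time a plugin run by hand, outside of Nagios XI. The following is only a sketch: `time_plugin` is a hypothetical helper, not part of Nagios, and the plugin path and target address in the usage note are assumptions that would have to match the actual installation.

```python
import subprocess
import time


def time_plugin(argv, timeout=30.0):
    """Run a plugin command and return (exit_code, seconds, first_output_line).

    Hypothetical helper for illustration only.
    exit_code is None if the command did not finish within `timeout` seconds.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        code = proc.returncode
        lines = proc.stdout.splitlines()
        first = lines[0] if lines else ""
    except subprocess.TimeoutExpired:
        code, first = None, "timed out"
    elapsed = time.perf_counter() - start
    return code, elapsed, first
```

For example, `time_plugin(["/usr/local/nagios/libexec/check_icmp", "-H", "192.0.2.10"])` (path and address assumed) would return the plugin's exit code, where 0, 1, 2, and 3 mean OK, WARNING, CRITICAL, and UNKNOWN, together with the wall-clock time the run took.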
Request:
Could you help us analyze the possible root causes for these intermittent false positives in Nagios XI, especially for check_icmp?
We can provide detailed logs, plugin debug outputs, and screenshots if required.
Trigger alerts host problem
Re: Trigger alerts host problem
Chances are the alerts are valid but the issue clears by the time testing is done.
You need to provide the service, command and text of the alerts.
Is the perceived issue alerts or notifications? Alerts just change the state of the object; notifications are typically email or text.
Re: Trigger alerts host problem
Hi, would you please provide an email address so we can discuss the case? We have some screen captures to share.
Thanks.
Re: Trigger alerts host problem
No.
Read the first announcement at the top of the page.
Re: Trigger alerts host problem
Hi @vtcedu,
Per @kg2857's advice, more information is needed to troubleshoot the issues you're having with false positives. If you're seeing alerts from services that appear to be up when you check them, it's possible that the fix for this is adjusting your retry interval and max attempts so that your checks only trigger alerts if the issue is sustained enough to be considered a real issue for your environment.
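To illustrate how retry interval and max attempts filter out short blips, here is a simplified Python model of the soft/hard state logic. It is a sketch, not Nagios code: `first_hard_problem` and `minutes_until_hard` are hypothetical names, and the model assumes that a problem becomes HARD only after max attempts consecutive non-OK results, with any OK result resetting the count.

```python
def first_hard_problem(results, max_attempts):
    """Return the index of the check result that produces a HARD problem, or None.

    results: booleans in check order, True meaning the check returned OK.
    Simplified model: consecutive non-OK results are counted, and any OK
    result resets the counter before a HARD state is reached.
    """
    failures = 0
    for i, ok in enumerate(results):
        if ok:
            failures = 0
            continue
        failures += 1
        if failures >= max_attempts:
            return i
    return None


def minutes_until_hard(retry_interval_min, max_attempts):
    """Minutes from the first failed check to the HARD state.

    The first failure is attempt 1; the remaining attempts are retries
    spaced retry_interval_min apart.
    """
    return (max_attempts - 1) * retry_interval_min
```

With the values from the original post (retry 1 min, max attempts 5), `minutes_until_hard(1, 5)` gives 4, so in this model a failure that clears on the next retry never reaches a HARD state. The soft state changes would still be recorded as alerts, which is why it matters whether the complaint is about alerts or notifications.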
If you need more directed support with sensitive details, you'll most likely need to open a support case. You can do that here.