Hi,
Small question: how do you define thresholds on monitored values, and when do you decide to send an alert? I'm trying to set my values so I don't get spammed by alerts but can still be proactive toward end users.
Most service tests are for CPU, memory and disk space.
We also monitor Citrix as a user environment.
We're also running Exchange and SQL (mssql, msql, oracle).
Thank you.
Monitoring thresholds and alerts
Re: Monitoring thresholds and alerts
The Notification Interval in the service or host definition, under Configure (top menu) -> Core Config Manager, sets how long to wait before a new notification is sent for a problem that is still ongoing. Click on the Alert tab to find it.
There's also First notification delay to delay sending out the first notification.
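To get a feel for how these two settings interact, here is a rough sketch in Python. It is a simplified model, not how Nagios schedules notifications internally: it ignores check intervals, retries, and notification time periods. The function name and the treatment of an interval of 0 as "notify once" are assumptions made for illustration; all values are in minutes.

```python
def notification_times(problem_minutes, first_notification_delay=0, notification_interval=60):
    """Return the minutes (since the problem began) at which alerts would go out.

    Simplified model: one alert once the first-notification delay has passed,
    then one more every notification_interval minutes while the problem lasts.
    """
    if problem_minutes < first_notification_delay:
        return []  # the problem cleared before the first alert was due
    times = [first_notification_delay]
    if notification_interval <= 0:
        return times  # assumption: an interval of 0 means "notify once only"
    next_time = first_notification_delay + notification_interval
    while next_time <= problem_minutes:
        times.append(next_time)
        next_time += notification_interval
    return times


# A 3-minute CPU spike with a 5-minute delay never alerts;
# a 130-minute outage alerts at 5, 65 and 125 minutes.
print(notification_times(3, first_notification_delay=5))
print(notification_times(130, first_notification_delay=5))
```

A longer first-notification delay filters out short spikes, while a longer notification interval cuts down on repeat alerts for a problem you already know about.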
Also, most checks accept "critical" and "warning" thresholds with lower and upper bounds in the form [LOWER]:[UPPER]. For example, with critical=1:5, any value outside the range 1-5 inclusive (such as 0 or 7) is considered "critical." When UPPER is empty, it's assumed to be infinity. Likewise, when LOWER is empty, it's assumed to be 0.
Most of the time, you can disable the critical and warning thresholds by simply not specifying them. But if you're dealing with a check script that falls back to a default critical or warning threshold when none is specified, you can use critical=0: to keep the check from ever going critical.
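The range rule above can be sketched in a few lines of Python. This is only an illustration of the [LOWER]:[UPPER] logic as described here, not the code any particular plugin uses. The function name is made up, treating a bare number as 0:N is an assumption, and other range syntax some plugins accept is not handled.

```python
import math


def outside_range(value, threshold):
    """Return True when value falls outside the [LOWER]:[UPPER] range."""
    if ":" in threshold:
        lower_text, upper_text = threshold.split(":", 1)
    else:
        # Assumption: a bare number N is shorthand for 0:N.
        lower_text, upper_text = "", threshold
    lower = float(lower_text) if lower_text else 0.0        # empty LOWER -> 0
    upper = float(upper_text) if upper_text else math.inf   # empty UPPER -> infinity
    return value < lower or value > upper


print(outside_range(0, "1:5"))    # True: below the range
print(outside_range(7, "1:5"))    # True: above the range
print(outside_range(3, "1:5"))    # False: inside the range
print(outside_range(100, "0:"))   # False: 0: never trips for non-negative values
```

The last call shows why critical=0: effectively disables the critical state for metrics such as CPU percentage, which never go below zero.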
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
-
bramassendorp
- Posts: 28
- Joined: Sun Jun 09, 2019 3:16 am
Re: Monitoring thresholds and alerts
Hi,
Thank you for replying, but I was more interested in how other users are setting this up. For example, when exactly do you alert on CPU tests: at 100% or at 95%? And how long should the value stay there before alerting: 1 minute, or 5 minutes?
I'm looking for best practices and examples of how others have set this up in Nagios.
Re: Monitoring thresholds and alerts
It takes some finesse and tuning, and ultimately knowledge of what constitutes an error. Every monitoring situation is different: CPU usage on a web server means something completely different from CPU usage on a Windows workstation, which in turn means something different from CPU usage on a server that runs CPU-intensive compiler jobs.
It can be useful to even turn off critical and warning thresholds for CPU usage, and just use the monitoring metric to cross-reference times when the server had an error or became unresponsive.
bramassendorp wrote: I'm looking for best practices and examples how other have set this up in Nagios.
Perhaps you'll want to ask this question over on the Community Support Forum. If you have any specific questions about how to change your thresholds or notifications, I can help.