check interval and alerting
Suppose CPU utilization is checked every 5 minutes, and I have a certain process on a certain server that I know runs for 30 minutes at 100% CPU, and that is okay. In other words, I don't want an alert for this particular situation. How should that service be configured? An associate suggested changing the check interval to 30 minutes, but I don't think that is a good approach. Is this what the "First Notification Delay" is used for? What is the best way to do this?
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: check interval and alerting
DFaught wrote: Is this what the "First Notification Delay" is used for?
This is exactly correct.
You can set this to something greater than 30 minutes. If the CPU drops before the delay is reached, no notifications will be sent.
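As a rough sketch of what that could look like in a service definition: the host name, service description, check command, and the `generic-service` template below are placeholders and not taken from this thread, and the values assume the default `interval_length` of 60 seconds, so they read as minutes.

```
define service {
    use                       generic-service        ; assumed template supplying the other required directives
    host_name                 batch-server-01        ; hypothetical host
    service_description       CPU Utilization        ; hypothetical description
    check_command             check_nrpe!check_cpu   ; hypothetical check command
    check_interval            5                      ; keep checking every 5 minutes
    first_notification_delay  35                     ; hold the first notification for 35 minutes
}
```

Checks still run every 5 minutes here, so the check history stays detailed; only the first notification is held back.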
- rexconsulting
- Posts: 60
- Joined: Fri May 04, 2012 4:27 pm
- Location: Oakland, CA
- Contact:
Re: check interval and alerting
Another option would be to set max_check_attempts to 6 or 7.
I would also comment that this is why CPU monitoring is a lot less useful than Load Average (or on Windows, Process Queue Depth) for tracking impact to system resources.
CP
--
Chris Paul
Rex Consulting, Inc
5652 Florence Terrace, Oakland, CA 94611
email: [email protected]
web: http://www.rexconsulting.net
phone, toll-free: +1 (888) 403-8996 ext 1
-
scottwilkerson
Re: check interval and alerting
rexconsulting wrote: Another option would be to set max_check_attempts to 6 or 7.
I would also comment that this is why CPU monitoring is a lot less useful than Load Average (or on Windows, Process Queue Depth) for tracking impact to system resources.
One problem with this is if you have a retry_interval of 1 minute, you would only be about 12 minutes out before notification.
You can really accomplish it several ways.
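For comparison, the max_check_attempts route might be sketched as follows. This is an illustration only: the host name, service description, check command, and the `generic-service` template are placeholders, and the values assume the default `interval_length` of 60 seconds. The idea is that a service reaches a hard state, and notifies, only after max_check_attempts consecutive non-OK results, with rechecks spaced by retry_interval, so the retry spacing has to be stretched for the window to approach 30 minutes.

```
define service {
    use                  generic-service        ; assumed template supplying the other required directives
    host_name            batch-server-01        ; hypothetical host
    service_description  CPU Utilization        ; hypothetical description
    check_command        check_nrpe!check_cpu   ; hypothetical check command
    check_interval       5                      ; normal checks every 5 minutes
    retry_interval       5                      ; rechecks while in a soft non-OK state
    max_check_attempts   7                      ; 1 initial failure + 6 retries x 5 min = roughly 30 minutes to a hard state
}
```

With these numbers the window is borderline for a job that runs a full 30 minutes, so one of the two values would likely need to be raised for some margin.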
- rexconsulting
Re: check interval and alerting
Yeah. Actually, I really like first_notification_delay now. I hadn't really used it before. Thanks, Scott!
CP