check interval and alerting

Posted: Wed Aug 08, 2018 7:50 am
by DFaught
CPU utilization is checked every 5 minutes, for example, and I have a certain process on a certain server that I know runs for 30 minutes at 100% CPU. That is okay; in other words, I don't want an alert for this particular situation. How should that service be configured? An associate suggested changing the check interval to 30 minutes, but I don't think that is a good approach. Is this what the "First Notification Delay" is used for? What is the best way to do this?

Re: check interval and alerting

Posted: Wed Aug 08, 2018 7:54 am
by scottwilkerson
DFaught wrote:Is this what the "First Notification Delay" is used for?
This is exactly correct.

You can set this to something greater than 30 minutes. If the CPU drops before the delay is reached, no notifications will be sent.
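
As a rough illustration of the suggestion above, a service definition with the delay set might look like the sketch below. The host name, service description, check command, and template name are hypothetical placeholders, and the value 35 assumes the default time unit of one minute (interval_length=60).

```
define service {
    use                       generic-service       ; assumed template name
    host_name                 appserver01           ; hypothetical host
    service_description       CPU Utilization       ; hypothetical description
    check_command             check_nrpe!check_cpu  ; hypothetical command
    check_interval            5                     ; check every 5 minutes, as in the question
    first_notification_delay  35                    ; wait longer than the 30-minute job before notifying
}
```

The check still runs every 5 minutes, so the state history and graphs stay detailed. Only the first notification is held back.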

Re: check interval and alerting

Posted: Wed Aug 08, 2018 12:44 pm
by rexconsulting
Another option would be to set max_check_attempts to 6 or 7.

I would also comment that this is why CPU monitoring is a lot less useful than Load Average (or on Windows, Processor Queue Length) for tracking impact to system resources.

Re: check interval and alerting

Posted: Wed Aug 08, 2018 1:35 pm
by scottwilkerson
rexconsulting wrote:Another option would be to set max_check_attempts to 6 or 7.

I would also comment that this is why CPU monitoring is a lot less useful than Load Average (or on Windows, Processor Queue Length) for tracking impact to system resources.
One problem with this is that if you have a retry_interval of 1 minute, the notification would go out only about 12 minutes after the problem starts.

You can really accomplish it several ways.
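
To make the retry_interval point concrete, here is a minimal sketch of the arithmetic. It assumes a simplified model in which the first failed check is attempt 1 and the remaining attempts run at retry_interval. The function name is hypothetical, and real scheduling jitter and check execution time are ignored.

```python
def minutes_until_hard_state(check_interval, retry_interval, max_check_attempts):
    """Return (best_case, worst_case) minutes from problem start to the HARD state."""
    # After the first failed check, the remaining attempts run at retry_interval.
    retries = (max_check_attempts - 1) * retry_interval
    # Best case: the problem starts just before a scheduled check.
    # Worst case: it starts just after one, so a full check_interval passes first.
    return retries, check_interval + retries


# 5-minute checks, 1-minute retries, 7 attempts: roughly 6 to 11 minutes.
print(minutes_until_hard_state(check_interval=5, retry_interval=1, max_check_attempts=7))

# Retries at 5 minutes instead: roughly 30 to 35 minutes.
print(minutes_until_hard_state(check_interval=5, retry_interval=5, max_check_attempts=7))
```

Under this model, max_check_attempts only covers a 30-minute job when retry_interval is raised as well, which is one reason the notification delay is simpler to reason about.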

Re: check interval and alerting

Posted: Thu Aug 09, 2018 3:32 pm
by rexconsulting
Yeah. Actually I really like first_notification_delay now. I hadn't really used it before. Thanks Scott!

Re: check interval and alerting

Posted: Thu Aug 09, 2018 3:34 pm
by scottwilkerson
@DFaught

Can we mark this as resolved?