check interval and alerting

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
DFaught
Posts: 62
Joined: Tue Sep 26, 2017 12:50 pm

check interval and alerting

Post by DFaught »

Suppose CPU utilization is checked every 5 minutes, and a particular process on a particular server is known to run at 100% CPU for 30 minutes. That is expected, so I don't want an alert for it. How should that service be configured? An associate suggested changing the check interval to 30 minutes, but I don't think that is a good approach. Is this what the "First Notification Delay" is used for? What is the best way to do this?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: check interval and alerting

Post by scottwilkerson »

DFaught wrote: Is this what the "First Notification Delay" is used for?
This is exactly correct.

You can set this to something greater than 30 minutes. If the CPU drops back down before the delay is reached, no notifications will be sent.
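To make the effect of the delay concrete, here is a minimal sketch in Python. It is a simplified model, not how Nagios is implemented: it ignores soft and hard states and the exact moment from which Nagios starts counting the delay. The function name and the 35-minute value are assumptions chosen for illustration.

```python
def notification_sent(problem_minutes: float, first_notification_delay: float) -> bool:
    """Simplified model: a notification goes out only if the problem is
    still present once the first notification delay has elapsed."""
    return problem_minutes > first_notification_delay


# Hypothetical delay of 35 minutes, i.e. "something greater than 30".
delay = 35

# The known 30-minute CPU spike ends before the delay is reached.
print(notification_sent(30, delay))

# A spike that lasts longer than the delay still produces a notification.
print(notification_sent(50, delay))
```

Under this model the check interval can stay at 5 minutes, so the service is still sampled often; only the notification is held back.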
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
rexconsulting
Posts: 60
Joined: Fri May 04, 2012 4:27 pm
Location: Oakland, CA

Re: check interval and alerting

Post by rexconsulting »

Another option would be to set max_check_attempts to 6 or 7.

I would also add that this is why CPU monitoring is a lot less useful than Load Average (or, on Windows, Processor Queue Length) for tracking impact on system resources.
CP
--
Chris Paul
Rex Consulting, Inc
5652 Florence Terrace, Oakland, CA 94611
web: http://www.rexconsulting.net
phone, toll-free: +1 (888) 403-8996 ext 1

Re: check interval and alerting

Post by scottwilkerson »

rexconsulting wrote: Another option would be to set max_check_attempts to 6 or 7.

I would also add that this is why CPU monitoring is a lot less useful than Load Average (or, on Windows, Processor Queue Length) for tracking impact on system resources.
One problem with this is that, with a retry_interval of 1 minute, you would only be about 12 minutes out before the notification is sent.

You can really accomplish this several ways.
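The timing concern with max_check_attempts can be sketched with a little arithmetic. This is a rough model under stated assumptions, not Nagios's actual scheduler: the first failing check counts as attempt 1, each further attempt comes one retry_interval later, and the problem may start up to one check_interval before the first failing check. The function names are hypothetical; the 5-minute and 1-minute values come from the thread.

```python
def minutes_to_hard_state(max_check_attempts: int, retry_interval: float) -> float:
    """Minutes from the first failing check until the last retry,
    assuming the first failing check counts as attempt 1."""
    return (max_check_attempts - 1) * retry_interval


def worst_case_minutes(check_interval: float,
                       max_check_attempts: int,
                       retry_interval: float) -> float:
    """Upper bound measured from the start of the problem: it may begin
    just after an OK check, so add one full check_interval."""
    return check_interval + minutes_to_hard_state(max_check_attempts, retry_interval)


# Values from the thread: 5-minute checks, 1-minute retries, 6 or 7 attempts.
for attempts in (6, 7):
    print(attempts,
          minutes_to_hard_state(attempts, 1),
          worst_case_minutes(5, attempts, 1))
```

With these numbers the notification would fire roughly ten minutes into the spike, well short of the 30 minutes needed. In this model, covering 30 minutes with 7 attempts would require a retry_interval of 5 minutes.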

Re: check interval and alerting

Post by rexconsulting »

Yeah. Actually, I really like first_notification_delay now. I hadn't really used it before. Thanks, Scott!

Re: check interval and alerting

Post by scottwilkerson »

@DFaught

Can we mark this as resolved?