
Alert threshold times

Posted: Tue Aug 18, 2015 1:12 pm
by CalumH93
Hi,

I'm hoping someone can point me in the right direction.

I'm looking to have Nagios alert only if a threshold has been exceeded for x amount of time. For example, CPU usage at 80% should report as OK for the first 2 hours (while still being checked every 15 minutes or so); if after 2 hours it is still at 80% or higher, it should then alert as warning/critical.

This is to cut down on the amount of false issues being raised.

I have tried playing about with notification periods, check_interval and retry_interval, but they don't quite do what I'm hoping for. I can get it so it doesn't send an email alert for the right time (for example, not sending an email until the check has failed x amount of times); however, it will still show on the Nagios web monitoring page as a warning/critical alert, which is the main thing I am trying to avoid.

I'm hoping something can be added to the check arguments or definition; however, after a week of researching I'm still nowhere.

If anyone can point me to where to start I would be grateful!

Thanks :)

Re: Alert threshold times

Posted: Tue Aug 18, 2015 2:53 pm
by jdalrymple
CalumH93 wrote: I can get it so it doesn't send an email alert for the right time (for example, not sending an email until the check has failed x amount of times); however, it will still show on the Nagios web monitoring page as a warning/critical alert, which is the main thing I am trying to avoid.
The UI updates on SOFT states; that's just how it is. If you know any C, the code could probably be modified fairly easily to suit your needs. I am not aware of any runtime configuration option to change that behavior.
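
One possible workaround that avoids touching the C source is to move the "for x amount of time" logic into the plugin itself, so that Nagios only ever receives OK until the breach has lasted long enough. The following is a rough sketch of that idea, not a built-in Nagios feature. The names `sustained_status`, `hold_seconds` and `state_path` are made up for illustration, and it assumes the measured value comes from somewhere else (for example an existing CPU check) and that the state file is writable by the user running the check.

```python
import json
import os
import time

OK, WARNING = 0, 1  # standard Nagios plugin exit codes


def sustained_status(value, threshold, hold_seconds, state_path, now=None):
    """Return (exit_code, message) for a single check run.

    The time of the first breach is remembered in state_path between runs,
    so a non-OK result is only returned once the breach has lasted
    hold_seconds. All names here are hypothetical.
    """
    now = time.time() if now is None else now

    if value < threshold:
        # Back under the threshold: forget any breach in progress.
        if os.path.exists(state_path):
            os.remove(state_path)
        return OK, f"OK - value {value} is below {threshold}"

    # Breach: load the time it started, or start the clock now.
    try:
        with open(state_path) as fh:
            first_breach = json.load(fh)["first_breach"]
    except (OSError, ValueError, KeyError, TypeError):
        first_breach = now
        with open(state_path, "w") as fh:
            json.dump({"first_breach": first_breach}, fh)

    elapsed = int(now - first_breach)
    if elapsed >= hold_seconds:
        return WARNING, f"WARNING - value {value} >= {threshold} for {elapsed}s"
    return OK, f"OK - value {value} >= {threshold} for only {elapsed}s"
```

A real plugin would print the message and exit with the returned code. With `threshold=80` and `hold_seconds=7200`, a check run every 15 minutes would keep returning OK (and so show as OK in the web UI) for the first 2 hours of high CPU, then return WARNING. The trade-off is that the UI no longer shows the early breach at all, other than in the plugin output text.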