I am monitoring a few network devices that each have a large number of interfaces. I have the devices modeled so that each interface is a separate service (since we only want to monitor interfaces that are both admin up and ifOperStatus up).
Sporadically I'll get a flood of alerts, which I think is just due to switch load: the switch stops answering SNMP for a few minutes (SNMP is a lower-priority task on the switch, so when load spikes, SNMP requests get dropped).
Is there a way for me to rate limit alerts for a specific host, so that I can say something like "don't let this host send more than X service alerts within a specific period of time"?
Thanks,
-Luke
service alert suppression question
Re: service alert suppression question
Unfortunately, that functionality isn't available. I would probably go the route of increasing the retry interval (retry_interval) or the max check attempts value (max_check_attempts) on those services, so that a short SNMP outage never reaches a hard state and triggers a notification. Another option would be to use a different check that actually sends traffic through the port instead of querying the lower-priority SNMP process.
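As a rough sketch of the first suggestion, a service definition along these lines would ride out a short SNMP outage. The host name, service description, check command, and the generic-service template are placeholders I'm assuming for illustration, not values from your config; only check_interval, retry_interval, and max_check_attempts are the directives that matter here.

```
define service {
    use                     generic-service       ; assumed template name, use your own
    host_name               core-switch-01        ; hypothetical host
    service_description     Port 1 Status         ; hypothetical service name
    check_command           check_port_status!1   ; hypothetical command, substitute your SNMP check
    check_interval          5                     ; normal checks every 5 time units
    retry_interval          2                     ; re-check every 2 time units after a failure
    max_check_attempts      5                     ; notify only after 5 consecutive failures
}
```

With the default interval_length of 60 seconds, the service has to fail five checks in a row, which is about (5 - 1) x 2 = 8 minutes after the first failure, before it goes into a hard state and sends a notification. Tune the two values so that window is a bit longer than the SNMP dropouts you are seeing.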