I have a setup where Nagios receives an SNMP trap from a device. It then notifies the contact defined in config.cfg. That works great. What I am trying to accomplish is to have Nagios send another notification if the problem isn't acknowledged within a given amount of time. I cannot get Nagios to send that second notification. I am using external commands to actually place a call as the notification, and that all works fine. I don't see Nagios attempt to make that second notification.
I cut down all my config files to one config file for ease of reading.
I would recommend enabling freshness checking on your passive services. That way, if an update is not received within a given amount of time (say, 10 minutes or so), Nagios will run a command (usually check_dummy) to change the state to critical and trigger a notification. http://nagios.sourceforge.net/docs/3_0/freshness.html
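A minimal sketch of what that could look like, assuming a passive-only service. The host name `trap-device`, the service description `SNMP Trap`, the `generic-service` template, and the 600-second threshold are placeholders for illustration, not values from the original poster's config:

```
# Assumed wrapper around the check_dummy plugin; adjust if one is already defined
define command {
    command_name    check_dummy
    command_line    $USER1$/check_dummy $ARG1$ "$ARG2$"
}

define service {
    use                     generic-service   ; assumed template name
    host_name               trap-device       ; placeholder
    service_description     SNMP Trap         ; placeholder
    active_checks_enabled   0                 ; results arrive passively only
    passive_checks_enabled  1
    check_freshness         1                 ; run check_command when results go stale
    freshness_threshold     600               ; seconds without an update (10 minutes)
    check_command           check_dummy!2!No update received within 10 minutes
}
```

Here the first argument `2` makes check_dummy return CRITICAL. Service freshness checking also has to be enabled globally through `check_service_freshness=1` in nagios.cfg.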
As for acknowledgement, that could be tricky. I don't believe there is really a way to trigger a notification specifically because a host/service has not been acknowledged, other than using a freshness check to trigger a notification after no updates are received.
That is correct. You get one notification/alert when it goes down; if it stays down for a week you still get only one. If it recovers, you may get another, provided you have configured it to alert on recovery.
We should clarify: it is standard to get only one alert at the time of a state change. However, the notification_interval option on hosts and services sets how long Nagios waits before sending another notification while the host/service remains in a problem state, that is, until it is acknowledged or a check returns it to an OK state. There shouldn't be a need to use escalations unless you want to alter that behavior. Just using notification_interval should do it.
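As a rough illustration of the re-notification setting, assuming the same kind of passive trap service; the host, service, template, and contact names below are placeholders, and the 5-unit interval is only an example value:

```
define service {
    use                     generic-service   ; assumed template name
    host_name               trap-device       ; placeholder
    service_description     SNMP Trap         ; placeholder
    notification_interval   5                 ; re-notify every 5 time units while the problem persists
    notification_options    w,u,c,r           ; warning, unknown, critical, recovery
    contacts                oncall-admin      ; placeholder contact
}
```

With the default `interval_length=60` in nagios.cfg, one time unit is one minute, so this would re-notify every 5 minutes. A `notification_interval` of 0 tells Nagios to send only the first notification and never re-notify.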
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Yes. Passive checks in a problem state will still re-notify on the interval. By default they are treated as hard states, so a passive check that changes the state of an object will not wait for retries and will move immediately to notifications (if configured). Like active checks, they will continue to notify on the interval until resolved or acknowledged.
Former Nagios employee
This tells me there should be another notification 60 seconds later,
but that next notification never happens.
The logs show nothing after the first notification.
Unless the host also went down at the same time, causing the service to stop notifying because its "parent" host is down, I see no reason why this should not have notified again. I would note that 1-minute notification intervals are probably a little quick; for testing, maybe set it to 5 minutes just to be sure it's not still executing the previous notification. Also, which log are you looking at when you see the first message but not the second?