I was hoping this would be handled by stalking, volatility, or obsessiveness, but none seem quite right after more reading...
Notifications are fine w/ the 15 minute delay and 60 minute reminders, but I'd like the trap sender to also send reminders, perhaps at a shorter interval.
Why, you ask. A single trap for an alert and a single ok trap can get lost in the network. Alerts/oks in downtime can get lost. There are also issues w/ alerts that change from critical to warning or the other way around.
If an alert is sent before downtime begins and the ok happens in downtime, the alert is never cancelled in the next system. The same is true if an alert begins in downtime: the downstream system never knows, because the single trap was blocked by downtime.
My thought was to forget oks, and to send a stream of alerts (perhaps at the notification reminder interval) until the alert clears. Once status is ok, after a number of monitoring intervals, the next system would clear the alert.
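To make that idea concrete, here is a rough sketch of what the receiving side could do. It is not how any existing trap receiver works; the class name, the method names, and the "clear after N missed intervals" rule are all my own assumptions for illustration.

```python
import time
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (host, service); hypothetical key format


class AlertTable:
    """Tracks open alerts on the receiving side (hypothetical sketch)."""

    def __init__(self, check_interval_s: float, missed_intervals: int) -> None:
        # An alert is cleared once this many seconds pass with no repeat trap.
        self.ttl_s = check_interval_s * missed_intervals
        self.last_seen: Dict[Key, float] = {}

    def on_alert_trap(self, host: str, service: str,
                      now: Optional[float] = None) -> None:
        # Every repeated alert trap simply refreshes the timestamp.
        self.last_seen[(host, service)] = time.time() if now is None else now

    def expire(self, now: Optional[float] = None) -> List[Key]:
        # Clear and return alerts that have not been refreshed within the TTL.
        now = time.time() if now is None else now
        stale = [k for k, seen in self.last_seen.items()
                 if now - seen >= self.ttl_s]
        for k in stale:
            del self.last_seen[k]
        return stale
```

With this approach no ok trap is needed at all: the alert goes away on its own once the stream of alert traps stops.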
Thoughts? Solutions?
I'd like the trap sender to send alert traps continually
Re: I'd like the trap sender to send alert traps continually
One of the problems I envision is if a trap gets sent and then a recovery doesn't get sent (for whatever reason, nagios crash, etc.). It would then be stuck in the queue and continually alert, because another recovery wouldn't get sent until the service went down again and then recovered again.
Essentially you could have them written to a spool file and have a cron job resend at a specified interval, then you would either have to wait for the recovery to show up AND/OR check the status of the service to see if it's OK to prevent it from being sent out forever. I suppose you could parse the spool file to determine hostname/servicename and then utilize the new API to check host/service status.
I don't think you'd really want to just adjust the trap sender component to run the checks/resends because it would only run every time a trap came through (which we would have no idea when that would be).
You could also modify the notifications script to also send the traps (along with the notifications) but that would require quite a bit more coding.
I don't think there is a solution that doesn't involve custom development at this point. We do offer custom development if that is something you're interested in, you can contact [email protected] to get more information.
I could always submit a feature request for it if you'd like?
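A minimal sketch of the spool-and-cron idea described above, to show the shape of the custom development involved. Everything here is an assumption: the spool format (one JSON object per line with `host` and `service` keys), the function name, and the two callables. `is_ok` stands in for whatever status lookup you use (for example a call to the API), and `send_trap` stands in for whatever actually emits the trap.

```python
import json
from pathlib import Path
from typing import Callable, List


def resend_pending(spool: Path,
                   is_ok: Callable[[str, str], bool],
                   send_trap: Callable[[dict], None]) -> List[dict]:
    """Resend every spooled alert that is still a problem; drop the rest."""
    if not spool.exists():
        return []
    entries = [json.loads(line)
               for line in spool.read_text().splitlines() if line.strip()]
    still_open = []
    for entry in entries:
        # Checking current status keeps an entry from being resent forever
        # when its recovery trap was never seen.
        if is_ok(entry["host"], entry["service"]):
            continue
        send_trap(entry)
        still_open.append(entry)
    # Rewrite the spool so that only unresolved alerts remain.
    spool.write_text("".join(json.dumps(e) + "\n" for e in still_open))
    return still_open
```

A cron job would call this at the chosen resend interval. Because the status check runs on every pass, a missed recovery cannot leave an entry stuck in the spool.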
Re: I'd like the trap sender to send alert traps continually
The problem you describe in the 1st line is exactly the problem that exists now, and the one I'm trying to resolve. Saying the solution will cause the problem that already exists is somewhat nonsensical.
The trap sender needs to be better integrated with notifications, since Nagios already handles notifications the way trap sending should be handled. Or rather, trap sending should have the option to be a notification and work the same way.
Re: I'd like the trap sender to send alert traps continually
I think Sean got to the crux of the issue here:
ssax wrote: I don't think there is a solution that doesn't involve custom development at this point. We do offer custom development if that is something you're interested in, you can contact [email protected] to get more information. I could always submit a feature request for it if you'd like?
The closest we can get to a reminder is something similar to the notification interval, but as described I don't think there is a one- or two-click solution that would offer this. Sean, I, or another member of the team can file this FR if you want, but first let us know whether you'd like to go that route or discuss custom dev.
Former Nagios employee
Re: I'd like the trap sender to send alert traps continually
Yes, create a feature request. I'll add it to the trap sender I'm creating too.
Ask for logging as well, since I'm asked daily why traps weren't sent when in fact they were...
You can close this.
Thanks
Re: I'd like the trap sender to send alert traps continually
Would you like fries with that, too? :)
I'll have it filed and this thread updated with details. Just for simplicity's sake since there were quite a few details in your post, is the following basically what you want?
FR: Trap sender should have a "reverse-freshness" setting where it sends another trap if it has not in X minutes
Former Nagios employee
Re: I'd like the trap sender to send alert traps continually
Yes please!
The trap sender currently sends a single trap for an alert and a single trap for an ok, so "if it has not in X minutes" isn't meaningful...
Re: I'd like the trap sender to send alert traps continually
I guess I'm just not clear on what you need then. I can file a feature request on your behalf if you can describe for me the functionality you are looking for. I thought I had a good idea of what you wanted but in your last post you said my description was not meaningful so I want to make sure I'm making the right requests.
Former Nagios employee
Re: I'd like the trap sender to send alert traps continually
Your feature request implied that traps are sent repeatedly, which they are not.
I want traps to be configurable so that they are sent repeatedly, just as reminder notifications are. Open a service in CCM, click the alert settings tab, and look at the notification interval. I'd like that for traps.
Traps are sent over UDP by default, and a single trap can be lost. Also, as previously explained in great detail, traps aren't sent during downtime, so the alert status can get out of sync between the trap receiver and Nagios. Resending alert traps will reduce the problem.
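To put a rough number on why resending helps with loss, here is a small calculation. It assumes each copy of a trap is lost independently with the same probability, which is a simplification; the function name and the example rates are purely illustrative, not measurements.

```python
def prob_alert_missed(loss_rate: float, sends: int) -> float:
    """Chance that every one of `sends` copies of a trap is lost,
    assuming each copy is lost independently with probability `loss_rate`."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if sends < 1:
        raise ValueError("sends must be at least 1")
    return loss_rate ** sends
```

Under that assumption, the chance that the receiver never hears about an alert shrinks geometrically with each resend, which is the whole argument for a repeat interval.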
Re: I'd like the trap sender to send alert traps continually
Yea, I had that in mind but probably didn't word it properly. The "if it has not in" part in my mind was an overly-verbose way of saying "every".
FR: Trap sender should have a "notification_interval"-like setting where it re-sends trap every X minutes if there is still a problem. Downtime should not affect sending of traps.
However, depending on how the traps are being triggered to send I am not sure how complicated of a fix this might be. Downtime disables notifications, and if traps are notification-triggered we might not be able to get around this easily. I'll put in the request and let the devs handle the details, but I thought I would point this out.
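For the devs' reference, the requested behavior could be sketched roughly as below. This is only an illustration of the FR, not how the trap sender works today; `ServiceState`, `should_send_trap`, `trap_interval_s`, and `honor_downtime` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceState:
    is_problem: bool                 # service is currently in a non-OK state
    in_downtime: bool                # service is in scheduled downtime
    last_trap_sent: Optional[float]  # seconds; None if no trap sent yet


def should_send_trap(state: ServiceState, now: float,
                     trap_interval_s: float,
                     honor_downtime: bool = False) -> bool:
    """Decide whether an alert trap is due (hypothetical sketch of the FR)."""
    if not state.is_problem:
        return False
    # By default downtime does not suppress traps, per the FR wording.
    if honor_downtime and state.in_downtime:
        return False
    if state.last_trap_sent is None:
        return True
    # Re-send every trap_interval_s seconds while the problem persists.
    return now - state.last_trap_sent >= trap_interval_s
```

The key design point is that this decision has to run on its own schedule rather than being triggered by notifications, since notifications are exactly what downtime disables.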
Former Nagios employee