Hey guys,
In our environment we have some SNMP traps that are sent from an ow-server through the snmptt daemon to Nagios using the submit_check_result script.
So far so good, but the traps arrive very rarely, and we want Nagios to re-notify for the trap every 5 minutes until the state changes to OK.
Here are some of my (maybe not entirely correct) considerations:
1. The notification_interval option does not fit, because the state doesn't change within the notification interval.
2. Set a check_dummy check with a freshness threshold on the service (but the state will be ignored).
Does anyone have a clue or an idea?
Thanks for the help,
skyerjoe
Need some help with snmp trap notifications
Nagios Core 3.5.1
Checkmk 1.2.4p5
Re: Need some help with snmp trap notifications
skyerjoe wrote: 1. The notification_interval option does not fit, because the state doesn't change within the notification interval.
Is this because the trap is not changing the state of the passive trap service check object? If the state changes when the trap is received, the notification interval should work correctly (re-notify on interval until recovery or acknowledgement).
Could you please clarify your query?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Need some help with snmp trap notifications
Hello abrist,
First of all, thanks for helping me!
abrist wrote: Is this because the trap is not changing the state of the passive trap service check object? If the state changes when the trap is received, the notification interval should work correctly (re-notify on interval until recovery or acknowledgement). Could you please clarify your query?
No, the trap comes in via snmptt and is forwarded with an EXEC command, which triggers the submit_check_result script as a passive check, and then the state in Nagios changes.
The check is for a water alert, so here is a short explanation: water sensor critical -> sends a signal to -> ow-server -> sends one trap ID to -> snmptt -> passes it on via EXEC submit_check_result command -> Nagios
And when the state goes back to OK, the ow-server sends another trap for the OK state.
But the notification interval doesn't work.
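For readers following the chain above, the last hop (snmptt EXEC -> submit_check_result -> Nagios) normally ends with one line written to the Nagios external command file. A minimal Python sketch of how such a line is built is shown here; the helper name `format_passive_result`, the plugin output text "Water detected" and the fixed timestamp are assumptions for illustration, not taken from the setup in this thread.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, return_code, output
    )


# Hypothetical CRITICAL result for the service defined in this thread.
line = format_passive_result(
    "ow-server", "Water Alert level 2", 2, "Water detected", timestamp=1400000000
)
print(line, end="")
```

Each such line is one passive check result from Nagios' point of view, so one trap produces exactly one result.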
Here is the service:
define service {
    host_name                       ow-server
    service_description             Water Alert level 2
    check_period                    24x7
    check_command                   check-host-alive
    contact_groups                  admins
    notification_period             24x7
    initial_state                   o
    check_interval                  1.000000
    retry_interval                  1.000000
    max_check_attempts              1
    is_volatile                     1
    parallelize_check               1
    active_checks_enabled           0
    passive_checks_enabled          1
    obsess_over_service             1
    event_handler_enabled           1
    low_flap_threshold              0.000000
    high_flap_threshold             0.000000
    flap_detection_enabled          1
    flap_detection_options          o,w,u,c
    freshness_threshold             0
    check_freshness                 0
    notification_options            u,w,c
    notifications_enabled           1
    notification_interval           5.000000
    first_notification_delay        0.000000
    stalking_options                n
    process_perf_data               1
    failure_prediction_enabled      1
    retain_status_information       1
    retain_nonstatus_information    1
}
Here are the templates:
snmp-traps:
define service{
    name                            snmp-traps
    use                             generic-service
    is_volatile                     1
    max_check_attempts              1
    check_command                   check-host-alive
    normal_check_interval           1
    retry_check_interval            1
    active_checks_enabled           0
    passive_checks_enabled          1
    notification_interval           31536000
    notification_options            w,u,c
    register                        0
}
generic-service:
define service{
    name                            generic-service ; The 'name' of this service template
    active_checks_enabled           1               ; Active service checks are enabled
    passive_checks_enabled          1               ; Passive service checks are enabled/accepted
    parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1               ; We should obsess over this service (if necessary)
    check_freshness                 0               ; Default is to NOT check service 'freshness'
    notifications_enabled           1               ; Service notifications are enabled
    event_handler_enabled           1               ; Service event handler is enabled
    flap_detection_enabled          1               ; Flap detection is enabled
    failure_prediction_enabled      1               ; Failure prediction is enabled
    process_perf_data               1               ; Process performance data
    retain_status_information       1               ; Retain status information across program restarts
    retain_nonstatus_information    1               ; Retain non-status information across program restarts
    is_volatile                     0               ; The service is not volatile
    check_period                    24x7            ; The service can be checked at any time of the day
    max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
    retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
    notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval           60              ; Re-notify about service problems every hour
    notification_period             24x7            ; Notifications can be sent out at any time
    register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
And here is the check in the objects.cache:
define service {
    host_name                       ow-server
    service_description             Water Alert level 2
    check_period                    24x7
    check_command                   check-host-alive
    contact_groups                  admins
    notification_period             24x7
    initial_state                   o
    check_interval                  1.000000
    retry_interval                  1.000000
    max_check_attempts              1
    is_volatile                     1
    parallelize_check               1
    active_checks_enabled           0
    passive_checks_enabled          1
    obsess_over_service             1
    event_handler_enabled           1
    low_flap_threshold              0.000000
    high_flap_threshold             0.000000
    flap_detection_enabled          1
    flap_detection_options          o,w,u,c
    freshness_threshold             0
    check_freshness                 0
    notification_options            u,w,c
    notifications_enabled           1
    notification_interval           5.000000
    first_notification_delay        0.000000
    stalking_options                n
    process_perf_data               1
    failure_prediction_enabled      1
    retain_status_information       1
    retain_nonstatus_information    1
}
I have also checked the interval_length setting in nagios.cfg (60).
I don't see any conflicts from my point of view (maybe I overlooked something).
Regards, skyerjoe
Nagios Core 3.5.1
Checkmk 1.2.4p5
Re: Need some help with snmp trap notifications
What specifically is not working? Are you not receiving notifications? Are they coming too frequently or too slowly? Are they only sending once and not repeating?
Former Nagios employee
Re: Need some help with snmp trap notifications
Hello,
Quote: What specifically is not working? Are you not receiving notifications? Are they coming too frequently or too slowly? Are they only sending once and not repeating?
My problem is that I always get only one notification from an SNMP trap. It would be very helpful if, while the state is still "critical", Nagios sent re-notifications at an interval, for example every 5 minutes,
and once the state changes to "OK", no more notifications should be sent.
So that's my problem.
Regards, skyerjoe
Nagios Core 3.5.1
Checkmk 1.2.4p5
Re: Need some help with snmp trap notifications
Did I miss some basic settings?
Regards, skyerjoe
Nagios Core 3.5.1
Checkmk 1.2.4p5
Re: Need some help with snmp trap notifications
Basically, you can only get one alert per passive result unless you do some fancy escalations. This article explains it fairly well:
http://stackoverflow.com/questions/2040 ... ive-checks
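Since each passive result yields at most one alert, one possible workaround (not the escalation approach from the linked article) is to re-submit the non-OK result on a timer until the OK trap arrives. The following is only a rough Python sketch under stated assumptions: `read_state` and `submit` are hypothetical callbacks that would have to be supplied by the local setup (for example, reading the last trap state recorded by the snmptt EXEC script and writing a passive result to the Nagios command file), and the 300-second interval simply mirrors the 5 minutes asked for in this thread.

```python
import time

OK = 0  # plugin return code for the OK state


def resubmit_while_not_ok(read_state, submit, interval_s=300, sleep=time.sleep):
    """Re-submit the current non-OK state every interval_s seconds.

    read_state: hypothetical callable returning the last known state code.
    submit:     hypothetical callable that sends one passive result.
    Returns the number of results re-submitted before the state became OK.
    """
    sent = 0
    while True:
        sleep(interval_s)
        state = read_state()
        if state == OK:
            return sent
        submit(state)
        sent += 1


# Dry run with fake callbacks: CRITICAL twice, then the OK trap arrives.
states = iter([2, 2, 0])
submitted = []
count = resubmit_while_not_ok(
    lambda: next(states), submitted.append, sleep=lambda seconds: None
)
print(count, submitted)
```

Because the service in this thread is volatile (is_volatile 1), every re-submitted non-OK result should be treated as a fresh problem, which is what makes repeated submission a candidate for getting repeated notifications; this has not been verified against the configuration above.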
Former Nagios employee