Need some help with snmp trap notifications

skyerjoe
Posts: 56
Joined: Thu Mar 22, 2012 2:57 am

Need some help with snmp trap notifications

Post by skyerjoe »

Hey guys,

In our environment we have some SNMP traps that are sent from an ow-server through the snmptt daemon to Nagios using the submit check result script.

So far so good, but the traps arrive very rarely, and we want Nagios to re-notify for a trap every 5 minutes until the state changes to OK.

So here are some of my (maybe not entirely correct) considerations:

1. The notification_interval option does not fit, because the state doesn't change within the notification interval.

2. Set a check_dummy check with a freshness threshold on the service (but the state will be ignored).

Does anyone have a clue or an idea?


Thanks for the help


skyerjoe
Nagios Core 3.5.1

Checkmk 1.2.4p5
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Need some help with snmp trap notifications

Post by abrist »

skyerjoe wrote: 1. The notification_interval option does not fit, because the state doesn't change within the notification interval.
Is this because the trap is not changing the state of the passive trap service check object? If the state changes when the trap is received, the notification interval should work correctly (re-notify on interval until recovery or acknowledgement).
Could you please clarify your query?
Former Nagios employee
skyerjoe
Posts: 56
Joined: Thu Mar 22, 2012 2:57 am

Re: Need some help with snmp trap notifications

Post by skyerjoe »

Hello abrist,

First of all, thanks for helping me!
abrist wrote: Is this because the trap is not changing the state of the passive trap service check object? If the state changes when the trap is received, the notification interval should work correctly (re-notify on interval until recovery or acknowledgement).
Could you please clarify your query?
No, the trap arrives via snmptt and is forwarded by an EXEC command, which triggers the submit check result script for a passive check, and then the state in Nagios is changed.

The check is for a water alert, so here is a short explanation: water sensor critical -> sends a signal to -> ow-server -> sends one trap ID to -> snmptt -> runs the submit check result command via EXEC -> Nagios

And when the state returns to OK, the ow-server sends another trap for the OK state.
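
For readers following the chain above: the submit check result script itself is not shown in this thread. As a minimal sketch, assuming it hands results to Nagios through the external command file using the standard `PROCESS_SERVICE_CHECK_RESULT` command, the last hop could look roughly like this. The function names, the plugin output text, and the command file path are assumptions for illustration, not taken from the poster's setup.

```python
import time

# Nagios return codes for service states
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def format_passive_result(host, service, state, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        ts, host, service, STATES[state], output)


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the line to the Nagios command file (path is an assumption)."""
    with open(command_file, "w") as handle:
        handle.write(line)


# Example: the line a CRITICAL water trap would turn into
line = format_passive_result(
    "ow-server", "Water Alert level 2", "CRITICAL", "water detected",
    now=1400000000)
```

Each trap handled this way produces exactly one passive result for the service.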

But the notification interval doesn't work.

Here is the service:

define service {
        host_name       ow-server
        service_description     Water Alert level 2
        check_period    24x7
        check_command   check-host-alive
        contact_groups  admins
        notification_period     24x7
        initial_state   o
        check_interval  1.000000
        retry_interval  1.000000
        max_check_attempts      1
        is_volatile     1
        parallelize_check       1
        active_checks_enabled   0
        passive_checks_enabled  1
        obsess_over_service     1
        event_handler_enabled   1
        low_flap_threshold      0.000000
        high_flap_threshold     0.000000
        flap_detection_enabled  1
        flap_detection_options  o,w,u,c
        freshness_threshold     0
        check_freshness 0
        notification_options    u,w,c
        notifications_enabled   1
        notification_interval   5.000000  
        first_notification_delay        0.000000
        stalking_options        n
        process_perf_data       1
        failure_prediction_enabled      1
        retain_status_information       1
        retain_nonstatus_information    1
        }

Here are the templates:

snmp-traps:

define service{
   name                    snmp-traps
   use                     generic-service
   is_volatile             1
   max_check_attempts      1
   check_command           check-host-alive
   normal_check_interval   1
   retry_check_interval    1
   active_checks_enabled   0
   passive_checks_enabled  1
   notification_interval   31536000
   notification_options    w,u,c
   register                0
}

generic-service:

define service{
        name                            generic-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service 'freshness'
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        failure_prediction_enabled      1                       ; Failure prediction is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10                      ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2                       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             24x7                    ; Notifications can be sent out at any time
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

And here is the check in the objects.cache:

define service {
        host_name       ow-server
        service_description     Water Alert level 2
        check_period    24x7
        check_command   check-host-alive
        contact_groups  admins
        notification_period     24x7
        initial_state   o
        check_interval  1.000000
        retry_interval  1.000000
        max_check_attempts      1
        is_volatile     1
        parallelize_check       1
        active_checks_enabled   0
        passive_checks_enabled  1
        obsess_over_service     1
        event_handler_enabled   1
        low_flap_threshold      0.000000
        high_flap_threshold     0.000000
        flap_detection_enabled  1
        flap_detection_options  o,w,u,c
        freshness_threshold     0
        check_freshness 0
        notification_options    u,w,c
        notifications_enabled   1
        notification_interval   5.000000  
        first_notification_delay        0.000000
        stalking_options        n
        process_perf_data       1
        failure_prediction_enabled      1
        retain_status_information       1
        retain_nonstatus_information    1
        }



I have also checked the default interval_length setting in nagios.cfg (60).


I don't see any conflicts from my point of view (maybe I overlooked something).
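
For reference, the interval directives in the definitions above are counted in units of `interval_length` seconds, so the arithmetic can be sketched as follows. The helper name is hypothetical; the values come from the configuration shown above.

```python
def interval_seconds(units, interval_length=60):
    """Convert a Nagios interval given in time units to seconds."""
    return units * interval_length


effective = interval_seconds(5.0)      # notification_interval in the service definition
template = interval_seconds(31536000)  # notification_interval in the snmp-traps template
print(effective)  # 300.0
print(template)   # 1892160000
```

With these values the service itself is set to re-notify every 300 seconds (5 minutes), while the template value on its own would mean roughly 60 years between notifications.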


regards skyerjoe
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Need some help with snmp trap notifications

Post by tmcdonald »

What specifically is not working? Are you not receiving notifications? Are they coming too frequently or too slowly? Are they only sending once and not repeating?
Former Nagios employee
skyerjoe
Posts: 56
Joined: Thu Mar 22, 2012 2:57 am

Re: Need some help with snmp trap notifications

Post by skyerjoe »

tmcdonald wrote: What specifically is not working? Are you not receiving notifications? Are they coming too frequently or too slowly? Are they only sending once and not repeating?
Hello,

My problem is that I only ever get one notification from an SNMP trap. It would be very helpful if, while the state is still "critical", Nagios sent re-notifications at an interval, for example every 5 minutes,

and if the state changes to "OK", no more notifications should be sent.

So that's my problem.

regards skyerjoe
skyerjoe
Posts: 56
Joined: Thu Mar 22, 2012 2:57 am

Re: Need some help with snmp trap notifications

Post by skyerjoe »

Did I miss some basic settings?

regards skyerjoe
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Need some help with snmp trap notifications

Post by tmcdonald »

Basically, you can only get one alert per passive result unless you do some fancy escalations. This article explains it fairly well:

http://stackoverflow.com/questions/2040 ... ive-checks
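
One way to act on this, sketched under assumptions rather than taken from the linked article: if each passive result can produce at most one alert, a small helper run on a schedule (for example from cron) could re-submit the last non-OK result every 5 minutes until the OK trap arrives. How the last state and timestamp are stored is hypothetical here, and whether each re-submitted result actually triggers a new notification depends on the service settings (the service above is volatile), so this should be verified on a test system.

```python
OK = 0  # Nagios return code for the OK state


def needs_resubmit(last_state, last_submit_ts, now, interval_s=300):
    """Return True when a non-OK passive result is due to be sent again."""
    if last_state == OK:
        return False  # recovery trap received: stop re-notifying
    return (now - last_submit_ts) >= interval_s
```

The 300 second default mirrors the 5 minute re-notification the original poster asked for; the decision is kept separate from the actual submission so it can be checked without a running Nagios instance.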