Folks, we are currently running a trial of Nagios XI Server and the Nagios Agent on our lab fault management platform. Our intent is to integrate Nagios
fault management with our platforms. The integration point would be Nagios events, in the form of SNMP traps, sent from Nagios through the Nagios
SNMP Trap Sender to our trap collector. Nagios has discovered Unix components on our managed host and does seem to be generating events as expected. One thing
we do question is the data being passed in the varbinds of the Nagios SNMP nSvcEvent trap (.1.3.6.1.4.1.20006.1.7). Some samples of the generated traps are shown below.
While studying the trap output, we noticed that the varbinds generated by Nagios don't quite match the varbind definitions in the MIB. First, the nSvcEvent trap definition shows ten varbinds, but we are seeing only six in our log. Second, some of the varbind value types don't match the MIB definitions: text appears in some integer fields. We looked through the documentation we could find on the trap sender and on notifications in general. We understand that custom notifications can be created to alter which fields are sent on events, but we have not tried this yet. Initially we wanted to trial out-of-the-box Nagios event processing and trap flow.
We were hoping to understand exactly how and when the various varbinds are populated in the different Nagios traps sent via the Nagios SNMP Trap Sender.
Any information you could provide would be greatly appreciated.
Take care,
Russ Heaton
ATT
Formatted traps from NNM trapd.log
1424885124 1 Wed Feb 25 17:25:24 2015 x.x.x.x - nSvcEvent. nHostname: x.x.x.x; nHostStateID: Filespace_logs; nSvcDesc: 2; nSvcStateID: DISK CRITICAL - free space: /logs 474 MB (4% inode=99%):; nSvcAttempt: (UNAVAILABLE EVENT PARAMETER $5); nSvcDurationSec: (UNAVAILABLE EVENT PARAMETER $6); nSvcGroupName: (UNAVAILABLE EVENT PARAMETER $7); nSvcLastCheck: (UNAVAILABLE EVENT PARAMETER $8); nSvcLastChange: (UNAVAILABLE EVENT PARAMETER $9); nSvcOutput: (UNAVAILABLE EVENT PARAMETER $10);1 .1.3.6.1.4.1.20006.1.7 0
1424886092 1 Wed Feb 25 17:41:32 2015 x.x.x.x - nSvcEvent. nHostname: x.x.x.x; nHostStateID: brcdSlbVSPortStatsCurrentConnection; nSvcDesc: 2; nSvcStateID: Current Connections CRITICAL - *106* Connections; nSvcAttempt: (UNAVAILABLE EVENT PARAMETER $5); nSvcDurationSec: (UNAVAILABLE EVENT PARAMETER $6); nSvcGroupName: (UNAVAILABLE EVENT PARAMETER $7); nSvcLastCheck: (UNAVAILABLE EVENT PARAMETER $8); nSvcLastChange: (UNAVAILABLE EVENT PARAMETER $9); nSvcOutput: (UNAVAILABLE EVENT PARAMETER $10);1 .1.3.6.1.4.1.20006.1.7 0
1424886116 1 Wed Feb 25 17:41:56 2015 x.x.x.x - nSvcEvent. nHostname: x.x.x.x; nHostStateID: ntpd; nSvcDesc: 2; nSvcStateID: *** ntpd: Nok ***; nSvcAttempt: (UNAVAILABLE EVENT PARAMETER $5); nSvcDurationSec: (UNAVAILABLE EVENT PARAMETER $6); nSvcGroupName: (UNAVAILABLE EVENT PARAMETER $7); nSvcLastCheck: (UNAVAILABLE EVENT PARAMETER $8); nSvcLastChange: (UNAVAILABLE EVENT PARAMETER $9); nSvcOutput: (UNAVAILABLE EVENT PARAMETER $10);1 .1.3.6.1.4.1.20006.1.7 0
Trapd.conf Definition
EVENT nSvcEvent .1.3.6.1.4.1.20006.1.7 "LOGONLY" Normal
FORMAT nSvcEvent. nHostname: $1; nHostStateID: $2; nSvcDesc: $3; nSvcStateID: $4; nSvcAttempt: $5; nSvcDurationSec: $6; nSvcGroupName: $7; nSvcLastCheck: $8; nSvcLastChange: $9; nSvcOutput: $10
SDESC
Long Descr.:
"The SNMP trap that is generated as a result of an event with the service
in Nagios."
Variables:
1: nHostname
Syntax="Octet String"
Descr="Hostname as specified in the Nagios configuration file."
2: nHostStateID
Syntax="HostStateID (Integer) "
Descr="The host state as defined by the HOSTSTATEID macro"
3: nSvcDesc
Syntax="Octet String"
Descr="This value is taken from the description directive of the service
definition."
4: nSvcStateID
Syntax="ServiceStateID (Integer) "
Descr=" A number that corresponds to the current state of the service: 0=OK,
1=WARNING, 2=CRITICAL, 3=UNKNOWN"
5: nSvcAttempt
Syntax="Integer"
Descr="The number of the current service check retry. For instance, if this is
the second time that the service is being rechecked, this will be the
number two. Current attempt number is really only useful when writing
service event handlers for soft states that take a specific action based
on the service retry number."
6: nSvcDurationSec
Syntax="Integer"
Descr="A number indicating the number of seconds that the service has spent in
its current state."
7: nSvcGroupName
Syntax="Octet String"
Descr="The short name of the servicegroup that this service belongs to. This
value is taken from the servicegroup_name directive in the servicegroup
definition. If the service belongs to more than one servicegroup this
object will contain the name of just one of them."
8: nSvcLastCheck
Syntax="Integer"
Descr="This is a timestamp in time_t format (seconds since the UNIX epoch)
indicating the time at which a check of the service was last performed."
9: nSvcLastChange
Syntax="Integer"
Descr="This is a timestamp in time_t format (seconds since the UNIX epoch)
indicating the time the service last changed state."
10: nSvcOutput
Syntax="Octet String"
Descr="The text output from the last service check (i.e. Ping OK)."
EDESC
#
#
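The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not part of NNM or Nagios, that parses one formatted trapd.log line and compares each positional value against the syntax declared in the trapd.conf definition. The names DECLARED, PATTERN, SAMPLE, and check are hypothetical helpers introduced for illustration; the sample line is the ntpd entry from the log above.

```python
import re

# Declared syntax per positional varbind, copied from the trapd.conf
# definition above ("int" = Integer-based syntax, "str" = Octet String).
DECLARED = [
    ("nHostname", "str"), ("nHostStateID", "int"),
    ("nSvcDesc", "str"), ("nSvcStateID", "int"),
    ("nSvcAttempt", "int"), ("nSvcDurationSec", "int"),
    ("nSvcGroupName", "str"), ("nSvcLastCheck", "int"),
    ("nSvcLastChange", "int"), ("nSvcOutput", "str"),
]

# One capture group per "name: value" pair, anchored on the next field
# name so that values containing ":" or ";" are still split correctly.
PATTERN = re.compile(
    "; ".join(re.escape(name) + r": (.*?)" for name, _ in DECLARED)
    + r";\d+ \.[\d.]+ \d+$"
)

# Placeholder NNM prints when the trap carried no varbind at that position.
UNAVAILABLE = re.compile(r"\(UNAVAILABLE EVENT PARAMETER \$\d+\)")

# The ntpd sample from the trapd.log excerpt above.
SAMPLE = (
    "1424886116 1 Wed Feb 25 17:41:56 2015 x.x.x.x - nSvcEvent. "
    "nHostname: x.x.x.x; nHostStateID: ntpd; nSvcDesc: 2; "
    "nSvcStateID: *** ntpd: Nok ***; "
    "nSvcAttempt: (UNAVAILABLE EVENT PARAMETER $5); "
    "nSvcDurationSec: (UNAVAILABLE EVENT PARAMETER $6); "
    "nSvcGroupName: (UNAVAILABLE EVENT PARAMETER $7); "
    "nSvcLastCheck: (UNAVAILABLE EVENT PARAMETER $8); "
    "nSvcLastChange: (UNAVAILABLE EVENT PARAMETER $9); "
    "nSvcOutput: (UNAVAILABLE EVENT PARAMETER $10);1 .1.3.6.1.4.1.20006.1.7 0"
)


def check(line):
    """Return (label, value, status) for each positional field in the line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a formatted nSvcEvent line")
    report = []
    for (name, kind), value in zip(DECLARED, match.groups()):
        if UNAVAILABLE.fullmatch(value):
            status = "missing"
        elif kind == "int" and not re.fullmatch(r"-?\d+", value):
            status = "type mismatch"
        else:
            status = "ok"
        report.append((name, value, status))
    return report


for name, value, status in check(SAMPLE):
    print(f"{name:16} {status:14} {value}")
```

Run against the sample, this flags positions 2 and 4 as type mismatches and positions 5 through 10 as missing, which matches what we see by eye in the log.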
Re: Nagios SNMP nSvcEvent Trap Varbinds
Currently, the trap sender generates traps by sending a custom snmptrap using SNMP version 2c with some of the variables defined in the NAGIOS-NOTIFY-MIB. The MIB serves multiple purposes, and in many cases some variables will not be used or defined, depending on which device or check is being run.
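As a rough illustration of why a subset of variables can look like wrong types on the receiving side, here is a minimal Python sketch. It assumes, based only on the sample values in the log above and not on any confirmed trap sender behavior, that the four values sent are nHostname, nSvcDesc, nSvcStateID, and nSvcOutput, in that order. A positional FORMAT line ($1 to $10) would then label them with the first four names of the full trap definition. The names FORMAT_LABELS, ASSUMED_SENT, and relabel are hypothetical.

```python
# Positional labels, in the order used by the FORMAT line in trapd.conf.
FORMAT_LABELS = [
    "nHostname", "nHostStateID", "nSvcDesc", "nSvcStateID", "nSvcAttempt",
    "nSvcDurationSec", "nSvcGroupName", "nSvcLastCheck", "nSvcLastChange",
    "nSvcOutput",
]

# ASSUMPTION: inferred from the sample values only, not from the trap
# sender's documentation or source.
ASSUMED_SENT = ["nHostname", "nSvcDesc", "nSvcStateID", "nSvcOutput"]


def relabel(values):
    """Pair each received value with its positional label and assumed name."""
    rows = []
    for index, value in enumerate(values):
        assumed = ASSUMED_SENT[index] if index < len(ASSUMED_SENT) else None
        rows.append((FORMAT_LABELS[index], assumed, value))
    return rows


# Values taken from the ntpd sample in the trapd.log excerpt.
received = ["x.x.x.x", "ntpd", "2", "*** ntpd: Nok ***"]

for positional, assumed, value in relabel(received):
    print(f"logged as {positional:13} likely {assumed:12} value {value}")
```

Under that assumption, the text in the "integer" fields is simply the service description and check output landing in positions 2 and 4. An SNMPv2c trap also carries sysUpTime.0 and snmpTrapOID.0 as its first two varbinds, so four payload variables would be consistent with a count of six, though that reading of the log is a guess.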
To clarify: how would you expect this to be handled? Are you missing any data or information by not getting the missing pieces for that specific host/service? Let us know and we will do our best to give you a hand with it.
/Luke