Nagios Support Forum

Posted: **Mon Jul 08, 2019 4:01 pm**

I want to set up some passive checks based on SNMP traps for some APC PDUs. All the documentation out there shows how to define a passive check after the trap has been received (as in unconfigured), and additionally only shows it per host. I need a method to do this ahead of receiving the trap, because the events I need to monitor for include a power overload condition, among others.

I have created some passive checks for NCPA on Windows, and after creating them successfully I moved them from a specific host over to a host group. Will the same be possible with SNMP traps? I'd rather not manage a separate passive check for every SNMP trap received; I have a lot of PDUs to set up for monitoring.

Posted: **Tue Jul 09, 2019 12:01 pm**

https://assets.nagios.com/downloads/nag ... h-NXTI.pdf covers how to set up trap definitions using NXTI. The definitions don't correlate to a specific host - they use $aR (by default) to hold the host IP and match it against a host configured in the CCM. $A holds the hostname and can be used instead. See http://snmptt.sourceforge.net/docs/snmp ... stitutions for more info on variables.

Instead of waiting for an SNMP trap to come in and creating a service check from it, you can manually create the passive service in the CCM and apply it to a hostgroup. Make sure the service's description field matches the service description set in the NXTI definition.

Setting up passive checks is covered in https://assets.nagios.com/downloads/nag ... ios-XI.pdf. I would skip the part about using the wizard and just manually set it up.

Posted: **Wed Jul 10, 2019 11:00 am**

If you want to do it the old-fashioned way, edit /etc/snmp/snmptt.conf.

You'll see all of the imported traps laid out like this:

Code:

EVENT cefcPowerSupplyOutputChange .1.3.6.1.4.1.9.9.117.2.0.7 "Status Events" CRITICAL
FORMAT The notification indicates that the power $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "servicename" "$s" "$@" "$-*" "The notification indicates that the power $*"
SDESC
The notification indicates that the power
supply's output capacity has changed.
This notification is triggered whenever one instance 
of the power supply's cefcPSOutputModeInOperation 
has transitioned from 'false' to 'true'.
Variables:
  1: entPhysicalName
  2: entPhysicalModelName
  3: cefcPSOutputModeCurrent
EDESC

This is the important bit-

Code:

EXEC /usr/local/bin/snmptraphandling.py "$r" "servicename" "$s" "$@" "$-*" "The notification indicates that the power $*"

The first two arguments are the most important:

"$r" = the name of the host as a variable - don't change this (it automatically does a DNS lookup on the received IP of the trap, so make sure you have reverse DNS in place for the IP or it won't show up)

"servicename" = change this to match the name of the service in Nagios - as long as a service with exactly the same name exists on that host, the trap will be sent there

Run:

Code:

service snmptt restart

after any changes
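
For anyone wondering why the host and service name have to match exactly: a passive service result reaches Nagios as a PROCESS_SERVICE_CHECK_RESULT external command, and Nagios looks up the service by those two fields. The Python below is only a rough sketch of that command format, assuming the handler ultimately submits its result this way. It is not the actual code of snmptraphandling.py, and the function name and the host `pdu-rack01` are made up for illustration.

```python
import time

# Nagios return codes for the four service states.
STATE_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result_line(host, service, state, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if timestamp is None:
        timestamp = int(time.time())
    code = STATE_CODES[state.upper()]
    # Fields are separated by semicolons, so host and service names
    # must match the Nagios objects exactly and contain no semicolons.
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{code};{output}"
    )


# Hypothetical host name; "servicename" is the placeholder from the EXEC line.
print(passive_result_line(
    "pdu-rack01", "servicename", "CRITICAL",
    "The notification indicates that the power ...",
    timestamp=1562770800,
))
```

If the service name in the EXEC line differs from the service in Nagios by even one character, the result has nowhere to land.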

Posted: **Wed Jul 10, 2019 2:13 pm**

Thanks for the second option, @optionstechnology!

Posted: **Wed Jul 10, 2019 4:22 pm**

We have the enterprise version of XI and for support reasons will probably be sticking with NXTI. Thanks so much for the additional info on both approaches. It's forcing an understanding of the parts doing the actual work, which will help with troubleshooting. It's also helping me ask some better questions.

Here's one of the APC trap definitions picked up in NXTI:

Code:

EVENT rPDUBankPhaseNearOverload .1.3.6.1.4.1.318.0.224 "Status Events" WARNING
FORMAT Received trap "$N" with variables "$+*"
EXEC php /usr/local/nagiosxi/scripts/nxti.php --event_name="$N"  --event_oid="$i" --numeric_oid="$o" --symbolic_oid="$O" --community="$C" --trap_hostname="$R" --trap_ip="$aR" --agent_hostname="$A" --agent_ip="$aA" --category="$c" --severity="$s" --uptime="$T" --datetime="$x $X" --unixtime="$@" --bindings="$+*"
SDESC
WARNING: A bank or phase of the Rack PDU is near an overload condition.
The first argument is the serial number.
The second argument is the device name.
The third argument is the bank number (0 if this is phase data).
The fourth argument is the phase number (0 if this is bank data).
Variables:
  1: rPDUIdentSerialNumber
  2: rPDUIdentName
  3: rPDULoadStatusBankNumber
  4: rPDULoadStatusPhaseNumber
  5: mtrapargsString

EDESC

There is another trap for when an overload condition is reached, and there are also the reverse traps that clear the alarms. Is there a way to point the warning and critical overload traps, and their clears, at the same passive service? Essentially, the passive service state would be set by a combination of these 4 traps. Hopefully I'm not getting into an incredibly advanced topic here. This seems like a logical step, but I also realize I don't know what I don't know. I'm not so sure that I need anything to be processed by EXEC, unless that is exactly how to handle this type of situation.

The rest of the detail, on how the trap's "Service Description" field under Passive Service Setup in NXTI has to match the service description in the CCM, makes total sense. I'm pretty sure this is the same way data was matched with NCPA.

Posted: **Thu Jul 11, 2019 7:51 am**

After thinking about this off and on overnight, I think I may be overcomplicating this in my head.

Is what I'm after with warning/critical states and clearing those states as simple as defining the same service for each of the trap types? In other words, WARNING trap sets service to warning, CRITICAL trap sets service to critical, CRITICAL clear trap sets service to warning and WARNING clear trap sets service to OK?
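
To make the idea concrete, the mapping described above can be written as a small lookup table where the most recent trap decides the service state. This is just a sketch of the reasoning in Python, not anything NXTI runs. Only rPDUBankPhaseNearOverload is a real name from the definition above; the other three trap names are placeholders.

```python
# Only rPDUBankPhaseNearOverload comes from the trap definition in this thread.
# The other three names are hypothetical stand-ins for the overload trap and
# the two clearing traps.
TRAP_TO_STATE = {
    "rPDUBankPhaseNearOverload": "WARNING",
    "overloadTrap": "CRITICAL",
    "overloadClearedTrap": "WARNING",
    "nearOverloadClearedTrap": "OK",
}


def replay(traps, initial="OK"):
    """Return the service state after each trap; the latest result wins."""
    state = initial
    history = []
    for trap in traps:
        # Traps that are not mapped to this service leave the state unchanged.
        state = TRAP_TO_STATE.get(trap, state)
        history.append(state)
    return history


print(replay([
    "rPDUBankPhaseNearOverload",
    "overloadTrap",
    "overloadClearedTrap",
    "nearOverloadClearedTrap",
]))
# → ['WARNING', 'CRITICAL', 'WARNING', 'OK']
```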

Posted: **Thu Jul 11, 2019 12:13 pm**

Make sure to check "Enable Passive Service Setup" in the trap definition. That's what sends the traps to XI, and with it enabled we'd expect to see a definition with two EXEC commands. Something like:

Code:

EVENT rPDUBankPhaseNearOverload .1.3.6.1.4.1.318.0.225 "Status Events" WARNING
FORMAT Received trap "$N" with variables "$+*"
EXEC php /usr/local/nagiosxi/scripts/nxti.php --event_name="$N"  --event_oid="$i" --numeric_oid="$o" --symbolic_oid="$O" --community="$C" --trap_hostname="$R" --trap_ip="$aR" --agent_hostname="$A" --agent_ip="$aA" --category="$c" --severity="$s" --uptime="$T" --datetime="$x $X" --unixtime="$@" --bindings="$+*"
EXEC /usr/local/bin/snmptraphandling.py "$aR" "Monitor Overload" "WARNING" "$@" "" "SNMP Trap Received at $@ with variab
SDESC
...
EDESC

The trap to clear the WARNING trap may use a different OID and the definition would look like:

Code:

EVENT rPDUBankPhaseNearOverload .1.3.6.1.4.1.318.0.226 "Status Events" NORMAL
FORMAT Received trap "$N" with variables "$+*"
EXEC php /usr/local/nagiosxi/scripts/nxti.php --event_name="$N"  --event_oid="$i" --numeric_oid="$o" --symbolic_oid="$O" --community="$C" --trap_hostname="$R" --trap_ip="$aR" --agent_hostname="$A" --agent_ip="$aA" --category="$c" --severity="$s" --uptime="$T" --datetime="$x $X" --unixtime="$@" --bindings="$+*"
EXEC /usr/local/bin/snmptraphandling.py "$aR" "Monitor Overload" "OK" "$@" "" "SNMP Trap Received at $@ with variab
SDESC
...
EDESC

The important part that links these is the second EXEC that uses the service name of "Monitor Overload".
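
If you end up with a lot of trap definitions, one way to double-check the linkage is to scan the definitions and group the snmptraphandling.py EXEC lines by service name. The Python below is a rough sketch of that idea, assuming the argument order shown in the definitions above (host, service name, state). The sample text is trimmed, the cleared-trap event name is a placeholder, and the helper name is made up.

```python
import shlex
from collections import defaultdict

# Trimmed sample in snmptt.conf style. "exampleNearOverloadCleared" and the
# "trap text" output are placeholders, not real definitions.
SAMPLE = '''\
EVENT rPDUBankPhaseNearOverload .1.3.6.1.4.1.318.0.225 "Status Events" WARNING
EXEC php /usr/local/nagiosxi/scripts/nxti.php --event_name="$N"
EXEC /usr/local/bin/snmptraphandling.py "$aR" "Monitor Overload" "WARNING" "$@" "" "trap text"
EVENT exampleNearOverloadCleared .1.3.6.1.4.1.318.0.226 "Status Events" NORMAL
EXEC /usr/local/bin/snmptraphandling.py "$aR" "Monitor Overload" "OK" "$@" "" "trap text"
'''


def services_by_event(conf_text, handler="snmptraphandling.py"):
    """Map each service name to the (event, state) pairs that feed it."""
    services = defaultdict(list)
    event = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("EVENT "):
            event = line.split()[1]
        elif line.startswith("EXEC ") and event is not None:
            try:
                args = shlex.split(line[len("EXEC "):])
            except ValueError:
                continue  # unbalanced quotes, skip the line
            if len(args) >= 4 and args[0].endswith(handler):
                # args: handler path, host, service name, state, ...
                services[args[2]].append((event, args[3]))
    return dict(services)


for service, feeds in services_by_event(SAMPLE).items():
    print(service, feeds)
```

A service name that shows up with only one event, or a clearing trap that is missing from the list, points at a definition where Passive Service Setup is not configured the same way.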

Posted: **Thu Jul 11, 2019 1:53 pm**

It helps to look at the files for the definitions and then go back and forth with the web GUI. I couldn't figure out why the Overload trap had two EXEC lines and the OverloadCleared only had one EXEC line. It's because the OverloadCleared didn't have the Passive Service Setup checked and configured for it. I was wondering what was generating the EXEC lines, since I didn't have any defined in the additional EXEC lines a little lower on the trap configuration page.

I know the trap also returns some extended information, with $+* expanding the variables out. That should be enough for me to get useful information on where the overload occurred. Running a report on state history for the check would be enough for me to see state changes.

It's absolutely the case that the OIDs are different for the Overload and OverloadCleared. Making sure that the Service Description matches in the passive service setup on the two traps is the glue that I'll need.

I believe I have this straight now, thanks for the guidance!

Posted: **Fri Jul 12, 2019 9:22 am**

Glad to help!

SNMP Traps to Host Groups
