snmptraphandling.py output not liked by the external command

Posted: Mon Oct 12, 2015 1:59 pm
by gormank
The trap does show up on the host though.

nagios.log:
[1444675535] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;txslm2mwspr001;SNMP Traps;0;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 / sysName.0 (OCTETSTR):TXSLM2MWADS002 enterprises.232.11.2.11.1.0 ():4 enterprises.232.11.2.10.7.0 ():02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[1444675535] External command error: Command failed

Nagios is v2.6

Re: snmptraphandling.py output not liked by the external com

Posted: Mon Oct 12, 2015 3:32 pm
by tgriep
Could you post your /etc/snmp/snmptt.conf file so we can review it?
Is the time in sync between the Xi system and the remote device that is sending the TRAP?

Re: snmptraphandling.py output not liked by the external com

Posted: Mon Oct 12, 2015 5:09 pm
by gormank
Time is synced.
Here's the file.

Thanks

Re: snmptraphandling.py output not liked by the external com

Posted: Mon Oct 12, 2015 6:02 pm
by Box293
Here's the event:

Code: Select all

EVENT cpqHoMibHealthStatusArrayChangeTrap .1.3.6.1.4.1.232.0.11020 "Status Events" INFORMATIONAL
FORMAT Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now $3
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "$-*" "Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now $3"
SDESC
A change in the cpqHoMibHealthStatusArray has occurred.
Variables:
  1: sysName
  2: cpqHoTrapFlags
  3: cpqHoMibHealthStatusArray
EDESC

As a test, can you change the EXEC line so it is as follows:

Code: Select all

EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "" "Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now $3"

Basically I removed $-* (the double quotes are still required).

Then:

Code: Select all

service snmptt restart

Does this work?

If not, can you change the EXEC line to:

Code: Select all

EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "" "Health Status Array Change occurred"

Then:

Code: Select all

service snmptt restart

Does this work?

I'm just trying to pinpoint if the received data in $3 is causing issues.
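
For reference, the failing entry follows the Nagios external command layout PROCESS_SERVICE_CHECK_RESULT;<host_name>;<service_description>;<return_code>;<plugin_output>. Below is a minimal Python sketch that splits a logged failure into those fields, so the host name and the output text can be inspected separately. The function name parse_failed_command and the shortened sample line are illustrative only; this is not part of snmptraphandling.py.

```python
import re

# Matches the "External command failed" entries as they appear in nagios.log.
FAILED_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] Error: External command failed -> (?P<command>.*)$"
)


def parse_failed_command(log_line):
    """Return the fields of a failed PROCESS_SERVICE_CHECK_RESULT, or None."""
    match = FAILED_RE.match(log_line.strip())
    if match is None:
        return None
    # maxsplit=4 keeps any semicolons inside the plugin output intact.
    parts = match.group("command").split(";", 4)
    if len(parts) != 5 or parts[0] != "PROCESS_SERVICE_CHECK_RESULT":
        return None
    _, host, service, return_code, output = parts
    if not return_code.isdigit():
        return None
    return {
        "when": match.group("when"),
        "host": host,
        "service": service,
        "return_code": int(return_code),
        "output": output,
    }


# Shortened copy of the entry from the first post (hypothetical test input).
sample = (
    "[1444675535] Error: External command failed -> "
    "PROCESS_SERVICE_CHECK_RESULT;txslm2mwspr001;SNMP Traps;0;"
    "Health Status Array Change occurred"
)
fields = parse_failed_command(sample)
print(fields["host"], fields["service"], fields["return_code"])
```

If the host field is not a host name that Nagios has configured, the command is rejected regardless of what the output text contains.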

Re: snmptraphandling.py output not liked by the external com

Posted: Tue Oct 13, 2015 12:38 pm
by gormank
Note that this seems to be intermittent, but after looking in the archived logs I see >100k occurrences since June. It seems to log the error and the trap, and then just the error, so divide that count in half. Here's where it did and didn't fail (log attached):

[Mon Oct 12 19:34:13 2015] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;txslm2mwapp005;SNMP Traps;0;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 / sysName.0 (OCTETSTR):TXSLM2MWAPP005 enterprises.232.11.2.11.1.0 ():4 enterprises.232.11.2.10.7.0 ():02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Mon Oct 12 19:34:13 2015] External command error: Command failed

[Tue Oct 13 17:15:57 2015] SERVICE ALERT: txslm2mwapp005.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00

It's still happening after the first change (remove $-* but leave the quotes).
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "" "Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now $3"

[Tue Oct 13 17:15:57 2015] SERVICE ALERT: txslm2mwapp005.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:16:02 2015] SERVICE ALERT: txslm2mwftp002.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:16:42 2015] SERVICE ALERT: txslm2mlapp008.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 00 02 00 00 01 02 02 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:17:12 2015] SERVICE ALERT: txslm2mlapp008.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 00 02 00 00 01 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:18:38 2015] Warning: Passive check result was received for service 'SNMP Traps' on host '10.133.133.17', but the host could not be found!
[Tue Oct 13 17:18:38 2015] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.133.133.17;SNMP Traps;0;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 00 02 00 00 01 02 02 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:18:38 2015] External command error: Command failed
[Tue Oct 13 17:19:13 2015] Warning: Passive check result was received for service 'SNMP Traps' on host '10.133.133.17', but the host could not be found!
[Tue Oct 13 17:19:13 2015] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.133.133.17;SNMP Traps;0;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 00 02 00 00 01 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:19:13 2015] External command error: Command failed
[Tue Oct 13 17:19:28 2015] SERVICE ALERT: txslm2mwaaa001.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[Tue Oct 13 17:19:58 2015] SERVICE ALERT: txslm2mwaaa001.ilo;SNMP Traps;OK;HARD;1;Health Status Array Change occurred (11020): A change in the health status of the server has occurred, the status is now 02 02 02 02 02 02 01 02 02 02 02 02 02 02 02 02 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Changing to the following and waiting:
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps" "$s" "$@" "" "Health Status Array Change occurred"

Still there:

[Tue Oct 13 19:30:56 2015] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.133.133.17;SNMP Traps;0;Health Status Array Change occurred
[Tue Oct 13 19:30:56 2015] External command error: Command failed
[Tue Oct 13 19:31:26 2015] Warning: Passive check result was received for service 'SNMP Traps' on host '10.133.133.17', but the host could not be found!
[Tue Oct 13 19:31:26 2015] Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;10.133.133.17;SNMP Traps;0;Health Status Array Change occurred
[Tue Oct 13 19:31:26 2015] External command error: Command failed

Re: snmptraphandling.py output not liked by the external com

Posted: Tue Oct 13, 2015 3:06 pm
by tgriep
Is there a host configuration file for the IP address of 10.133.133.17? From the error message, it looks like it is missing.
If you go in the XI GUI under Admin > Unconfigured Objects, is the object there, and can you configure it?
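
To list every host name or address that Nagios rejected, the "host could not be found" warnings can be pulled out of the log. This is a rough Python sketch, assuming the warning wording matches the entries posted above. unknown_hosts is a made-up helper name, and the sample lines are shortened copies of the posted log.

```python
import re
from collections import Counter

# Wording taken from the warnings posted in this thread.
NOT_FOUND_RE = re.compile(
    r"Passive check result was received for service '(?P<service>[^']*)' "
    r"on host '(?P<host>[^']*)', but the host could not be found!"
)


def unknown_hosts(lines):
    """Count 'host could not be found' warnings per host name or address."""
    counts = Counter()
    for line in lines:
        match = NOT_FOUND_RE.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


# Shortened sample entries (hypothetical test input based on the posted log).
sample_log = [
    "[Tue Oct 13 17:18:38 2015] Warning: Passive check result was received "
    "for service 'SNMP Traps' on host '10.133.133.17', but the host could "
    "not be found!",
    "[Tue Oct 13 17:18:38 2015] External command error: Command failed",
    "[Tue Oct 13 17:19:13 2015] Warning: Passive check result was received "
    "for service 'SNMP Traps' on host '10.133.133.17', but the host could "
    "not be found!",
    "[Tue Oct 13 17:19:28 2015] SERVICE ALERT: txslm2mwaaa001.ilo;SNMP Traps;"
    "OK;HARD;1;Health Status Array Change occurred",
]

for host, count in unknown_hosts(sample_log).most_common():
    print(host, count)
```

To run it against a real log, pass an open file object instead of the list, for example open("/usr/local/nagios/var/nagios.log"). That path is an assumption about a default install, so adjust it to your system.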

Re: snmptraphandling.py output not liked by the external com

Posted: Wed Oct 14, 2015 11:16 am
by gormank
I did find a wrong address yesterday so that might be it. .17 is in a host file with yesterday's date so that must be the one. I also don't see the error in today's log file.
I wouldn't think an unconfigured (or misconfigured in this case) object would cause that error, but you can close this since it seems to have stopped...

Thanks!

Re: snmptraphandling.py output not liked by the external com

Posted: Wed Oct 14, 2015 11:23 am
by rkennedy
Sometimes it's the little things that throw it all off. It's good to see this is resolved. Feel free to open another thread if the issue persists.