Moving from Core to XI - SNMP configuration

Post by **chris.trotter** » Wed Oct 19, 2011 12:48 pm

Ack, just lost my last post due to login timeout, so here goes round 2.

Our current setup is passive:
- trap is sent from host in question
- snmptrapd receives the trap, formats it
- snmptt takes the trap and the config file says how to deal with it (WARNING, CRITICAL)
- submit_check_result sends to Nagios

The service is configured like this:
- set to passive
- uses 'check_host_alive' to set the 'ok' status
- if a WARNING or CRITICAL trap comes in from 'submit_check_result', status is changed to mirror that
- if no more traps come in, status is reset by the 'check_host_alive' to 'ok'

Note that this was configured 2-3 years ago.

Looking at our new XI server:
- automated install run for 'NagiosXI-SNMPTrap.sh'
- snmp config data has been migrated over (snmptrapd.conf, snmpd.conf, snmp.conf)
- mib has been added using addmib
- specific snmptt.conf.xyz files have also been migrated over (these contain the WARNING, CRITICAL settings)
- relevant hosts have been added

The next step is to set up how the XI system will receive and act on traps.

To get to the point - our old system used 'submit_check_result' (/usr/lib/nagios/plugins/eventhandlers/), but XI doesn't seem to have any eventhandlers directory, or that specific command. 'submit_check_result' uses 'nagios.cmd'.

Do I just copy this over and change the pathing?

Post by **nscott** » Wed Oct 19, 2011 1:13 pm

In XI we call /usr/local/bin/snmptraphandling.py, which should be invoked like this:

usage: services.py <HOST> <SERVICE> <SEVERITY> <TIME> <PERFDATA> <DATA>

So if it would be easy, edit the snmptt.conf to do it that way instead. Or, for minimal effort on your part, you can keep using the submit_check_result script with XI by copying it from this script:

    #!/bin/sh

    # SUBMIT_CHECK_RESULT
    # Written by Ethan Galstad (nagios@nagios.org)
    # Last Modified: 02-18-2002
    #
    # This script will write a command to the Nagios command
    # file to cause Nagios to process a passive service check
    # result.  Note: This script is intended to be run on the
    # same host that is running Nagios.  If you want to
    # submit passive check results from a remote machine, look
    # at using the nsca addon.
    #
    # Arguments:
    #  $1 = host_name (Short name of host that the service is
    #       associated with)
    #  $2 = svc_description (Description of the service)
    #  $3 = return_code (An integer that determines the state
    #       of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
    #       3=UNKNOWN).
    #  $4 = plugin_output (A text string that should be used
    #       as the plugin output for the service check)
    #

    echocmd="/bin/echo"

    CommandFile="/usr/local/nagios/var/rw/nagios.cmd"

    # get the current date/time in seconds since UNIX epoch
    datetime=`date +%s`

    # create the command line to add to the command file
    cmdline="[$datetime] PROCESS_SERVICE_CHECK_RESULT;$1;$2;$3;$4"

    # append the command to the end of the command file
    # (quoted so spaces and wildcards in the plugin output are passed through unchanged)
    $echocmd "$cmdline" >> "$CommandFile"

And use that as you normally would. I edited it to reflect where nagios.cmd is on XI; otherwise, you should be able to use it as you used to, and put it anywhere you want (I would suggest the nagios/libexec directory).
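For reference, a minimal sketch of the line that script ends up appending to nagios.cmd. The function name, host name, service description, and output text are made-up values for illustration only.

```shell
#!/bin/sh
# Illustrative helper (not part of XI): format a passive service check
# result the same way submit_check_result does.
#   $1 = timestamp (seconds since epoch), $2 = host_name,
#   $3 = svc_description, $4 = return_code, $5 = plugin_output
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with hypothetical values; prints the line to stdout.
format_passive_result "$(date +%s)" "tor-switch-01" "TRAP" 2 "Link down on port 4"
```

To hand the result to Nagios rather than print it, the output would be appended to /usr/local/nagios/var/rw/nagios.cmd by a user that is allowed to write to that pipe.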

Post by **chris.trotter** » Thu Oct 20, 2011 10:34 am

Great, I'll first try keeping our current config.

I copied in the script and ran sed to update the relevant snmptt.conf files with the new path. So far, so good.
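A sed pass of that kind could look roughly like this. Both paths are assumptions: the old one is the eventhandlers location from the Core box mentioned earlier in the thread, and the new one assumes the script was dropped into the nagios/libexec directory as suggested. The function name is hypothetical.

```shell
#!/bin/sh
# Assumed paths; adjust both to match your systems.
OLD_PATH="/usr/lib/nagios/plugins/eventhandlers/submit_check_result"
NEW_PATH="/usr/local/nagios/libexec/submit_check_result"

# Rewrite the handler path in one snmptt.conf file, keeping a .bak copy.
# '|' is used as the sed delimiter so the slashes need no escaping.
repoint_handler() {
    sed -i.bak "s|$OLD_PATH|$NEW_PATH|g" "$1"
}
```

It would then be called once per migrated file, for example in a loop over the snmptt.conf.xyz files wherever they live on the XI server.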

However, for whatever reason SNMPTRAPD is refusing to restart (stop fails with pidof error). So I rebooted...snmptrapd is showing as running, but when I try to restart it...I get the pidof error.

-------------------------------------------------------------------------
[chris@tor-nagios-02 ~]$ sudo service snmptrapd restart
[sudo] password for chris:
Stopping snmptrapd: pidof: invalid options on command line!

pidof: invalid options on command line!

[FAILED]
Starting snmptrapd:
-------------------------------------------------------------------------

I tried reverting the snmptrapd.conf file to the original (empty), but no luck. Rebooted after doing that, and still same thing.

Any ideas? Anyone seen this behaviour before?

Post by **nscott** » Thu Oct 20, 2011 11:33 am

Hmm, sounds like /etc/init.d/snmptrapd got edited somehow? I put a question mark there because that's rather odd. But if it's giving you an improper command line argument error when starting using:

service snmptrapd start

That's the conclusion I come to. It shouldn't make a difference, but does this start it successfully:

/etc/init.d/snmptrapd start

Also, open up your /etc/init.d/snmptrapd in some text editor and compare it to this one:

#!/bin/bash

# ucd-snmp init file for snmptrapd
#
# chkconfig: - 50 50
# description: Simple Network Management Protocol (SNMP) Trap Daemon
#
# processname: /usr/sbin/snmptrapd
# config: /etc/snmp/snmptrapd.conf
# config: /usr/share/snmp/snmptrapd.conf
# pidfile: /var/run/snmptrapd.pid


### BEGIN INIT INFO
# Provides: snmptrapd
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Should-Start:
# Should-Stop:
# Default-Start:
# Default-Stop:
# Short-Description: start and stop Net-SNMP trap daemon
# Description: Simple Network Management Protocol (SNMP) trap daemon
### END INIT INFO

# source function library
. /etc/init.d/functions

OPTIONS="-Lsd -On -p /var/run/snmptrapd.pid"
if [ -e /etc/sysconfig/snmptrapd ]; then
  . /etc/sysconfig/snmptrapd
fi

RETVAL=0
prog="snmptrapd"
binary=/usr/sbin/snmptrapd
pidfile=/var/run/snmptrapd.pid

start() {
	[ -x $binary ] || exit 5
	echo -n $"Starting $prog: "
        daemon --pidfile=$pidfile /usr/sbin/snmptrapd $OPTIONS
	RETVAL=$?
	echo
	touch /var/lock/subsys/snmptrapd
	return $RETVAL
}

stop() {
	echo -n $"Stopping $prog: "
	killproc -On -p $pidfile /usr/sbin/snmptrapd
	RETVAL=$?
	echo
	rm -f /var/lock/subsys/snmptrapd
	return $RETVAL
}

reload(){
	stop
	start
}

restart(){
	stop
	start
}

condrestart(){
    [ -e /var/lock/subsys/snmptrapd ] && restart
    return 0
}

case "$1" in
  start)
	start
	RETVAL=$?
	;;
  stop)
	stop
	RETVAL=$?
	;;
  restart)
	restart
	RETVAL=$?
        ;;
  reload|force-reload)
	reload
	RETVAL=$?
        ;;
  condrestart|try-restart)
	condrestart
	RETVAL=$?
	;;
  status)
        status snmptrapd
	RETVAL=$?
        ;;
  *)
	echo $"Usage: $0 {start|stop|status|restart|condrestart|reload|force-reload}"
	RETVAL=2
esac

exit $RETVAL

Are they different? Does using this one allow snmptrapd to start? Is snmpd started?

Post by **chris.trotter** » Thu Oct 20, 2011 11:54 am

Yeah, so that's the weird thing - it starts fine.

[root@tor-nagios-02 snmp]# service snmptrapd status
snmptrapd (pid 8992) is running...

I can kill the PID, then run 'service snmptrapd start', and it reports OK and starts successfully.

If I try to do anything like stop, restart, or reload, I get the error.

Also, we're not receiving any traps, so clearly even though it's running, it's not working as it should.

Post by **chris.trotter** » Thu Oct 20, 2011 12:03 pm

Hm, interesting. I checked out the 'functions' file, and found that it was not set to be executable, but on the old server it was set as executable.

I ran (chmod +x functions) and rebooted, but same thing, still can't restart the service as above.

Post by **chris.trotter** » Thu Oct 20, 2011 12:17 pm

I took that snmptrapd text and created a new snmptrapd file in /etc/init.d/, used that to get status, works. Used it to restart, fails with the same error.

The files are not identical, but either way, it fails, so that's not it.
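One thing that might be worth a look, given that both files fail the same way: the stop() function in the script posted above calls killproc with -On ahead of -p. -On is an snmptrapd output option, and the killproc in a Red Hat-style /etc/init.d/functions, as far as can be told, only looks for an optional -p pidfile before the program path. The sketch below is a simplified, hypothetical stand-in for that argument handling (not the real function), to show how the name passed on to pidof could end up being -On.

```shell
#!/bin/sh
# Simplified stand-in for killproc's argument handling (an assumption,
# not the real code): skip an optional leading "-p pidfile", then treat
# the next argument as the program and take its base name.
killproc_target() {
    if [ "$1" = "-p" ]; then
        shift 2
    fi
    printf '%s\n' "${1##*/}"
}

# Without -On, the program name is found as expected.
killproc_target -p /var/run/snmptrapd.pid /usr/sbin/snmptrapd       # prints: snmptrapd

# With -On in front, -On itself is taken as the program name.
killproc_target -On -p /var/run/snmptrapd.pid /usr/sbin/snmptrapd   # prints: -On
```

If that is what is happening here, removing -On from the killproc line (and leaving it in OPTIONS for the start) would be the thing to try. Running `bash -x /etc/init.d/snmptrapd stop` should show the actual pidof call.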

Post by **chris.trotter** » Thu Oct 20, 2011 12:39 pm

Hm, progress....I think.

I am manually sending traps (as per: http://technotes.twosmallcoins.com/?p=369 ) and I'm seeing them in the logs!

This is from /var/log/snmptrapd.log :

couldn't open udp:162
2011-10-20 13:34:32 localhost [127.0.0.1] (via UDP: [senderIPredact]:60640->[nagiosIPredact]) TRAP, SNMP v1, community REDACT
.1.3.6.1.4.1.2021.13.990 Enterprise Specific Trap (17) Uptime: 2 days, 21:47:37.75
.1.3.6.1.2.1.1.6.0 = STRING: test test test 5555

This is /var/log/messages:
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo: status Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo: status Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo("status", NULL, ...): Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo("status", NULL, ...): Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: couldn't open status -- errno 0 ("Success")
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: couldn't open status -- errno 0 ("Success")
Oct 20 13:09:53 tor-nagios-02 snmptrapd[4998]: couldn't open udp:162 -- errno 98 ("Address already in use")
Oct 20 13:09:53 tor-nagios-02 snmptrapd[4998]: couldn't open udp:162 -- errno 98 ("Address already in use")
Oct 20 13:34:32 tor-nagios-02 snmptrapd[1269]: 2011-10-20 13:34:32 localhost [127.0.0.1] (via UDP: [senderIPredact]:60640->[nagiosIPredact]) TRAP, SNMP v1, community REDACT1#012#011.1.3.6.1.4.1.2021.13.990 Enterprise Specific Trap (17) Uptime: 2 days, 21:47:37.75#012#011.1.3.6.1.2.1.1.6.0 = STRING: test test test 5555
Oct 20 13:38:11 tor-nagios-02 snmptt-sys[504]: Unable to delete trap file #snmptt-trap-1319132072798362 from spool dir

Does this help?

EDIT: Forgot some of the log.
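For anyone repeating the test, a trap like the one in the logs above can be sent by hand with net-snmp's snmptrap, roughly as sketched here. The target host and community string are placeholders, the function name and the DRY_RUN switch are made up for this example, and the OIDs and values are simply the ones visible in the log excerpt.

```shell
#!/bin/sh
# Hypothetical defaults: replace with your XI server and community string.
TRAP_HOST="${TRAP_HOST:-nagios.example.com}"
COMMUNITY="${COMMUNITY:-public}"

# SNMP v1 argument order for net-snmp's snmptrap:
#   enterprise OID, agent address, generic type (6 = enterpriseSpecific),
#   specific type, uptime ('' lets snmptrap fill it in), then varbinds
#   as OID / type / value triples ('s' = string).
send_test_trap() {
    set -- snmptrap -v 1 -c "$COMMUNITY" "$TRAP_HOST" \
        .1.3.6.1.4.1.2021.13.990 localhost 6 17 '' \
        .1.3.6.1.2.1.1.6.0 s "test test test 5555"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show what would be run.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

send_test_trap
```

By default this only prints the command. Setting DRY_RUN=0 in the environment (along with TRAP_HOST and COMMUNITY) sends the trap for real.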

Post by **chris.trotter** » Thu Oct 20, 2011 2:27 pm

Update on this: it's starting to look like, while we do have issues on our side, the real reason traps are not coming in is that the sending server was misconfigured/buggy.

I'm still trying to sort out why that service fails to stop/restart, however, along with the other issues present.

Post by **nscott** » Fri Oct 21, 2011 10:14 am

chris,

In your snmptt.ini file, enable system logging and unknown trap logging. Then make the directory /var/log/snmptt/ and send a few traps to the server. Tail both of those logs.
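A rough sketch of those steps. The snmptt.ini key names in the comments and the two log file names are the usual defaults as best recalled; treat them as assumptions and check them against your own snmptt.ini. The function name is hypothetical.

```shell
#!/bin/sh
# Assumed snmptt.ini settings (verify against your copy):
#   log_system_enable = 1
#   log_system_file = /var/log/snmptt/snmpttsystem.log
#   unknown_trap_log_enable = 1
#   unknown_trap_log_file = /var/log/snmptt/snmpttunknown.log

# Create the log directory if it is missing and print the two log paths.
prepare_snmptt_logs() {
    dir="${1:-/var/log/snmptt}"
    mkdir -p "$dir" || return 1
    printf '%s\n' "$dir/snmpttsystem.log" "$dir/snmpttunknown.log"
}
```

Run as root, `prepare_snmptt_logs` creates /var/log/snmptt; after restarting snmptt and sending a few traps, `tail -F` on the two printed paths shows what snmptt makes of them. The directory also has to be writable by whichever user snmptt runs as.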

Nagios Support Forum
