Moving from Core to XI - SNMP configuration
Ack, just lost my last post due to login timeout, so here goes round 2.
Our current setup is passive:
- trap is sent from host in question
- snmptrapd receives the trap, formats it
- snmptt takes the trap and the config file says how to deal with it (WARNING, CRITICAL)
- submit_check_result sends to Nagios
The service is configured like this:
- set to passive
- uses 'check_host_alive' to set the 'ok' status
- if a WARNING or CRITICAL trap comes in from 'submit_check_result', status is changed to mirror that
- if no more traps come in, status is reset by the 'check_host_alive' to 'ok'
Note that this was configured 2-3 years ago.
Looking at our new XI server:
- automated install run for 'NagiosXI-SNMPTrap.sh'
- snmp config data has been migrated over (snmptrapd.conf, snmpd.conf, snmp.conf)
- mib has been added using addmib
- specific snmptt.conf.xyz files have also been migrated over (these contain the WARNING, CRITICAL settings)
- relevant hosts have been added
The next step is to set up how the XI system will receive and act on traps.
To get to the point - our old system used 'submit_check_result' (/usr/lib/nagios/plugins/eventhandlers/), but XI doesn't seem to have any eventhandlers directory, or that specific command. 'submit_check_result' uses 'nagios.cmd'.
Do I just copy this over and change the pathing?
Re: Moving from Core to XI - SNMP configuration
How we do it in XI is to call /usr/local/bin/snmptraphandling.py, which should be called like this:
Code: Select all
usage: services.py <HOST> <SERVICE> <SEVERITY> <TIME> <PERFDATA> <DATA>
So it would be easy to edit snmptt.conf to do it that way instead. Or, for minimal effort on your part, you can use the submit_check_result script with XI as well by copying it from the script below and using it as you normally would. I edited it to reflect where nagios.cmd is on XI; otherwise, you should be able to use it as you used to, and put it anywhere you want (I would suggest the nagios/libexec directory).
Code: Select all
#!/bin/sh
# SUBMIT_CHECK_RESULT
# Written by Ethan Galstad (nagios@nagios.org)
# Last Modified: 02-18-2002
#
# This script will write a command to the Nagios command
# file to cause Nagios to process a passive service check
# result. Note: This script is intended to be run on the
# same host that is running Nagios. If you want to
# submit passive check results from a remote machine, look
# at using the nsca addon.
#
# Arguments:
# $1 = host_name (Short name of host that the service is
# associated with)
# $2 = svc_description (Description of the service)
# $3 = return_code (An integer that determines the state
# of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
# 3=UNKNOWN).
# $4 = plugin_output (A text string that should be used
# as the plugin output for the service check)
#
echocmd="/bin/echo"
CommandFile="/usr/local/nagios/var/rw/nagios.cmd"
# get the current date/time in seconds since UNIX epoch
datetime=`date +%s`
# create the command line to add to the command file
cmdline="[$datetime] PROCESS_SERVICE_CHECK_RESULT;$1;$2;$3;$4"
# append the command to the end of the command file
$echocmd "$cmdline" >> "$CommandFile"
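As a rough illustration of what the script above does, here is a minimal sketch of the same passive submission written as a shell function. The function name `submit_result` and the `COMMAND_FILE` override are hypothetical and only there to make the sketch easy to try against a scratch file; the command format and the default path are taken from the script above.

```shell
#!/bin/sh
# Sketch only: write a PROCESS_SERVICE_CHECK_RESULT line to the Nagios command file.
# COMMAND_FILE (hypothetical variable) defaults to the XI path from the script above.
COMMAND_FILE="${COMMAND_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# submit_result HOST SERVICE RETURN_CODE OUTPUT
submit_result() {
    if [ "$#" -ne 4 ]; then
        echo "usage: submit_result HOST SERVICE RETURN_CODE OUTPUT" >&2
        return 1
    fi
    # Only 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN are meaningful states.
    case "$3" in
        0|1|2|3) ;;
        *) echo "return code must be 0-3, got '$3'" >&2; return 1 ;;
    esac
    # printf with quoted arguments keeps spaces and wildcard characters
    # in the plugin output from being altered by the shell.
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4" >> "$COMMAND_FILE"
}
```

A call such as `submit_result tor-switch-01 "SNMP Traps" 2 "link down"` (host and service names made up for the example) should append one line to the command file, provided the calling user has write access to it.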
Nicholas Scott
Former Nagios employee
Re: Moving from Core to XI - SNMP configuration
Great, I'll first try keeping our current config.
I copied in the script and ran sed to update the relevant snmptt.conf files with the new path, so far, so good.
However, for whatever reason SNMPTRAPD is refusing to restart (stop fails with pidof error). So I rebooted...snmptrapd is showing as running, but when I try to restart it...I get the pidof error.
-------------------------------------------------------------------------
[chris@tor-nagios-02 ~]$ sudo service snmptrapd restart
[sudo] password for chris:
Stopping snmptrapd: pidof: invalid options on command line!
pidof: invalid options on command line!
[FAILED]
Starting snmptrapd:
-------------------------------------------------------------------------
I tried reverting the snmptrapd.conf file to the original (empty), but no luck. Rebooted after doing that, and still same thing.
Any ideas? Anyone seen this behaviour before?
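For reference, the sed update mentioned in this post could look something like the sketch below. The old path is the one from the first post; the new path assumes the script was dropped into /usr/local/nagios/libexec, which is a guess based on the suggestion above. The function name `update_handler_path` is made up for the example.

```shell
#!/bin/sh
# Sketch only: rewrite the handler path inside one or more snmptt.conf files.
# update_handler_path OLD_PATH NEW_PATH FILE...
update_handler_path() {
    old=$1
    new=$2
    shift 2
    # '#' is used as the sed delimiter because the paths contain '/'.
    # -i.bak edits in place and leaves a .bak copy of each file.
    sed -i.bak "s#${old}#${new}#g" "$@"
}
```

It might be invoked as `update_handler_path /usr/lib/nagios/plugins/eventhandlers/submit_check_result /usr/local/nagios/libexec/submit_check_result /etc/snmp/snmptt.conf.*`, where the /etc/snmp location of the snmptt.conf.xyz files is also an assumption.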
Re: Moving from Core to XI - SNMP configuration
Hmm, sounds like /etc/init.d/snmptrapd got edited somehow? I put a question mark there because that's rather odd. But if it's giving you an improper command line argument error when starting using:
service snmptrapd start
that's the conclusion I come to. It shouldn't make a difference, but does this start it successfully:
/etc/init.d/snmptrapd start
Also, open up your /etc/init.d/snmptrapd in a text editor and compare it to the one below. Are they different? Does using this one allow snmptrapd to start? Is snmpd started?
Code: Select all
#!/bin/bash
# ucd-snmp init file for snmptrapd
#
# chkconfig: - 50 50
# description: Simple Network Management Protocol (SNMP) Trap Daemon
#
# processname: /usr/sbin/snmptrapd
# config: /etc/snmp/snmptrapd.conf
# config: /usr/share/snmp/snmptrapd.conf
# pidfile: /var/run/snmptrapd.pid
### BEGIN INIT INFO
# Provides: snmptrapd
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Should-Start:
# Should-Stop:
# Default-Start:
# Default-Stop:
# Short-Description: start and stop Net-SNMP trap daemon
# Description: Simple Network Management Protocol (SNMP) trap daemon
### END INIT INFO
# source function library
. /etc/init.d/functions
OPTIONS="-Lsd -On -p /var/run/snmptrapd.pid"
if [ -e /etc/sysconfig/snmptrapd ]; then
    . /etc/sysconfig/snmptrapd
fi
RETVAL=0
prog="snmptrapd"
binary=/usr/sbin/snmptrapd
pidfile=/var/run/snmptrapd.pid
start() {
    [ -x $binary ] || exit 5
    echo -n $"Starting $prog: "
    daemon --pidfile=$pidfile /usr/sbin/snmptrapd $OPTIONS
    RETVAL=$?
    echo
    touch /var/lock/subsys/snmptrapd
    return $RETVAL
}
stop() {
    echo -n $"Stopping $prog: "
    killproc -On -p $pidfile /usr/sbin/snmptrapd
    RETVAL=$?
    echo
    rm -f /var/lock/subsys/snmptrapd
    return $RETVAL
}
reload(){
    stop
    start
}
restart(){
    stop
    start
}
condrestart(){
    [ -e /var/lock/subsys/snmptrapd ] && restart
    return 0
}
case "$1" in
    start)
        start
        RETVAL=$?
        ;;
    stop)
        stop
        RETVAL=$?
        ;;
    restart)
        restart
        RETVAL=$?
        ;;
    reload|force-reload)
        reload
        RETVAL=$?
        ;;
    condrestart|try-restart)
        condrestart
        RETVAL=$?
        ;;
    status)
        status snmptrapd
        RETVAL=$?
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|reload|force-reload}"
        RETVAL=2
esac
exit $RETVAL
Nicholas Scott
Former Nagios employee
Re: Moving from Core to XI - SNMP configuration
Yeah, so that's the weird thing - it starts fine.
[root@tor-nagios-02 snmp]# service snmptrapd status
snmptrapd (pid 8992) is running...
I can kill the PID, then run the service snmptrapd start and it gives an ok and successful start.
If I try to do anything like stop, restart, or reload, I get the error.
Also, we're not receiving any traps, so clearly even though it's running, it's not working as it should.
Re: Moving from Core to XI - SNMP configuration
Hm, interesting. I checked out the 'functions' file, and found that it was not set to be executable, but on the old server it was set as executable.
I ran (chmod +x functions) and rebooted, but same thing, still can't restart the service as above.
Re: Moving from Core to XI - SNMP configuration
I took that snmptrapd text and created a new snmptrapd file in /etc/init.d/, used that to get status, works. Used it to restart, fails with the same error.
The files are not identical, but either way, it fails, so that's not it.
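Since both init scripts fail the same way on stop, one thing that may be worth checking (an untested guess, not something confirmed in this thread) is the killproc line in stop(). The reference script posted above calls `killproc -On -p $pidfile`, and `-On` is an snmptrapd output option rather than something killproc understands. As far as I know, killproc in /etc/init.d/functions on RHEL/CentOS only handles `-p`, `-b` and `-d`, so an unexpected leading option might end up being handed to pidof, which would match the "pidof: invalid options" message. The sketch below just lists suspicious killproc lines; the function name is made up for the example.

```shell
#!/bin/sh
# Sketch only: print (with line numbers) any killproc call in an init script
# whose first option is something other than -p, -b or -d.
# suspect_killproc_lines INIT_SCRIPT
suspect_killproc_lines() {
    grep -nE 'killproc[[:space:]]+-[^pbd]' "$1"
}
```

Running `suspect_killproc_lines /etc/init.d/snmptrapd` and, if a line shows up, trying the stop with plain `killproc -p $pidfile /usr/sbin/snmptrapd` on a copy of the script would be a cheap way to confirm or rule this out.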
Re: Moving from Core to XI - SNMP configuration
Hm, progress....I think.
I am manually sending traps (as per: http://technotes.twosmallcoins.com/?p=369 ) and I'm seeing them in the logs!
This is from /var/log/snmptrapd.log :
couldn't open udp:1622011-10-20 13:34:32 localhost [127.0.0.1] (via UDP: [senderIPredact]:60640->[nagiosIPredact]) TRAP, SNMP v1, community REDACT
.1.3.6.1.4.1.2021.13.990 Enterprise Specific Trap (17) Uptime: 2 days, 21:47:37.75
.1.3.6.1.2.1.1.6.0 = STRING: test test test 5555
This is /var/log/messages:
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo: status Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo: status Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo("status", NULL, ...): Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: getaddrinfo("status", NULL, ...): Name or service not known
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: couldn't open status -- errno 0 ("Success")
Oct 20 13:09:31 tor-nagios-02 snmptrapd[4892]: couldn't open status -- errno 0 ("Success")
Oct 20 13:09:53 tor-nagios-02 snmptrapd[4998]: couldn't open udp:162 -- errno 98 ("Address already in use")
Oct 20 13:09:53 tor-nagios-02 snmptrapd[4998]: couldn't open udp:162 -- errno 98 ("Address already in use")
Oct 20 13:34:32 tor-nagios-02 snmptrapd[1269]: 2011-10-20 13:34:32 localhost [127.0.0.1] (via UDP: [senderIPredact]:60640->[nagiosIPredact]) TRAP, SNMP v1, community REDACT1#012#011.1.3.6.1.4.1.2021.13.990 Enterprise Specific Trap (17) Uptime: 2 days, 21:47:37.75#012#011.1.3.6.1.2.1.1.6.0 = STRING: test test test 5555
Oct 20 13:38:11 tor-nagios-02 snmptt-sys[504]: Unable to delete trap file #snmptt-trap-1319132072798362 from spool dir
Does this help?
EDIT: Forgot some of the log.
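For anyone wanting to reproduce the test trap shown in the logs above, a sketch along these lines might work. It assumes the Net-SNMP `snmptrap` tool is installed; the enterprise OID, specific trap number (17) and varbind are copied from the log output, while the target host, community string and the `DRY_RUN` switch are placeholders invented for the example.

```shell
#!/bin/sh
# Sketch only: send an SNMP v1 enterprise-specific trap like the one in the log.
# send_test_trap TARGET COMMUNITY MESSAGE
# Set DRY_RUN=1 to print the command instead of running it.
send_test_trap() {
    target=$1
    community=$2
    message=$3
    # v1 argument order: enterprise OID, agent address, generic trap (6 =
    # enterpriseSpecific), specific trap, uptime ('' = let snmptrap fill it in),
    # then OID / type / value triples.
    set -- snmptrap -v 1 -c "$community" "$target" \
        .1.3.6.1.4.1.2021.13.990 localhost 6 17 '' \
        .1.3.6.1.2.1.1.6.0 s "$message"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, `send_test_trap nagios-xi.example.com public "test test test 5555"` (hostname and community are made up) should produce an entry in /var/log/snmptrapd.log if snmptrapd is listening and accepts that community.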
Re: Moving from Core to XI - SNMP configuration
Update on this: Starting to look like while we have issues, the real reason why traps are not coming in is that the sending server was misconfigured/buggy.
Still trying to sort out why that service is failing to stop/restart however, along with the other issues present.
Re: Moving from Core to XI - SNMP configuration
chris,
In your snmptt.ini file, enable system logging and unknown trap logging. Then create the directory /var/log/snmptt/, send a few traps to the server, and tail both of those logs.
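The steps above could be sketched roughly as follows. This assumes the two settings in snmptt.ini are named `log_system_enable` and `unknown_trap_log_enable` and already exist in the file with a value of 0, which is how I remember the stock file; check your own copy before relying on it. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: switch on snmptt system logging and unknown trap logging.
# enable_snmptt_logging INI_FILE
enable_snmptt_logging() {
    # The patterns are anchored at the start of the line so that
    # syslog_system_enable is left untouched. A .bak copy is kept.
    sed -i.bak \
        -e 's/^[[:space:]]*log_system_enable[[:space:]]*=.*/log_system_enable = 1/' \
        -e 's/^[[:space:]]*unknown_trap_log_enable[[:space:]]*=.*/unknown_trap_log_enable = 1/' \
        "$1"
}
```

A typical sequence would then be `mkdir -p /var/log/snmptt`, `enable_snmptt_logging /etc/snmp/snmptt.ini`, a restart of snmptt, and a `tail -f` on the two log files named by `log_system_file` and `unknown_trap_log_file` in the same ini file. The /etc/snmp/snmptt.ini location is an assumption.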
Nicholas Scott
Former Nagios employee