Problems after 2014 Upgrade

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Problems after 2014 Upgrade

Post by arnab.roy »

Hi Support,

Since the upgrade we have been having major problems.

Nagios is no longer processing external commands, e.g. we are unable to process SNMP traps.

We are also seeing these messages:

Jul 24 21:21:33 karma nagios: wproc: Core Worker 10396: job 41 (pid=13272) timed out. Killing it
Jul 24 21:21:33 karma nagios: wproc: Core Worker 10396: kill(-13272, SIGKILL) failed: Operation not permitted
Jul 24 21:21:34 karma nagios: wproc: Core Worker 10394: job 41 (pid=13286) timed out. Killing it
Jul 24 21:21:34 karma nagios: wproc: Core Worker 10394: kill(-13286, SIGKILL) failed: Operation not permitted
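
To gauge how often the workers are timing out, messages like the ones above can be tallied per worker. This is only a rough sketch: the function name count_worker_timeouts is made up for illustration, and /var/log/messages is an assumed syslog location that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): tally "timed out" messages per
# Core Worker. Reads syslog lines on stdin, prints "<worker pid> <count>".
count_worker_timeouts() {
    awk '/wproc: Core Worker [0-9]+: job [0-9]+ .*timed out/ {
             for (i = 1; i < NF; i++) {
                 if ($i == "Worker") {
                     id = $(i + 1)
                     sub(/:$/, "", id)   # strip the trailing colon
                     count[id]++
                 }
             }
         }
         END { for (id in count) print id, count[id] }' | sort
}

# Assumption: syslog goes to /var/log/messages on this host.
LOG_FILE="${LOG_FILE:-/var/log/messages}"
if [ -r "$LOG_FILE" ]; then
    count_worker_timeouts < "$LOG_FILE"
else
    echo "cannot read $LOG_FILE" >&2
fi
```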


Whenever a trap comes in, we see this in the error log:

karma nagios: Error: External command failed -> PROCESS_SERVICE_CHECK_RESULT;wmin-aruba;SNMP Traps;0; 07 DE 07 18 14 19 0C 00 2B 00 00 10.100.119.25 2C 54 CF E5 BB 13 00 24 6C 57 E2 F0 WMIN-RAP2-ISLS 0 0 4 / wlsxTrapTime (OCTETSTR):07 DE 07 18 14 19 0C 00 2B 00 00 wlsxTrapUserIpAddress.0 (IPADDR):10.100.119.25 wlsxTrapUserPhyAddress.0 (OCTETSTR):2C 54 CF E5 BB 13 wlsxTrapAPBSSID.0 (OCTETSTR):00 24 6C 57 E2 F0 wlsxTrapAPName.0 (OCTETSTR):WMIN-RAP2-ISLS wlsxTrapCardSlot.0 (INTEGER32):0 wlsxTrapPortNumber.0 (INTEGER32):0 wlsxTrapUserAttributeChangeType.0 (INTEGER):4

Can we have some urgent feedback? Otherwise we might need to perform a rollback.

Many Thanks
Arnab
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Problems after 2014 Upgrade

Post by lmiltchev »

Were you able to actually complete the upgrade successfully? What is the output of the following commands?

Code:

/usr/local/nagios/bin/nagios | head -2
/usr/local/nagios/bin/ndo2db | head -2

Do you have any config errors? Are you using Mod-Gearman or MK Livestatus?
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI
Contact:

Re: Problems after 2014 Upgrade

Post by snapon_admin »

I would also check:

Code:

service snmptrapd status
service snmptt status

Your post made me realize we haven't received any traps in a while, so I checked these two things on our server and snmptt wasn't running. I started it and traps started streaming in. I'm not sure why it was stopped, but judging by the last time we received a trap, it stopped right around the time we did the 2014 upgrade. I'm not sure whether it's related, or even the same issue you're having, but I figured I'd throw in my 2 cents.
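
The two status checks above can be wrapped in a small loop so that a stopped daemon stands out. This is a minimal sketch, assuming the same `service` wrapper used in the commands above; report_service is a made-up helper name, and it relies on the common convention that `service <name> status` exits with 0 only when the service is running.

```shell
#!/bin/sh
# Hypothetical helper: report whether a service is running, based on the
# exit status of "service <name> status" (0 is conventionally "running").
report_service() {
    if service "$1" status >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: NOT running"
    fi
}

# The two daemons involved in receiving and translating traps.
for svc in snmptrapd snmptt; do
    report_service "$svc"
done
```

If one of them is reported as not running, starting it with `service snmptt start` (as root) is what got traps flowing again in the case described above.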
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Problems after 2014 Upgrade

Post by sreinhardt »

XI upgrades should not affect running services other than mysql, postgres, httpd, iptables, selinux, nagios, npcd, and probably mrtg. The upgrade really should have nothing to do with snmptt, snmptrapd, or snmpd, especially since we consider that a separate integration. However, I do find it interesting that you both report this starting around upgrade time.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Re: Problems after 2014 Upgrade

Post by arnab.roy »

Hi, the output is:

Nagios Core 4.0.7

Ndo2db 2.0.0

The upgrade did throw up some errors for the Nagios config, so I went in and cleaned up the core config and then ran the upgrade. We are not using Mod-Gearman or MK Livestatus, but we do use NRDP and NSCA. I think the problem we are observing is with writing to nagios.cmd, because that is where the snmptraphandling.py script writes its results.

Can you suggest any further troubleshooting steps?
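
One way to test writing to the command file directly is to submit a passive check result by hand, in the same `[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output` form that Nagios expects for this external command. The following is a rough sketch, not the trap handler's actual code: format_passive_result is a made-up helper, the "test trap" output text is invented, and the command file path is assumed to be the default XI location.

```shell
#!/bin/sh
# Hypothetical helper: build one passive service check result line.
# $1=epoch timestamp $2=host $3=service $4=return code $5=plugin output
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Assumption: default Nagios XI command file location.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

line=$(format_passive_result "$(date +%s)" "wmin-aruba" "SNMP Traps" 0 "test trap")
echo "$line"

# Only write if the target is a named pipe this user may write to.
# Note: writing to a FIFO blocks until a reader (the nagios process) has it open.
if [ -p "$CMD_FILE" ] && [ -w "$CMD_FILE" ]; then
    printf '%s\n' "$line" > "$CMD_FILE"
    echo "submitted to $CMD_FILE"
else
    echo "cannot write to $CMD_FILE as $(id -un)"
fi
```

Running it as the same user the trap handler runs as should show whether the failure is a permissions problem or something inside Nagios itself.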
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Problems after 2014 Upgrade

Post by abrist »

What are the permissions on the command pipe?

Code:

ls -la /usr/local/nagios/var/rw/
ls -lad /usr/local/nagios/var/rw/
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Re: Problems after 2014 Upgrade

Post by arnab.roy »

Here you go:

ls -la /usr/local/nagios/var/rw/
total 16
drwxrwsr-x 2 nagios nagcmd 4096 Jul 24 21:16 .
drwxrwxr-x 6 nagios nagios 4096 Jul 24 22:57 ..
prw-rw---- 1 nagios nagcmd 0 Jul 24 22:57 nagios.cmd
srw-rw---- 1 nagios nagcmd 0 Jul 24 21:16 nagios.qh
-rw-rw-r-- 1 nagios nagcmd 7137 Mar 14 16:04 nsca.dump


ls -lad /usr/local/nagios/var/rw/

drwxrwsr-x 2 nagios nagcmd 4096 Jul 24 21:16 /usr/local/nagios/var/rw/
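
The same check can be scripted: the command file should be a named pipe (the leading `p` in the listing) and its mode should grant group write (the `rw-` in the second triplet). This is a minimal sketch under those assumptions; check_pipe is a made-up helper name and the path is the one from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a FIFO whose mode grants
# group write, judged from the mode string printed by "ls -ld".
check_pipe() {
    [ -p "$1" ] || { echo "$1: not a named pipe"; return 1; }
    mode=$(ls -ld "$1" | cut -c1-10)
    case "$mode" in
        p????w*) echo "$1: ok ($mode)" ;;          # 6th character is group write
        *)       echo "$1: group cannot write ($mode)"; return 1 ;;
    esac
}

CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"
check_pipe "$CMD_FILE" || echo "check the type and permissions of $CMD_FILE"
```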
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Problems after 2014 Upgrade

Post by abrist »

Well, dang. Those look fine. How about groups?

Code:

grep nag /etc/group
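
Since nagios.cmd is owned by group nagcmd, the group check can also be done per user. A rough sketch follows: in_group is a made-up helper, and the user list (nagios, apache) is only an assumption about which accounts need to write to the pipe, so adjust it to whatever user your trap handler actually runs as.

```shell
#!/bin/sh
# Hypothetical helper: succeed if user $1 is a member of group $2.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fqx -- "$2"
}

# Assumption: these are the accounts that need to write to nagios.cmd.
for user in nagios apache; do
    if in_group "$user" nagcmd; then
        echo "$user: member of nagcmd"
    else
        echo "$user: NOT a member of nagcmd (or no such user)"
    fi
done
```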
arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Re: Problems after 2014 Upgrade

Post by arnab.roy »

I think I found the answer myself in another thread (http://support.nagios.com/forum/viewtop ... 16&t=27376). I will do some more testing and confirm whether I am seeing the same issue.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Problems after 2014 Upgrade

Post by slansing »

Great, let us know. It sounds similar.
Locked