Bug with two nagios services starting

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Bug with two nagios services starting

Post by niebais »

OS: Centos 5.5
Version: Nagios XI 2009R1.4B

Problem:
This bug is showing up frequently enough that I need to report it so it can be looked at. I consider it rather serious.

We have many people working on the system now and applying changes all the time. The problem now is that we keep getting duplicate nagios parents running on our system.

Duplication:
All we need to do is change something inside Nagios and restart it. Then I run this command:
ps -ef | grep "/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg" | grep -v grep

It will display something like this:
nagios 27657 1 3 09:43 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
with a ton of children

and
nagios 1024 1 3 09:43 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
with its children

This causes the "process was orphaned" problem in the /usr/local/nagios/var/nagios.log file. Also, we noticed that we cannot view any notifications from yesterday (3-22 0:00:00 - 3-23 0:00:00), which we think might be related to this issue.

Any ideas on how I can stop the duplicate nagios service from appearing?
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Bug with two nagios services starting

Post by niebais »

Would this have anything to do with it? I stopped and started the nagios service with this command earlier this week (as root): service nagios restart.
tonyyarusso
Posts: 1128
Joined: Wed Mar 03, 2010 12:38 pm
Location: St. Paul, MN, USA

Re: Bug with two nagios services starting

Post by tonyyarusso »

niebais wrote:We have many people working on the system now
Could you define "many"? I'm curious whether there's a flaw in the process locking logic or some kind of race condition.
niebais wrote:Would this have anything to do with it? I stopped and started the nagios service with this command earlier this week (as root): service nagios restart.
No, that should honor the same locking system as everything else.
Tony Yarusso
Technical Services
___
TIES
Web: http://ties.k12.mn.us/
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Bug with two nagios services starting

Post by niebais »

It's hard to tell, but during the day we probably have 20 or more people in the system. Currently we have 80 users who regularly log in, and 10 of those regularly make changes.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Bug with two nagios services starting

Post by mguthrie »

We had that issue a while back, but we posted a fix and haven't seen the issue for quite some time. Is this something that seems to be surfacing recently (since any particular upgrade)?

We'll do some investigating on this.


A system with that many active XI users is somewhat unique. As more of a favor, would you be willing to PM me with a system profile of your monitoring environment (hosts + services, hardware specs, any special system configurations)? We're trying to get a better sense of system capabilities and hardware requirements for XI.
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Bug with two nagios services starting

Post by niebais »

mguthrie wrote:We had that issue a while back, but we posted a fix and haven't seen the issue for quite some time. Is this something that seems to be surfacing recently (since any particular upgrade)?

We'll do some investigating on this.


A system with that many active XI users is somewhat unique. As more of a favor, would you be willing to PM me with a system profile of your monitoring environment (hosts + services, hardware specs, any special system configurations)? We're trying to get a better sense of system capabilities and hardware requirements for XI.
Yeah I can do that. What's the best way to get you that information?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Bug with two nagios services starting

Post by mguthrie »

Select the PM button next to my name to send me a personal message. Thanks!
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Bug with two nagios services starting

Post by niebais »

I'm still working on this one, but we created a nagios check to help us find out when this problem occurs.
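
For readers who want something similar, a check of this kind might look roughly like the sketch below. It is not the check niebais's team wrote (that was never posted), only a minimal illustration under stated assumptions: a daemon "parent" is taken to be any process whose parent PID is 1 and whose command line starts with the one shown in the first post. The function names count_parents, report, and check_nagios_parents are made up for this example; the exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```shell
#!/bin/sh
# Sketch of a Nagios plugin that flags duplicate Nagios daemon parents.
# Assumption: a daemon parent has PPID 1 and the command line quoted
# in the first post of this thread.
NAGIOS_CMD="/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg"

# Read "PPID COMMAND" lines (the layout of `ps -eo ppid,args`) on stdin
# and print how many of them look like Nagios daemon parents.
count_parents() {
    awk -v cmd="$NAGIOS_CMD" '
        $1 == 1 {
            line = $0
            sub(/^ *[0-9]+ +/, "", line)    # drop the PPID column
            if (index(line, cmd) == 1) n++  # command line starts with cmd
        }
        END { print n + 0 }'
}

# Turn a parent count into a plugin status line and exit code.
report() {
    count=$1
    if [ "$count" -eq 1 ]; then
        echo "OK - 1 nagios parent process"
        return 0
    elif [ "$count" -eq 0 ]; then
        echo "CRITICAL - no nagios parent process found"
        return 2
    else
        echo "CRITICAL - $count nagios parent processes found"
        return 2
    fi
}

# Hypothetical entry point: inspect the live process table.
check_nagios_parents() {
    report "$(ps -eo ppid,args | count_parents)"
}
```

To use it as a plugin, the script would end with a call such as `check_nagios_parents; exit $?`. One caveat: child processes that get re-parented to init after their parent dies would also be counted, which is arguably still worth an alert.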
rdedon
Posts: 578
Joined: Sat Nov 20, 2010 4:51 pm

Re: Bug with two nagios services starting

Post by rdedon »

Just send that info when you can, as we know you are addressing a few different issues.
Rene deDon
Technical Team
___
Nagios Enterprises, LLC
Web: http://www.nagios.com
niebais
Posts: 349
Joined: Tue Apr 13, 2010 2:15 pm

Re: Bug with two nagios services starting

Post by niebais »

Ok, here's what we have figured out so far:

The dual process seems to be related to people applying changes in nagios. It's possible that two people click the apply configuration button at roughly the same time. The other possibility is that the old nagios process isn't fully killed when changes are being applied.
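
If the first possibility is the cause, one generic way to keep two restarts from overlapping is to serialize them behind a lock. The sketch below is only an illustration of that idea, not how Nagios XI implements its own locking (which this thread does not describe). It relies on `mkdir` being atomic; the lock path and the function name with_apply_lock are hypothetical, and the exit code 75 is an arbitrary choice for "lock busy".

```shell
#!/bin/sh
# Sketch: allow only one apply/restart at a time using an atomic mkdir lock.
# LOCK_DIR is a hypothetical path chosen for this example.
LOCK_DIR="${LOCK_DIR:-/tmp/nagios-apply.lock}"

# Run the given command while holding the lock.
# Returns 75 (arbitrary) if someone else already holds it.
with_apply_lock() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another apply/restart is already in progress" >&2
        return 75
    fi
    rc=0
    "$@" || rc=$?        # run the wrapped command, remember its status
    rmdir "$LOCK_DIR"    # release the lock
    return "$rc"
}
```

A call would look like `with_apply_lock service nagios restart`. A real version would also need to handle stale locks (for example, if the script is killed before `rmdir` runs), which this sketch leaves out.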