OS: CentOS 5.5
Version: Nagios XI 2009R1.4B
Problem:
This particular bug is showing up frequently enough that I need to report it so it can be looked at. I consider this one to be rather serious.
We have many people working on the system now, applying changes all the time. The problem is that we keep getting duplicate Nagios parent processes running on our system.
Duplication:
All we need to do is make changes to something inside Nagios and give it a restart, then I run this command:
ps -ef | grep "/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg" | grep -v grep
It will display something like this:
nagios 27657 1 3 09:43 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
with a ton of children
and
nagios 1024 1 3 09:43 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
with its children
This causes the "process was orphaned" problem in the /usr/local/nagios/var/nagios.log file. Also, we noticed that we cannot view any notifications from yesterday (3-22 0:00:00 - 3-23 0:00:00), which we think might be related to this issue.
Any ideas on how I can stop the duplicate nagios service from appearing?
Bug with two nagios services starting
Re: Bug with two nagios services starting
Would this have anything to do with it? I stopped and started the nagios service with this command earlier this week (as root): service nagios restart.
tonyyarusso
Re: Bug with two nagios services starting
Quote: We have many people working on the system now
Could you define "many"? I'm curious whether there's a flaw in the process locking logic or some kind of race condition.
Quote: Would this have anything to do with it? I stopped and started the nagios service with this command earlier this week (as root): service nagios restart.
No, that should honor the same locking system as everything else.
Re: Bug with two nagios services starting
It's hard to tell, but during the day we probably have about 20 people in the system or more. Currently we have 80 users that regularly log in. 10 of those regularly make changes.
Re: Bug with two nagios services starting
We had that issue a while back, but we posted a fix and hadn't seen the issue for quite some time. Is this something that seems to be surfacing recently (since any particular upgrade)?
We'll do some investigating on this.
A system with that many active XI users is somewhat unique. As more of a favor, would you be willing to PM me with a system profile of your monitoring environment (hosts + services, hardware specs, any special system configurations)? We're trying to get a better sense of system capabilities and hardware requirements for XI.
Re: Bug with two nagios services starting
mguthrie wrote:
We had that issue a while back, but we posted a fix and hadn't seen the issue for quite some time. Is this something that seems to be surfacing recently (since any particular upgrade)?
We'll do some investigating on this.
A system with that many active XI users is somewhat unique. As more of a favor, would you be willing to PM me with a system profile of your monitoring environment (hosts + services, hardware specs, any special system configurations)? We're trying to get a better sense of system capabilities and hardware requirements for XI.
Yeah, I can do that. What's the best way to get you that information?
Re: Bug with two nagios services starting
Select the PM button next to my name to send me a personal message. Thanks!
Re: Bug with two nagios services starting
I'm still working on this one, but we created a nagios check to help us find out when this problem occurs.
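For anyone wanting something similar, here is a minimal sketch of what such a check could look like. It is not the actual check from this thread. It assumes a "parent" is a nagios -d process whose PPID is 1 (as in the ps output in the first post), and the function names and the --check flag are made up for illustration. It uses the standard plugin exit codes (0 = OK, 2 = CRITICAL).

```shell
#!/bin/sh
# Sketch of a Nagios-style check for duplicate Nagios parent daemons.
# Assumption: a "parent" is the nagios -d command running with PPID 1.

DAEMON_CMD="/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg"

# Reads "PPID ARGS" lines on stdin (the format of: ps -e -o ppid= -o args=)
# and prints how many lines are the daemon command with PPID 1.
count_nagios_parents() {
    awk -v cmd="$DAEMON_CMD" '
        {
            ppid = $1
            line = $0
            sub(/^[ \t]*[0-9]+[ \t]+/, "", line)   # drop the PPID column
            sub(/[ \t]+$/, "", line)               # drop trailing padding
            if (ppid == 1 && line == cmd) n++
        }
        END { print n + 0 }
    '
}

# Maps the count to a plugin status line and exit code.
report_status() {
    count=$1
    if [ "$count" -eq 1 ]; then
        echo "OK - 1 nagios parent process"
        return 0
    fi
    echo "CRITICAL - $count nagios parent processes (expected 1)"
    return 2
}

# Run the live check only when asked, so the functions can be sourced safely.
if [ "${1:-}" = "--check" ]; then
    report_status "$(ps -e -o ppid= -o args= | count_nagios_parents)"
    exit $?
fi
```

Saved as a script and run with --check, it goes CRITICAL whenever the count is anything other than one, which also catches the case where no parent is running at all.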
Re: Bug with two nagios services starting
Just send that info when you can as we know you are addressing a few different issues.
Re: Bug with two nagios services starting
Ok, here's what we have figured out so far:
The dual process seems to be related to when people are applying changes in Nagios. It's possible that two people click the apply configuration button at roughly the same time. The other possibility is that the old nagios process isn't fully killed when changes are being applied.
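To illustrate the first possibility, here is a rough sketch of how two simultaneous applies could be kept from overlapping with a lock directory. This is purely hypothetical and is not how XI handles it internally. The lock path and function names are invented for the example. It relies on mkdir either creating the directory or failing, so only one caller can hold the lock at a time.

```shell
#!/bin/sh
# Hypothetical sketch: serialize "apply configuration" runs with a lock directory.

LOCK_DIR="${LOCK_DIR:-/tmp/nagios_apply.lock}"   # hypothetical path

# Succeeds only for the first caller; mkdir fails if the directory exists.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Runs the given command only if no other apply holds the lock,
# and passes the command's exit status back to the caller.
run_exclusively() {
    if ! acquire_lock; then
        echo "another apply is already running; skipping" >&2
        return 1
    fi
    "$@"
    status=$?
    release_lock
    return "$status"
}

# Example (not executed here): run_exclusively service nagios restart
```

A second caller that arrives while the first is still restarting gets turned away instead of starting another daemon. A real version would also need to deal with a stale lock left behind by a crashed run.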