Re: Ghost hosts

Posted: Wed Aug 04, 2010 10:19 am
by mmestnik
We are still collecting information from you, so please be patient. We don't strictly need to reproduce your issue, but obviously that would be of great use. As we have not yet been able to reproduce it, we can only ask for your assistance.

You have found the Nagios Core interface; it's the last image, the one with the black navbar on the left.

You indicate that the Nagios Core server is not reloading its configuration. What effect does restarting Nagios from the administrative web interface have? (On the home page there is a column of, hopefully, green dots; the top one should have a menu to manage it.) What about something like 'service nagios stop; service nagios start;' from a shell? If these steps fail, you should try other shell commands. Among others (make your own), try these.

Code:

ps 1 $(pgrep nagios); ps 1 $(pgrep -f nagios);
# kill $(pgrep nagios); # Uncomment this one, when you are ready.
service nagios stop; service nagios start;
I've updated the FAQ on this issue because there was a step missing. You can re-read the CCM issues FAQ or this new one.
Do these first

Re: Ghost hosts

Posted: Thu Aug 05, 2010 9:25 am
by awatch
It looks like the issue was caused by nagios processes that never shut down cleanly.
Running ps -e | grep nagios showed about 15 nagios processes, some of which were defunct.
kill $(pgrep nagios) did the trick.
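
For anyone repeating the check, here is a minimal sketch, assuming a procps-style ps that accepts -eo with the header-suppressing columns pid=,stat=,comm=. The helper name summarize_nagios_procs is made up for illustration. It counts the processes whose command name is nagios and how many of them are in the Z (defunct) state.

```shell
# Hypothetical helper: reads "pid stat comm" lines on stdin and reports
# how many nagios processes exist and how many are defunct (state Z).
summarize_nagios_procs() {
    awk '$3 == "nagios" { total++; if ($2 ~ /^Z/) defunct++ }
         END { printf "total=%d defunct=%d\n", total, defunct }'
}

# Usage against the live process table (assumes procps ps):
#   ps -eo pid=,stat=,comm= | summarize_nagios_procs
```

A total far above what you expect, or any defunct count above zero, suggests leftovers from an earlier instance.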

Thank you!

Re: Ghost hosts

Posted: Thu Aug 05, 2010 1:39 pm
by mmestnik
I've seen this before in Nagios 2.x. I'll add this to the list of fixes; perhaps many of the Ghost Host issues were related to this.

A reboot would have fixed this (that is correct, intended behavior), but rebooting always breaks much more than it fixes on Unix systems.

Re: Ghost hosts

Posted: Thu Aug 05, 2010 6:28 pm
by admin
It sounds like this could also be due to a second (older) nagios process that is running with an old config. If this is the case, there are two copies of Nagios Core running: one with the old config, one with the new. They'll fight each other when updating the current status information, so sometimes you'll see a host/service, sometimes you won't.
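
One rough way to spot two copies is sketched below. It assumes that a daemonized nagios process is reparented to PID 1, while the children it forks for checks have the daemon as their parent. That is a heuristic, not something Nagios guarantees, and orphaned children of a dead daemon would be listed too. The function name list_nagios_daemons is hypothetical.

```shell
# Hypothetical helper: reads "pid ppid comm" lines on stdin and prints the
# PID of every nagios process whose parent is PID 1 (assumed to be daemons).
list_nagios_daemons() {
    awk '$3 == "nagios" && $2 == 1 { print $1 }'
}

# Usage against the live process table (assumes procps ps):
#   ps -eo pid=,ppid=,comm= | list_nagios_daemons
```

If this prints more than one PID, it is worth checking whether an older instance is still running alongside the current one.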

If you catch this when it happens, can you please provide the following information to us:

1. Run the following commands to get a list of all config files and their timestamps:

Code:

ls -al /usr/local/nagios/etc/services/*.cfg
ls -al /usr/local/nagios/etc/hosts/*.cfg
2. Let us know if the timestamp of the config file that contains the "ghost" host/service is older than the rest of the files. This could indicate there is a problem somewhere in the NagiosQL logic.
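
To make the timestamp comparison quicker, here is a small sketch that prints the oldest .cfg file in a directory. The function name oldest_cfg is made up for illustration, and it assumes file names without newlines, since it reads the output of ls.

```shell
# Hypothetical helper: print the least recently modified *.cfg file in a directory.
oldest_cfg() {
    # ls -tr sorts by modification time, oldest first; head keeps the first entry.
    ls -tr "$1"/*.cfg 2>/dev/null | head -n 1
}

# Usage with the directories from the commands above:
#   oldest_cfg /usr/local/nagios/etc/hosts
#   oldest_cfg /usr/local/nagios/etc/services
```

If the file it reports is the one that defines the "ghost" host/service, that matches the pattern described in step 2.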

Re: Ghost hosts

Posted: Fri Aug 06, 2010 8:23 am
by awatch
What I believe was causing my particular issue was that some time ago, while applying configuration settings, a second nagios process was spawned with the then-current configuration files. Some time passed, edits were made to the configuration files, and when the changes were pushed through, another nagios process was spawned for some reason. So, even though the old configuration files were gone, the nagios process that was running with information from those files was still active. So, yes, you're definitely right about what the issue was.

Of course a reboot would have fixed it, but as said before it tends to break more than it fixes with unix systems. :\

Thanks again!