Re: remove host from hostgroup but it still gets service che
Posted: Wed Apr 29, 2015 3:59 pm
Thank you for your diligence and corroboration on how nagios should run (as long as the user is actually doing what they say they are doing).
I did find the issue, and I believe you asked me about it earlier in this thread: Nagios was _not_ being restarted. Somehow (or thanks to someone), the pid file (in our case: NagiosRunFile=/var/nagios/nagios.pid) did not exist.
I had been using 'restart' in our scripts and also tried 'reload' and 'force-reload' in my debugging efforts.
I've checked the init.d startup file (at least the one generated from an rpm-built nagios) and can see why it does not restart when the pid file is absent. The same applies to reload and force-reload.
I will be building some defenses against such potential issues. FWIW, I'll be doing a stop and then a start. If I don't get a good result from stop, I'll first rebuild the pid file from a pgrep of the controller process (this one: nagios 1410 1 0 13:00 ? 00:00:23 /usr/bin/nagios -d /etc/nagios/nagios.cfg) and then call stop again. Should that not give joy, I'll do an xargs -n1 kill -15 and then start nagios.
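To make that plan concrete, here is a rough sketch of what such a wrapper might look like. The pid file path and daemon command line are the ones quoted above; the init script location, the function names, and the assumption that 'stop' returns non-zero when the pid file is missing are mine and would need checking against the real init script. It uses a plain loop where I mentioned xargs -n1, which should amount to the same thing.

```shell
#!/bin/sh
# Sketch only. Function names are hypothetical; INITSCRIPT's path is assumed.
PIDFILE="${PIDFILE:-/var/nagios/nagios.pid}"
INITSCRIPT="${INITSCRIPT:-/etc/init.d/nagios}"
NAGIOS_CMD="${NAGIOS_CMD:-/usr/bin/nagios -d /etc/nagios/nagios.cfg}"

# Print the PID of the controller process: the daemon whose parent is PID 1.
find_controller_pid() {
    pgrep -P 1 -f "$NAGIOS_CMD" | head -n 1
}

# Recreate the pid file from the running controller; fail if none is found.
rebuild_pidfile() {
    ctl_pid=$(find_controller_pid)
    [ -n "$ctl_pid" ] || return 1
    echo "$ctl_pid" > "$PIDFILE"
}

# stop, then start; on a failed stop, rebuild the pid file and retry,
# and as a last resort send SIGTERM (never -9) to every matching process.
safe_restart() {
    if ! "$INITSCRIPT" stop; then
        if ! { rebuild_pidfile && "$INITSCRIPT" stop; }; then
            for p in $(pgrep -f "$NAGIOS_CMD"); do
                kill -15 "$p"
            done
        fi
    fi
    "$INITSCRIPT" start
}
```

Calling safe_restart from the HA scripts would replace the bare 'restart'. A real version would probably also want to wait for the old processes to exit before calling start.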
I'm avoiding kill -9 for obvious reasons, and -15 works fine. BTW, I'm running this under an HA configuration that's a bit different in that it tries really hard to keep nagios on the original master node. Between that and the other things we do to remove/add nodes in the nagios config, we might be creating our own problem if we end up losing the pid file from 'too many heads in the soup'. Ordinarily I like to keep the HA controls to 1 head.
Would you suggest using reload instead of restart when config file changes are the only changes we are making? (We have our own built-in 'sanity check': essentially we run the 'pre-flight' check after any config file change, prior to restarting nagios.)
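For reference, the sanity-check gate is roughly the following. This is a simplified sketch, not our actual script: the function name and the init script path are placeholders, and it only relies on nagios -v (the config verification run) returning non-zero when the config is bad.

```shell
#!/bin/sh
# Sketch only. checked_apply is a hypothetical name; INITSCRIPT's path is assumed.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
INITSCRIPT="${INITSCRIPT:-/etc/init.d/nagios}"

# Run the pre-flight check, and only then hand the action to the init script.
checked_apply() {
    action="${1:-reload}"
    if ! "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        echo "config check failed; not running $action" >&2
        return 1
    fi
    "$INITSCRIPT" "$action"
}
```

The action is a parameter, so the same gate works whether the answer turns out to be 'checked_apply reload' or 'checked_apply restart'.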
Again, thanks very much for your time and invaluable assistance, jdalrymple. I apologize for not confirming the 'restart' behavior after you mentioned that potential issue in a prior post in this topic.
The changes we were making "did work before and did not work now and we changed nothing". Always a head scratcher.