Multiple Nagios Processes
Posted: Wed Nov 04, 2015 12:42 am
by Fred Kroeger
Since upgrading to NagiosXI 5 R2.0, I've noticed that on the odd occasion the old nagios process is not killed when we apply a new config.
It happened again today.
Code:
# ps -ef | grep nagios.cfg
nagios 6687 1 4 10:10 ? 00:08:48 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 6757 6687 0 10:10 ? 00:00:05 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 15447 1 6 Nov02 ? 03:23:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 15516 15447 0 Nov02 ? 00:01:44 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
The only reason I notice is that Mod-Gearman throws up a couple of hundred Host Down Critical alerts with the message:
Code:
Info: (host check orphaned, is the mod-gearman worker on queue hostgroup_XXX running?)
So far there has been no pattern to when the old process won't die. It last happened a week ago, and we have applied at least 3-4 configs a day since then.
I have not noticed this on any of the previous versions of NagiosXI. Are there any known reasons why the nagios service won't die?
Regards.. Fred
Re: Multiple Nagios Processes
Posted: Wed Nov 04, 2015 10:12 am
by jdalrymple
The init script doesn't try terribly hard to kill off nagios when long-running plugins fail to finish in a timely manner. The best fix is to replace the stop section of your init script with the following, which sends TERM, waits up to 91 seconds, and then falls back to KILL:
Code:
stop)
    echo -n "Stopping nagios:"

    pid_nagios
    killproc_nagios TERM

    # now we have to wait for nagios to exit and remove its
    # own NagiosRunFile, otherwise a following "start" could
    # happen, and then the exiting nagios will remove the
    # new NagiosRunFile, allowing multiple nagios daemons
    # to (sooner or later) run - John Sellens
    #echo -n 'Waiting for nagios to exit .'
    for i in {1..91} ; do
        if status_nagios > /dev/null; then
            echo -n '.'
            sleep 1
        else
            break
        fi
    done
    if status_nagios > /dev/null; then
        echo ''
        echo 'Warning - nagios did not exit in a timely manner - killing forcefully'
        killproc_nagios KILL
    else
        echo ' done.'
    fi

    rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
    ;;
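For anyone who wants to try the TERM, wait, then KILL logic outside the init script, here is a minimal standalone sketch of the same pattern. The function name `stop_with_timeout` and its arguments are made up for illustration and are not part of the Nagios init script. It assumes you run it as root or as the owner of the process, since `kill -0` also fails when you lack permission to signal it.

```shell
#!/bin/bash
# Hypothetical helper, not part of the stock Nagios init script.
# Usage: stop_with_timeout PID [TIMEOUT_SECONDS]
# Returns 0 if the process exited after TERM, 1 if KILL was needed.
stop_with_timeout() {
    local pid="$1" timeout="${2:-90}" waited=0

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    # Ask politely first so the daemon can clean up its own files.
    kill -TERM "$pid" 2>/dev/null || true

    # Poll once per second until the process exits or the timeout expires.
    while [ "$waited" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # One last check before resorting to force.
    kill -0 "$pid" 2>/dev/null || return 0

    echo "Warning - process $pid did not exit within ${timeout}s - killing forcefully" >&2
    kill -KILL "$pid" 2>/dev/null || true
    return 1
}
```

Something like `stop_with_timeout 15447 90` would then mirror what the stop section above does for a single daemon PID. The non-zero return on a forced kill is a design choice of this sketch, so a caller can log that the daemon had to be killed.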
Re: Multiple Nagios Processes
Posted: Wed Nov 04, 2015 10:30 pm
by Fred Kroeger
Thanks for this. Will it be added to the Nagios distribution so that these changes don't get blown away with any future updates?
F
Re: Multiple Nagios Processes
Posted: Thu Nov 05, 2015 2:58 am
by WillemDH
I wrote a check for this, as we also hit this problem once every 3-4 weeks.
Code:
#!/bin/bash
# Script name: check_nagios_instances.sh
# Version: v0.6.2
# Author: Willem D'Haese, Michiel Detailleur
# Created on: 02/06/2015
# Purpose: Bash nagios plugin to alert if multiple instances of nagios are running in order to prevent orphaned checks
# Recent history:
# 02/06/2015 => Creation date
# Copyright:
# This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published
# by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed
# in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public
# License along with this program. If not, see <http://www.gnu.org/licenses/>.
# Count top-level nagios daemons (those whose parent is PID 1).
NumNagiosProc=$(/usr/bin/pgrep -P 1 nagios | wc -l)
if (( NumNagiosProc < 1 )); then
    echo "CRITICAL: No Nagios processes found."
    exit 2
elif (( NumNagiosProc > 1 )); then
    echo "CRITICAL: $NumNagiosProc Nagios processes found. Please execute 'killall -9 nagios && service nagios restart'"
    exit 2
else
    echo "OK: One Nagios process found."
    exit 0
fi
Will try the proposed fix from JR.
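If you would rather not `killall -9` everything, a rough sketch like the one below lists only the older top-level daemons so they can be stopped individually. The function name `stale_daemons` is hypothetical, and the sketch assumes that the most recently started daemon is the one to keep, which matches the `ps` output in the first post but may not hold on every system. It relies on the `-P`, `-n` and `-x` options of `pgrep`.

```shell
#!/bin/bash
# Hypothetical helper: print the PIDs of all but the newest process
# called NAME whose parent is PARENT_PID (default 1, i.e. daemons).
# Usage: stale_daemons NAME [PARENT_PID]
stale_daemons() {
    local name="$1" parent="${2:-1}" newest

    # pgrep -n selects the most recently started match; no match means nothing is stale.
    newest=$(pgrep -P "$parent" -n -x "$name") || return 0

    # Everything else with the same name and parent is assumed to be left over.
    pgrep -P "$parent" -x "$name" | grep -v -x -- "$newest" || true
}

# Example (commented out on purpose, review the PIDs before killing anything):
# for pid in $(stale_daemons nagios); do kill -TERM "$pid"; done
```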
Re: Multiple Nagios Processes
Posted: Thu Nov 05, 2015 9:49 am
by jdalrymple
Fred Kroeger wrote: Thanks for this. Will it be added to the Nagios distribution so that these changes don't get blown away with any future updates?
F
My fix is just that - *my* fix. It's pretty barbaric, but at the same time I'm not a programmer and don't know any magic to go back in and kill off the lagging subprocesses/workers so that Nagios can die peacefully. This init script needs work, and we also need to develop a solid systemd script for future EL versions. I have one of those too, but again it's *mine* and I'd never package it into software that I charged money for.
All that said, I'll shoot my two-line patch over to the devs and see if it's something they care to make part of the outgoing production code. As hacky as it is, it still beats having multiple nagios daemons running.
Re: Multiple Nagios Processes
Posted: Fri Nov 06, 2015 2:00 am
by Fred Kroeger
Thanks for sharing your fix. I hope the devs take it up and improve the init script.
Re: Multiple Nagios Processes
Posted: Fri Nov 06, 2015 12:02 pm
by rkennedy
I have created a bug report for this at https://github.com/NagiosEnterprises/na ... /issues/89 - feel free to add any input you think would help the devs.
As a bug report has been filed and a quick fix posted, do you mind if I close this thread?
Re: Multiple Nagios Processes
Posted: Mon Nov 09, 2015 12:45 am
by Fred Kroeger
All good to close.