Multiple Nagios Processes

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
Fred Kroeger
Posts: 588
Joined: Wed Oct 19, 2011 11:36 pm
Location: Perth, Western Australia

Multiple Nagios Processes

Post by Fred Kroeger »

I've noticed that since upgrading to Nagios XI 5 R2.0, the old nagios process is occasionally not killed when we apply a new config.
It happened again today.

Code:

# ps -ef | grep nagios.cfg
nagios    6687     1  4 10:10 ?        00:08:48 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    6757  6687  0 10:10 ?        00:00:05 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   15447     1  6 Nov02 ?        03:23:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   15516 15447  0 Nov02 ?        00:01:44 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

The only reason I notice is that Mod-Gearman throws up a couple of hundred Host Down critical alerts with the message:

Code:

Info: (host check orphaned, is the mod-gearman worker on queue hostgroup_XXX running?)

So far there has been no pattern to when the old process won't die. It last happened a week ago, and we have applied at least 3-4 configs a day since then.
I did not see this on any previous version of Nagios XI. Are there any known reasons why the nagios service won't die?

Regards.. Fred
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Multiple Nagios Processes

Post by jdalrymple »

The init script doesn't try terribly hard to kill off nagios in the event that long-running plugins fail to finish in a timely manner. Best fix is to replace the stop section of your init file:

Code:

	stop)
		echo -n "Stopping nagios:"

		pid_nagios
		killproc_nagios TERM

		# now we have to wait for nagios to exit and remove its
		# own NagiosRunFile, otherwise a following "start" could
		# happen, and then the exiting nagios will remove the
		# new NagiosRunFile, allowing multiple nagios daemons
		# to (sooner or later) run - John Sellens
		#echo -n 'Waiting for nagios to exit .'
		for i in {1..91} ; do
			if status_nagios > /dev/null; then
				echo -n '.'
				sleep 1
			else
				break
			fi
		done
		if status_nagios > /dev/null; then
			echo ''
			echo 'Warning - nagios did not exit in a timely manner - killing forcefully'
			killproc_nagios KILL
		else
			echo ' done.'
		fi

		rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
		;;
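
The stop section above follows a common pattern: send TERM, poll for a bounded time, and only then send KILL. A minimal standalone sketch of that pattern is below. The function name `stop_with_timeout` is hypothetical, and it uses `kill -0` as an existence test in place of the init script's own `status_nagios` and `killproc_nagios` helpers, so treat it as an illustration rather than a drop-in replacement.

```shell
#!/bin/bash
# Sketch only: stop_with_timeout is a hypothetical helper, not part of Nagios.
# Usage: stop_with_timeout PID [TIMEOUT_SECONDS]
# Returns 0 if the process exited after TERM, 1 if KILL had to be sent.
stop_with_timeout() {
    local pid=$1
    local timeout=${2:-90}
    local waited=0

    # Polite request first; if the PID is already gone there is nothing to do.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 delivers no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid still running after ${timeout}s, sending KILL" >&2
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, `stop_with_timeout "$(cat /path/to/nagios.lock)" 90` would mirror the roughly 90-second wait in the stop section, where the lock file path is a placeholder for whatever `$NagiosRunFile` points to on your system.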
Fred Kroeger

Re: Multiple Nagios Processes

Post by Fred Kroeger »

Thanks for this. Will it be added to the Nagios distribution so that these changes don't get blown away with any future updates?

F
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: Multiple Nagios Processes

Post by WillemDH »

I made a check for this, as we also hit this problem once every 3-4 weeks.

Code:

#!/bin/bash

# Script name: check_nagios_instances.sh
# Version: v0.6.2
# Author: Willem D'Haese, Michiel Detailleur
# Created on: 02/06/2015
# Purpose: Bash nagios plugin to alert if multiple instances of nagios are running in order to prevent orphaned checks
# Recent history:
#       02/06/2015 => Creation date
# Copyright:
#       This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published
#       by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed
#       in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
#       PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public
#       License along with this program.  If not, see <http://www.gnu.org/licenses/>.

# Count nagios processes whose parent is init (PID 1), i.e. the daemon masters.
NumNagiosProc=$(/usr/bin/pgrep -P 1 nagios | wc -l)

if (( NumNagiosProc < 1 )); then
        echo "CRITICAL: No Nagios processes found."
        exit 2
elif (( NumNagiosProc > 1 )); then
        echo "CRITICAL: $NumNagiosProc Nagios processes found. Please execute 'killall -9 nagios && service nagios restart'"
        exit 2
else
        echo "OK: One Nagios process found."
        exit 0
fi
Will try the proposed fix from JR.
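
The check above recommends `killall -9 nagios`, which also takes down the healthy daemon. One possible refinement is to pick out only the older master process, as in the rough sketch below. The helper names `stale_pids` and `nagios_master_ages` are hypothetical, and the `etimes` column is an assumption that your `ps` comes from a reasonably recent procps-ng; older EL releases may not have it.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical, not part of Nagios or the plugin above.

# Read "PID ELAPSED_SECONDS" pairs on stdin and print every PID except
# the most recently started one (the one with the smallest elapsed time).
stale_pids() {
    sort -k2,2n | awk 'NR > 1 { print $1 }'
}

# Print "PID ELAPSED_SECONDS" for each nagios process whose parent is PID 1.
# Assumes pgrep and a ps that supports the etimes column.
nagios_master_ages() {
    local pids
    pids=$(pgrep -d, -P 1 -x nagios) || return 1
    ps -o pid= -o etimes= -p "$pids"
}
```

Running `nagios_master_ages | stale_pids` would then list only the leftover masters. With the two masters from the first post, the older one (15447, started Nov02) is the PID that should be printed.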
Nagios XI 5.8.1
https://outsideit.net
jdalrymple

Re: Multiple Nagios Processes

Post by jdalrymple »

Fred Kroeger wrote: Thanks for this. Will it be added to the Nagios distribution so that these changes don't get blown away with any future updates?

F
My fix is just that - *my* fix. It's pretty barbaric, but at the same time I'm not a programmer and don't know any magic to go back in and kill off the lagging subprocesses/workers so that Nagios can die peacefully. This init script needs work, and we also need to develop a solid systemd script for future EL versions. I have one of those too, but again it's *mine* and I'd never package it into software that I charged money for :D

All that said, I'll shoot my two-line patch over to the devs and see if it's something they care to make part of the outgoing production code. As hacky as it is, it still beats having multiple nagios daemons running.
Fred Kroeger

Re: Multiple Nagios Processes

Post by Fred Kroeger »

Thanks for sharing your fix. I hope the devs take it up and improve the init script.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Multiple Nagios Processes

Post by rkennedy »

I have created a bug report for this @ https://github.com/NagiosEnterprises/na ... /issues/89 - feel free to add any input you think would help the devs.

As a bug report has been filed and a quick fix posted, do you mind if I close this thread?
Former Nagios Employee
Fred Kroeger

Re: Multiple Nagios Processes

Post by Fred Kroeger »

All good to close