Nagios 4.2.0 reload stop nagios (bug?)

Posted: Thu Aug 04, 2016 10:51 am
by briancolman
Hi everyone,
Yesterday I updated my installation of Nagios Core to 4.2.0. Since then, when we use reload with the init script, Nagios stops, and it stops in such a way that it can't be started again until we run stop with the init script. There are no errors in the log, only the shutdown sequence.
Does anyone else have the same issue?

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Thu Aug 04, 2016 1:42 pm
by tgriep
Can you run the following command to verify the configuration files and post the errors here?

Code: Select all

/usr/local/nagios/bin/nagios -vv /usr/local/nagios/etc/nagios.cfg
Can you post how you are starting, reloading and stopping the nagios process and the output when you try doing that?
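
One way to collect that output in a single paste is a small wrapper that echoes each command, its output, and its exit status. This is only a sketch: `capture_step` is a hypothetical helper name, not part of Nagios, and the `service nagios ...` calls in the commented usage assume the init script is installed as /etc/init.d/nagios.

```shell
#!/bin/sh
# capture_step is a hypothetical helper (not part of Nagios).
# It prints the command line, then the command's combined stdout/stderr,
# then its exit status, so the whole sequence can be pasted into a post.
capture_step() {
    printf '$ %s\n' "$*"
    if "$@" 2>&1; then
        rc=0
    else
        rc=$?
    fi
    printf '(exit status: %s)\n' "$rc"
    return 0
}

# Example usage (commented out; needs root and an installed init script):
# {
#     capture_step service nagios status
#     capture_step service nagios reload
#     sleep 5
#     capture_step service nagios status
# } > /tmp/nagios-reload-report.txt
```

The sleep between reload and the second status call is there because the init script prints "done" as soon as the signal is sent, before Nagios has finished restarting.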

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Tue Aug 09, 2016 4:35 am
by jpcozar
Hi,
Same problem here after upgrading from Nagios Core 4.1.1 to Nagios 4.2.0
Nagios is running nicely:

Code: Select all

ps -ef|grep nagios

nagios   10074     1  0 ago08 ?        00:04:17 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   10075 10074  0 ago08 ?        00:00:23 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   10076 10074  0 ago08 ?        00:00:23 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   10077 10074  0 ago08 ?        00:00:23 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   10078 10074  0 ago08 ?        00:00:23 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   10079 10074  0 ago08 ?        00:00:20 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   10080 10074  0 ago08 ?        00:00:23 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
The init script agrees:

Code: Select all

sudo service nagios status
nagios (pid 10074) is running...
Now I make a change to a *.cfg file and verify that the configuration is OK:

Code: Select all

/usr/local/nagios/bin/nagios -vv /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.2.0
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-01-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/windows_CEMCOR.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/windows_SAE.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch_RED_ES_CEICE.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch_RED_ES_SAE.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch_CEMCOR.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch_SAE.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/switch_CEIC.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/accesspoints.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/router.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/ups.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/nas.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/terminal.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/windows-virtuales.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/windows-gefoc.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/vmware.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/linux.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/libreria.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/qmatic_SAE.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/SAC_NemoQ.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
	Checked 738 services.
	Checked 174 hosts.
	Checked 27 host groups.
	Checked 7 service groups.
	Checked 2 contacts.
	Checked 1 contact groups.
	Checked 69 commands.
	Checked 6 time periods.
	Checked 0 host escalations.
	Checked 0 service escalations.
Checking for circular paths...
	Checked 174 hosts
	Checked 0 service dependencies
	Checked 0 host dependencies
	Checked 6 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
So I reload nagios:

Code: Select all

 sudo service nagios reload
Running configuration check...
Reloading nagios configuration...
done
Everything seems OK, but when I check the status:

Code: Select all

sudo service nagios status
nagios is not running
And there is no associated PID... nagios has been killed and I have to start it manually again.

Code: Select all

ps -ef|grep nagios
jpcozar   4477 29264  0 11:33 pts/5    00:00:00 grep --color=auto nagios
What's wrong with reloading Nagios in version 4.2.0? Thank you in advance.
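
The check behind that status output can be reproduced by hand: read the PID from the lock file and test whether that process still exists. A minimal sketch, assuming the source-install lock file path /usr/local/nagios/var/nagios.lock; `lockfile_pid_alive` is a hypothetical helper, not something Nagios ships.

```shell
#!/bin/sh
# lockfile_pid_alive is a hypothetical helper (not part of Nagios).
# Returns 0 if the PID on the first line of the given lock file belongs
# to a process this user can signal, 1 otherwise.
lockfile_pid_alive() {
    lockfile=$1
    [ -f "$lockfile" ] || return 1
    pid=$(head -n 1 "$lockfile")
    # Reject empty or non-numeric content before handing it to kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only checks that the process exists.
    # It also fails for a process owned by another user unless run as
    # root, so run this as root or as the nagios user.
    kill -0 "$pid" 2>/dev/null
}

# Assumed default path for a source install of Nagios Core.
if lockfile_pid_alive /usr/local/nagios/var/nagios.lock; then
    echo "nagios is running"
else
    echo "nagios is not running"
fi
```

Running this a few seconds after a reload shows whether the daemon survived, independently of what the init script reports.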

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Tue Aug 09, 2016 11:32 am
by tgriep
Can you login to the Nagios server as root, run the following commands and post the output?

Code: Select all

ls -l /usr/local/nagios/sbin
ls -l /usr/local/nagios/var
head -n 1 /usr/local/nagios/var/nagios.lock
Also, post the /etc/init.d/nagios file here so we can view it.

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Tue Aug 09, 2016 8:07 pm
by Box293
Did you also upgrade to the latest version of NDO? Core 4.2.0 won't load without this.

https://support.nagios.com/forum/viewto ... hilit=+ndo

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Wed Aug 10, 2016 2:16 am
by jpcozar
Yes. NDO was updated from 2.0.1 to 2.1.0 because Nagios 4.2.0 wouldn't run otherwise, and that was clearly shown in nagios.log.
Now the problem is only with reloading; starting and stopping work fine.
Indeed, today I did a reload without making any configuration change and nagios stopped :-(
nagios.log shows only this:

Code: Select all

[1470813111] Caught SIGSEGV, shutting down...
My OS is Ubuntu 14.04.5 LTS, fully updated, if that helps.
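
The bracketed number at the start of each nagios.log line is a Unix epoch timestamp, so it can be converted to a readable time to line the crash up with the reload. A small sketch, assuming GNU `date` (the `-d @SECONDS` form), which Ubuntu provides; `nagios_log_time` is a hypothetical helper name.

```shell
#!/bin/sh
# nagios_log_time is a hypothetical helper (not part of Nagios).
# It converts the leading [epoch] of a nagios.log line to a UTC time.
# Assumes GNU date, which accepts "-d @SECONDS".
nagios_log_time() {
    line=$1
    epoch=${line#\[}      # strip the leading "["
    epoch=${epoch%%\]*}   # strip everything from the first "]" onward
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'
}

nagios_log_time '[1470813111] Caught SIGSEGV, shutting down...'
```

For the log line above this prints 2016-08-10 07:11:51 UTC.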

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Wed Aug 10, 2016 2:25 am
by jpcozar
tgriep wrote:Can you login to the Nagios server as root, run the following commands and post the output?

Code: Select all

ls -l /usr/local/nagios/sbin
ls -l /usr/local/nagios/var
head -n 1 /usr/local/nagios/var/nagios.lock
Also, post the /etc/init.d/nagios file here so we can view it.
Here is the info that was asked for:

ls -l /usr/local/nagios/sbin

Code: Select all

ls -l /usr/local/nagios/sbin
total 5184
-rwxrwxr-x 1 nagios nagcmd 320616 ago  4 11:16 archivejson.cgi
-rwxrwxr-x 1 nagios nagcmd 306008 ago  4 11:16 avail.cgi
-rwxrwxr-x 1 nagios nagcmd 300336 ago  4 11:16 cmd.cgi
-rwxrwxr-x 1 nagios nagcmd 273160 ago  4 11:16 config.cgi
-rwxrwxr-x 1 nagios nagios  17272 ago  6  2010 downtime_sched.cgi
-rwxrwxr-x 1 nagios nagcmd 314168 ago  4 11:16 extinfo.cgi
-rwxrwxr-x 1 nagios nagcmd 265160 ago  4 11:16 histogram.cgi
-rwxrwxr-x 1 nagios nagcmd 244536 ago  4 11:16 history.cgi
-rwxrwxr-x 1 nagios nagcmd 244528 ago  4 11:16 notifications.cgi
-rwxrwxr-x 1 nagios nagcmd 322408 ago  4 11:16 objectjson.cgi
-rwxrwxr-x 1 nagios nagcmd 236296 ago  4 11:16 outages.cgi
-rwxrwxr-x 1 nagios nagcmd 240408 ago  4 11:16 showlog.cgi
-rwxrwxr-x 1 nagios nagcmd 314192 ago  4 11:16 status.cgi
-rwxrwxr-x 1 nagios nagcmd 316456 ago  4 11:16 statusjson.cgi
-rwxrwxr-x 1 nagios nagcmd 265216 ago  4 11:16 statusmap.cgi
-rwxrwxr-x 1 nagios nagcmd 256848 ago  4 11:16 statuswml.cgi
-rwxrwxr-x 1 nagios nagcmd 244528 ago  4 11:16 statuswrl.cgi
-rwxrwxr-x 1 nagios nagcmd 265056 ago  4 11:16 summary.cgi
-rwxrwxr-x 1 nagios nagcmd 256864 ago  4 11:16 tac.cgi
-rwxrwxr-x 1 nagios nagcmd 273352 ago  4 11:16 trends.cgi
ls -l /usr/local/nagios/var

Code: Select all

 ls -l /usr/local/nagios/var
total 8684
drwxrwxr-x 3 nagios nagcmd    106496 ago  8 23:59 archives
-rw-r--r-- 1 nagios nagios         0 ago 10 09:19 check_vmfs.err
-rw-r--r-- 1 nagios nagios       319 abr  9  2013 check_vmfs.std
-rw-r--r-- 1 nagios nagios      1834 jul 25  2014 configcheck17063
-rw-r--r-- 1 nagios nagios      1834 jul 25  2014 configcheck17403
-rw-r--r-- 1 nagios nagios      1470 jul 25  2014 configcheck22503
-rw-r--r-- 1 nagios nagios      1470 jul 25  2014 configcheck22557
-rw-r--r-- 1 nagios nagios      1470 jul 25  2014 configcheck23280
-rw-r--r-- 1 nagios nagios      1470 jul 25  2014 configcheck23327
-rw-r--r-- 1 nagios nagios      1470 jul 25  2014 configcheck28974
-rw-r--r-- 1 nagios nagios       255 ago 10 09:01 downtime.log
-rw-r--r-- 1 nagios nagcmd        34 ago 10 09:14 nagios.configtest
-rw-r--r-- 1 nagios nagcmd         6 ago 10 09:13 nagios.lock
-rw-r--r-- 1 nagios nagios    807797 ago 10 09:20 nagios.log
-rw-rw-r-- 1 nagios nagios   1130941 jul  2  2015 nagios.tmpHNaBt9
-rw-rw-r-- 1 nagios nagios   1140674 sep  7  2015 nagios.tmpHtGhGz
-rw-rw-r-- 1 nagios nagios   1312558 may  8  2015 nagios.tmpQKNwoj
-rw-r--r-- 1 nagios nagios         2 ago 10 08:25 ndo2db.lock
-rw-r--r-- 1 nagios nagios         0 ago 10 09:11 ndomod.tmp
srwxr-xr-x 1 nagios nagios         0 ago 10 08:25 ndo.sock
-rw-r--r-- 1 nagios nagios    912373 ago 10 09:13 objects.cache
-rw-r--r-- 1 nagios nagios    912373 ago 10 09:14 objects.precache
-rw------- 1 nagios nagcmd   1245882 ago 10 09:13 retention.dat
drwxrwsr-x 2 nagios www-data    4096 ago 10 09:16 rw
drwxr-xr-x 3 nagios nagios      4096 ago  5  2010 spool
-rw-rw-r-- 1 nagios nagios   1242172 ago 10 09:20 status.dat
head -n 1 /usr/local/nagios/var/nagios.lock

Code: Select all

head -n 1 /usr/local/nagios/var/nagios.lock
11859
/etc/init.d/nagios

Code: Select all

cat /etc/init.d/nagios
#!/bin/sh
#
# chkconfig: 345 99 01
# description: Nagios network monitor
# processname: nagios
# File : nagios
#
# Author : Jorge Sanchez Aymar ([email protected])
#
# Changelog :
#
# 1999-07-09 Karl DeBisschop <[email protected]>
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad <[email protected]>
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <[email protected]>
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <[email protected]>
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <[email protected]>
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <[email protected]>
#  - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
#              used to provide network services status.
#
### BEGIN INIT INFO
# Provides:		nagios
# Required-Start:	$local_fs $syslog $network
# Required-Stop:	$local_fs $syslog $network
# Default-Start:	2 3 4 5
# Default-Stop:		0 1 6
# Short-Description:	Starts and stops the Nagios monitoring server
# Description:		Starts and stops the Nagios monitoring server
### END INIT INFO

# Our install-time configuration.
prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosCfgtestFile=${prefix}/var/nagios.configtest
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=${prefix}/var/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagcmd
checkconfig="true"

# Source function library
# Some *nix do not have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
	. /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
	. /etc/init.d/functions
elif [ -f /lib/lsb/init-functions ]; then
	. /lib/lsb/init-functions
fi

# Load any extra environment variables for Nagios and its plugins.
if test -f /etc/sysconfig/nagios; then
	. /etc/sysconfig/nagios
fi

# Automate addition of RAMDISK based on environment variables
USE_RAMDISK=${USE_RAMDISK:-0}
if test "$USE_RAMDISK" -ne 0 && test "$RAMDISK_SIZE"X != "X"; then
	ramdisk=`mount |grep "${RAMDISK_DIR} type tmpfs"`
	if [ "$ramdisk"X == "X" ]; then
		mkdir -p -m 0755 ${RAMDISK_DIR}
		mount -t tmpfs -o size=${RAMDISK_SIZE}m tmpfs ${RAMDISK_DIR}
		mkdir -p -m 0755 ${RAMDISK_DIR}/checkresults
		chown -R $NagiosUser:$NagiosGroup ${RAMDISK_DIR}
	fi
fi


check_config ()
{
	TMPFILE=$(mktemp /tmp/.configtest.XXXXXXXX)
	$NagiosBin -vp $NagiosCfgFile > "$TMPFILE"
	WARN=`grep ^"Total Warnings:" "$TMPFILE" |awk -F: '{print \$2}' |sed s/' '//g`
	ERR=`grep ^"Total Errors:" "$TMPFILE" |awk -F: '{print \$2}' |sed s/' '//g`

	if test "$WARN" = "0" && test "${ERR}" = "0"; then
		echo "OK - Configuration check verified" > $NagiosCfgtestFile
		chmod 0644 $NagiosCfgtestFile
		chown $NagiosUser:$NagiosGroup $NagiosCfgtestFile
		/bin/rm "$TMPFILE"
		return 0
	elif test "${ERR}" = "0"; then
		# Write the errors to a file we can have a script watching for.
		echo "WARNING: Warnings in config files - see log for details: $NagiosCfgtestFile" > $NagiosCfgtestFile
		egrep -i "(^warning|^error)" "$TMPFILE" >> $NagiosCfgtestFile
		chmod 0644 $NagiosCfgtestFile
		chown $NagiosUser:$NagiosGroup $NagiosCfgtestFile
		/bin/rm "$TMPFILE"
		return 0
	else
		# Write the errors to a file we can have a script watching for.
		echo "ERROR: Errors in config files - see log for details: $NagiosCfgtestFile" > $NagiosCfgtestFile
		egrep -i "(^warning|^error)" "$TMPFILE" >> $NagiosCfgtestFile
		chmod 0644 $NagiosCfgtestFile
		chown $NagiosUser:$NagiosGroup $NagiosCfgtestFile
		cat "$TMPFILE"
		exit 8
	fi
}


status_nagios ()
{
	if test -x $NagiosCGI/daemonchk.cgi; then
		if $NagiosCGI/daemonchk.cgi -l $NagiosRunFile > /dev/null 2>&1; then return 0; fi
	else
		if ps -p $NagiosPID > /dev/null 2>&1; then return 0; fi
	fi

	return 1
}

printstatus_nagios ()
{
	if status_nagios; then
		echo "nagios (pid $NagiosPID) is running..."
	else
		echo "nagios is not running"
	fi
}

killproc_nagios ()
{
	kill -s "$1" $NagiosPID
}

pid_nagios ()
{
	if test ! -f $NagiosRunFile; then
		echo "No lock file found in $NagiosRunFile"
		exit 1
	fi

	NagiosPID=`head -n 1 $NagiosRunFile`
}



# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found. Exiting."
    exit 1
fi

# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found. Exiting."
    exit 1
fi

# See how we were called.
case "$1" in

	start)
		echo -n "Starting nagios:"

		if test "$checkconfig" = "true"; then
			check_config
			# check_config exits on configuration errors.
		fi

		if test -f $NagiosRunFile; then
			NagiosPID=`head -n 1 $NagiosRunFile`
			if status_nagios; then
				echo " another instance of nagios is already running."
				exit 0
			fi
		fi

		touch $NagiosVarDir/nagios.log $NagiosRetentionFile
		rm -f $NagiosCommandFile
		touch $NagiosRunFile
		chown $NagiosUser:$NagiosGroup $NagiosRunFile $NagiosVarDir/nagios.log $NagiosRetentionFile
		$NagiosBin -d $NagiosCfgFile
		if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi

		echo " done."
		;;

	stop)
		echo -n "Stopping nagios:"

		pid_nagios
		killproc_nagios TERM

		# now we have to wait for nagios to exit and remove its
		# own NagiosRunFile, otherwise a following "start" could
		# happen, and then the exiting nagios will remove the
		# new NagiosRunFile, allowing multiple nagios daemons
		# to (sooner or later) run - John Sellens
		#echo -n 'Waiting for nagios to exit .'
		for i in {1..90}; do
			if status_nagios > /dev/null; then
				echo -n '.'
				sleep 1
			else
				break
			fi
		done
		if status_nagios > /dev/null; then
			echo ''
			echo 'Warning - nagios did not exit in a timely manner - Killing it!'
			killproc_nagios KILL
		else
			echo ' done.'
		fi

		rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
		;;

	status)
		pid_nagios
		printstatus_nagios
		;;

	checkconfig)
		if test "$checkconfig" = "true"; then
			printf "Running configuration check...\n"
			check_config
		fi

		if [ $? -eq 0 ]; then
			echo " OK."
		else
			echo " CONFIG ERROR!  Check your Nagios configuration."
			exit 1
		fi
		;;

	restart)
		if test "$checkconfig" = "true"; then
			printf "Running configuration check...\n"
			check_config
		fi

		$0 stop
		$0 start
		;;

	reload|force-reload)
		if test "$checkconfig" = "true"; then
			printf "Running configuration check...\n"
			check_config
		fi

		if test ! -f $NagiosRunFile; then
			$0 start
		else
			pid_nagios
			if status_nagios > /dev/null; then
				printf "Reloading nagios configuration...\n"
				killproc_nagios HUP
				echo "done"
			else
				$0 stop
				$0 start
			fi
		fi
		;;

	configtest)
		$NagiosBin -vp $NagiosCfgFile
		;;

	*)
		echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig|configtest}"
		exit 1
		;;

esac

# End of this script

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Wed Aug 10, 2016 2:40 pm
by tgriep
Can you edit the /etc/init.d/nagios file and edit line 53 and change it from

Code: Select all

NagiosLockDir=/var/lock/subsys
to

Code: Select all

NagiosLockDir=/usr/local/nagiosxi/var/subsys
Save the file out and post back the results.

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Wed Aug 10, 2016 4:36 pm
by jfrickson
I'm working on it, but it's a real sneaky issue. For now, just do restarts instead of reloads until I get it figured out.

Re: Nagios 4.2.0 reload stop nagios (bug?)

Posted: Thu Aug 11, 2016 4:55 am
by jpcozar
tgriep wrote:Can you edit the /etc/init.d/nagios file and edit line 53 and change it from

Code: Select all

NagiosLockDir=/var/lock/subsys
to

Code: Select all

NagiosLockDir=/usr/local/nagiosxi/var/subsys
Save the file out and post back the results.
I did that (replacing nagiosxi with nagios because I have Nagios Core installed) and the nagios process is still killed during reload.
So I changed that line back from:

Code: Select all

NagiosLockDir=/usr/local/nagios/var/subsys
to

Code: Select all

NagiosLockDir=/var/lock/subsys
Because I had problems starting Nagios 4.2.0 after upgrading from Nagios 4.1.1 (ndo wasn't updated, and Nagios didn't run until I upgraded ndo to version 2.1.0), I have disabled ndo temporarily, and reloading works without it.
In nagios.cfg I have commented out this line:

Code: Select all

 broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg  
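
To preview that change without touching the live file, the line can be commented out with sed and written to stdout. A sketch only: `disable_ndomod` is a hypothetical helper, and the pattern assumes the module is loaded through a `broker_module=` line whose path contains ndomod.o, as in the line above.

```shell
#!/bin/sh
# disable_ndomod is a hypothetical helper (not part of Nagios or NDOUtils).
# It comments out every active broker_module line that loads ndomod.o and
# writes the result to stdout, leaving the input file unchanged.
disable_ndomod() {
    sed 's|^\([[:space:]]*\)\(broker_module=.*ndomod\.o.*\)$|\1#\2|' "$1"
}

# Example usage (commented out): show what would change in nagios.cfg.
# disable_ndomod /usr/local/nagios/etc/nagios.cfg |
#     diff /usr/local/nagios/etc/nagios.cfg -
```

Lines that are already commented out are left alone, because the pattern only matches when `broker_module=` is the first non-blank text on the line.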
This is nagios.log when ndo is enabled in nagios.cfg: I start nagios [1470907868] and later do a reload [1470907907]. The nagios process is killed at that point.

Code: Select all

[1470907868] Nagios 4.2.0 starting... (PID=26381)
[1470907868] Local time is Thu Aug 11 11:31:08 CEST 2016
[1470907868] LOG VERSION: 2.0
[1470907868] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1470907868] qh: core query handler registered
[1470907868] nerd: Channel hostchecks registered successfully
[1470907868] nerd: Channel servicechecks registered successfully
[1470907868] nerd: Channel opathchecks registered successfully
[1470907868] nerd: Fully initialized and ready to rock!
[1470907868] wproc: Successfully registered manager as @wproc with query handler
[1470907868] wproc: Registry request: name=Core Worker 26382;pid=26382
[1470907868] wproc: Registry request: name=Core Worker 26383;pid=26383
[1470907868] wproc: Registry request: name=Core Worker 26385;pid=26385
[1470907868] wproc: Registry request: name=Core Worker 26387;pid=26387
[1470907868] wproc: Registry request: name=Core Worker 26384;pid=26384
[1470907868] wproc: Registry request: name=Core Worker 26386;pid=26386
[1470907868] ndomod: NDOMOD 2.1 (07-28-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1470907868] ndomod: Could not open data sink!  I'll keep trying, but some output may get lost...
[1470907868] ndomod registered for process data
[1470907868] ndomod registered for timed event data
[1470907868] ndomod registered for log data'
[1470907868] ndomod registered for system command data'
[1470907868] ndomod registered for event handler data'
[1470907868] ndomod registered for notification data'
[1470907868] ndomod registered for service check data'
[1470907868] ndomod registered for host check data'
[1470907868] ndomod registered for comment data'
[1470907868] ndomod registered for downtime data'
[1470907868] ndomod registered for flapping data'
[1470907868] ndomod registered for program status data'
[1470907868] ndomod registered for host status data'
[1470907868] ndomod registered for service status data'
[1470907868] ndomod registered for adaptive program data'
[1470907868] ndomod registered for adaptive host data'
[1470907868] ndomod registered for adaptive service data'
[1470907868] ndomod registered for external command data'
[1470907868] ndomod registered for aggregated status data'
[1470907868] ndomod registered for retention data'
[1470907868] ndomod registered for contact data'
[1470907868] ndomod registered for contact notification data'
[1470907868] ndomod registered for acknowledgement data'
[1470907868] ndomod registered for state change data'
[1470907868] ndomod registered for contact status data'
[1470907868] ndomod registered for adaptive contact data'
[1470907868] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1470907868] Successfully launched command file worker with pid 26388
[1470907884] ndomod: Still unable to connect to data sink.  1538 items lost, 5000 queued items to flush.
[1470907900] ndomod: Still unable to connect to data sink.  1828 items lost, 5000 queued items to flush.
[1470907907] Caught SIGHUP, restarting...
[1470907907] Event broker module 'NERD' deinitialized successfully.
[1470907907] ndomod: Shutdown complete.
[1470907907] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1470907907] Nagios 4.2.0 starting... (PID=26381)
[1470907907] Local time is Thu Aug 11 11:31:47 CEST 2016
[1470907907] LOG VERSION: 2.0
[1470907907] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1470907907] qh: core query handler registered
[1470907907] nerd: Channel hostchecks registered successfully
[1470907907] nerd: Channel servicechecks registered successfully
[1470907907] nerd: Channel opathchecks registered successfully
[1470907907] nerd: Fully initialized and ready to rock!
[1470907907] wproc: Successfully registered manager as @wproc with query handler
[1470907907] wproc: Registry request: name=Core Worker 26667;pid=26667
[1470907907] wproc: Registry request: name=Core Worker 26668;pid=26668
[1470907907] wproc: Registry request: name=Core Worker 26670;pid=26670
[1470907907] wproc: Registry request: name=Core Worker 26669;pid=26669
[1470907907] wproc: Registry request: name=Core Worker 26671;pid=26671
[1470907907] wproc: Registry request: name=Core Worker 26672;pid=26672
[1470907907] ndomod: NDOMOD 2.1 (07-28-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1470907907] ndomod: Could not open data sink!  I'll keep trying, but some output may get lost...
[1470907907] ndomod registered for process data
[1470907907] ndomod registered for timed event data
[1470907907] ndomod registered for log data'
[1470907907] ndomod registered for system command data'
[1470907907] ndomod registered for event handler data'
[1470907907] ndomod registered for notification data'
[1470907907] ndomod registered for service check data'
[1470907907] ndomod registered for host check data'
[1470907907] ndomod registered for comment data'
[1470907907] ndomod registered for downtime data'
[1470907907] ndomod registered for flapping data'
[1470907907] ndomod registered for program status data'
[1470907907] ndomod registered for host status data'
[1470907907] ndomod registered for service status data'
[1470907907] ndomod registered for adaptive program data'
[1470907907] ndomod registered for adaptive host data'
[1470907907] ndomod registered for adaptive service data'
[1470907907] ndomod registered for external command data'
[1470907907] ndomod registered for aggregated status data'
[1470907907] ndomod registered for retention data'
[1470907907] ndomod registered for contact data'
[1470907907] ndomod registered for contact notification data'
[1470907907] ndomod registered for acknowledgement data'
[1470907907] ndomod registered for state change data'
[1470907907] ndomod registered for contact status data'
[1470907907] ndomod registered for adaptive contact data'
[1470907907] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1470907907] Caught SIGSEGV, shutting down...
And this is nagios.log when ndo is disabled in nagios.cfg: I start nagios [1470908061] and then do a reload [1470908106].

Code: Select all

[1470908061] nerd: Fully initialized and ready to rock!
[1470908061] wproc: Successfully registered manager as @wproc with query handler
[1470908061] wproc: Registry request: name=Core Worker 26722;pid=26722
[1470908061] wproc: Registry request: name=Core Worker 26723;pid=26723
[1470908061] wproc: Registry request: name=Core Worker 26724;pid=26724
[1470908061] wproc: Registry request: name=Core Worker 26726;pid=26726
[1470908061] wproc: Registry request: name=Core Worker 26725;pid=26725
[1470908061] wproc: Registry request: name=Core Worker 26727;pid=26727
[1470908061] Successfully launched command file worker with pid 26728
[1470908071] SERVICE ALERT: svr_SAE_Valdeolleros_W2012_R2;PING;WARNING;SOFT;1;ECO WARNING - Paquetes perdidos = 0%, RTA = 429.09 ms
[1470908098] wproc: Core Worker 26726: job 3 (pid=26764) timed out. Killing it
[1470908098] wproc: CHECK job 3 from worker Core Worker 26726 timed out after 30.01s
[1470908098] wproc:   host=LIB_CEMCOR_Tomas_Aquino; service=(null);
[1470908098] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1470908098] Warning: Check of host 'LIB_CEMCOR_Tomas_Aquino' timed out after 30.01 seconds
[1470908098] HOST ALERT: LIB_CEMCOR_Tomas_Aquino;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1470908098] wproc: Core Worker 26726: job 3 (pid=26764): Dormant child reaped
[1470908105] SERVICE ALERT: svrvm02;WIN-UPDATES;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
[1470908106] Caught SIGHUP, restarting...
[1470908106] Event broker module 'NERD' deinitialized successfully.
[1470908106] Nagios 4.2.0 starting... (PID=26721)
[1470908106] Local time is Thu Aug 11 11:35:06 CEST 2016
[1470908106] LOG VERSION: 2.0
[1470908106] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1470908106] qh: core query handler registered
[1470908106] nerd: Channel hostchecks registered successfully
[1470908106] nerd: Channel servicechecks registered successfully
[1470908106] nerd: Channel opathchecks registered successfully
[1470908106] nerd: Fully initialized and ready to rock!
[1470908106] wproc: Successfully registered manager as @wproc with query handler
[1470908106] wproc: Registry request: name=Core Worker 27404;pid=27404
[1470908106] wproc: Registry request: name=Core Worker 27406;pid=27406
[1470908106] wproc: Registry request: name=Core Worker 27407;pid=27407
[1470908106] wproc: Registry request: name=Core Worker 27403;pid=27403
[1470908106] wproc: Registry request: name=Core Worker 27405;pid=27405
[1470908106] wproc: Registry request: name=Core Worker 27408;pid=27408
[1470908131] SERVICE ALERT: svr_SAE_Valdeolleros_W2012_R2;PING;WARNING;SOFT;2;ECO WARNING - Paquetes perdidos = 0%, RTA = 525.68 ms
So I think it's something involving the ndo module and/or daemon.
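
To check whether other reloads in a long nagios.log ended the same way, the log can be scanned for a SIGHUP that is followed by a SIGSEGV. This is a rough heuristic, not a diagnosis: `crashed_reloads` is a hypothetical helper, it relies only on the two message strings shown in the logs above, and it would also pair a reload with an unrelated crash that happened much later.

```shell
#!/bin/sh
# crashed_reloads is a hypothetical helper (not part of Nagios).
# For every "Caught SIGHUP" that is followed by a "Caught SIGSEGV" before
# the next SIGHUP, it prints the two bracketed timestamps.
crashed_reloads() {
    awk '
        /Caught SIGHUP/  { hup = $1; next }
        /Caught SIGSEGV/ { if (hup != "") { print hup " -> " $1; hup = "" } }
    ' "$1"
}

# Example usage (commented out; path assumes a source install):
# crashed_reloads /usr/local/nagios/var/nagios.log
```

With the first log above (ndo enabled) this would report the reload at [1470907907]; with the second (ndo disabled) it would print nothing.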