service start nagios: the control process exited with error c
Posted: Thu Aug 31, 2017 4:01 am
by misja
Hello,
I don't have much Nagios or Debian experience yet, so I'm quite helpless.
Nagios core won't start anymore after update from 4.3.2 to 4.3.4:
Warning: nagios.service changed on disk. Run 'systemctl daemon-reload' to reload units.
Job for nagios.service failed because the control process exited with error code.
See "systemctl status nagios.service" and "journalctl -xe" for details.
Running 'systemctl daemon-reload' does not solve the issue; I keep getting the error code message.
Nagios runs on Debian; I installed 4.3.2 from source following:
http://www.miloszengel.com/nagios-core- ... -x-jessie/
I had it working for a while, then became aware of the update, so I downloaded the new source and compiled it following:
https://assets.nagios.com/downloads/nag ... ading.html
I did not modify the config files other than changing the location of the lock file from /usr/local/nagios/var/nagios.lock to /run/nagios.lock, as suggested in the link above.
Running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg gives an all-good result.
Might it be an issue with permissions? If I remember correctly, I installed 4.3.2 as root, not as user nagios. I don't know if that was right, but that version ran without issues until the update.
Any suggestions?
Re: service start nagios: he control process exited with err
Posted: Thu Aug 31, 2017 4:29 pm
by dwasswa
Can you please post your error logs for
So I can track what's failing.
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 1:46 pm
by tgriep
Thanks @Derick Wasswa for the help, but we would have to see the output of that command. You can also check the /var/log/messages file for any errors explaining why nagios is not starting.
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 3:47 pm
by misja
Hello Derrick and tgriep,
thanks for the help.
When I look at the output of journalctl -xe there is a lot of info, but I think the part related to the nagios service is the following. (I left some extra lines at the beginning; this is the end of a cold start, and I assume nagios is the last to load.)
Sep 01 09:43:47 debian-d510 systemd[580]: Startup finished in 593ms.
-- Subject: System start-up is now complete
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- All system services necessary queued for starting at boot have been
-- successfully started. Note that this does not mean that the machine is
-- now idle as services might still be busy with completing start-up.
--
-- Kernel start-up required KERNEL_USEC microseconds.
--
-- Initial RAM disk start-up required INITRD_USEC microseconds.
--
-- Userspace start-up required 593248 microseconds.
Sep 01 09:43:47 debian-d510 systemd[1]: Started User Manager for UID 113.
-- Subject: Unit user@113.service has finished start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit user@113.service has finished starting up.
--
-- The start-up result is done.
Sep 01 09:43:47 debian-d510 su[561]: pam_unix(su:session): session closed for user nagios
Sep 01 09:43:47 debian-d510 nagios[552]: Starting nagios:ERROR: Could not create or update '/usr/local/nagios/var/nagios.configtest'
Sep 01 09:43:47 debian-d510 systemd[1]: nagios.service: Control process exited, code=exited status=8
Sep 01 09:43:47 debian-d510 systemd[1]: Failed to start LSB: Starts and stops the Nagios monitoring server.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit nagios.service has failed.
--
-- The result is failed.
Sep 01 09:43:47 debian-d510 systemd[1]: nagios.service: Unit entered failed state.
There seems to be no reference to nagios in /var/log/messages.
Is this enough info or should I include the full output of journalctl -xe?
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 4:18 pm
by scottwilkerson
This shouldn't be happening, but let's run the following:
Code:
touch /usr/local/nagios/var/nagios.configtest
chown nagios:nagios /usr/local/nagios/var/nagios.configtest
chmod ug+rw /usr/local/nagios/var/nagios.configtest
and try starting again
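The journal line quoted earlier says the init script could not create nagios.configtest, and the script posted later in this thread creates that file with su as the nagios user rather than as root. A minimal sketch for repeating that step by hand might look like the following; try_touch_as is a hypothetical helper name, and the example path is simply the one from the error message. Since su runs the command through the account's login shell, a failure here does not necessarily point at file permissions.

```shell
#!/bin/sh
# Sketch only: try to create FILE as USER and report the result through
# the exit status. try_touch_as is a hypothetical name, not part of Nagios.
try_touch_as () {
    user=$1
    file=$2
    if [ "$(id -un)" = "$user" ]; then
        # Already running as the target account, so su is not needed.
        touch "$file"
    else
        # Requires root. This mirrors the su ... -c "touch ..." call in
        # the init script's check_config function.
        su "$user" -c "touch '$file'"
    fi
}

# Example (run as root), path taken from the error message:
#   try_touch_as nagios /usr/local/nagios/var/nagios.configtest && echo ok
```

If this fails when run as root, the message printed by su or touch should say more than the init script's generic error.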
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 4:18 pm
by bheden
Did you update the location of the lock file in both the nagios.cfg file AND the init script?
You should be able to just update the lock_file= value in the cfg and then run make install-init from the directory where you configured 4.3.4, and it should work.
If that doesn't fix it, send over your current nagios.cfg file and your init.d/nagios script please.
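One way to check whether the two files agree is to print both values side by side. The sketch below assumes the config uses a lock_file= line and the init script a NagiosRunFile= assignment, as the script posted in this thread does; check_lock_paths is a hypothetical helper name, and /etc/init.d/nagios in the example is an assumed install location.

```shell
#!/bin/sh
# Sketch only: compare the lock file path in nagios.cfg with the one in
# the init script. check_lock_paths is a hypothetical helper name.
check_lock_paths () {
    cfg=$1
    init=$2
    # First uncommented lock_file= line in the config file.
    cfg_lock=$(sed -n 's/^lock_file=//p' "$cfg" | head -n 1)
    # First NagiosRunFile= assignment in the init script.
    init_lock=$(sed -n 's/^NagiosRunFile=//p' "$init" | head -n 1)
    echo "nagios.cfg : $cfg_lock"
    echo "init script: $init_lock"
    # Literal comparison: values such as ${prefix}/var/nagios.lock are
    # not expanded, so they would need to be compared by eye.
    [ -n "$cfg_lock" ] && [ "$cfg_lock" = "$init_lock" ]
}

# Example (init script path is an assumption):
#   check_lock_paths /usr/local/nagios/etc/nagios.cfg /etc/init.d/nagios
```

The function returns success only when both values are present and identical, so it can be used in an if statement as well as read directly.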
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 4:51 pm
by misja
I ran the commands for nagios.configtest (the file did not exist before I did).
Then I tried "service nagios start" and got the same error; the file nagios.configtest is gone again.
Is the init script you mention init.d/nagios? I only edited the nagios.cfg file for the location of the lock file, and I might have done that after running install-init, so I ran install-init again, but without success. When I look in init.d/nagios, I don't think it has the right location and name for the lock file, so maybe install-init went wrong?
I included the files you asked for.
Re: service start nagios: he control process exited with err
Posted: Fri Sep 01, 2017 4:56 pm
by misja
I'll try again.
Somehow one file is missing (or is it not possible to upload two files?), so here is the next one inline.
Code:
#!/bin/sh
#
# chkconfig: 345 99 01
# description: Nagios network monitor
# processname: nagios
# File : nagios
#
# Author : Jorge Sanchez Aymar (jsanchez@lanchile.cl)
#
# Changelog :
#
# 1999-07-09 Karl DeBisschop <kdebisschop@infoplease.com>
# - setup for autoconf
# - add reload function
# 1999-08-06 Ethan Galstad <egalstad@nagios.org>
# - Added configuration info for use with RedHat's chkconfig tool
# per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <jimpop@rocketship.com>
# - added variable for nagios/var directory
# - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <egalstad@nagios.org>
# - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <kdebisschop@users.sourceforge.net>
# - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <egalstad@nagios.org>
# - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
# used to provide network services status.
#
### BEGIN INIT INFO
# Provides: nagios
# Required-Start: $local_fs $syslog $network
# Required-Stop: $local_fs $syslog $network
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Starts and stops the Nagios monitoring server
# Description: Starts and stops the Nagios monitoring server
### END INIT INFO
# Our install-time configuration.
prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosCfgtestFile=${prefix}/var/nagios.configtest
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=/run/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagios
checkconfig="true"
# Source function library
# Some *nix do not have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
    . /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
    . /etc/init.d/functions
elif [ -f /lib/lsb/init-functions ]; then
    . /lib/lsb/init-functions
fi
# Load any extra environment variables for Nagios and its plugins.
if test -f /etc/sysconfig/nagios; then
    . /etc/sysconfig/nagios
fi
# Automate addition of RAMDISK based on environment variables
USE_RAMDISK=${USE_RAMDISK:-0}
if test "$USE_RAMDISK" -ne 0 && test "$RAMDISK_SIZE"X != "X"; then
    ramdisk=`mount |grep "${RAMDISK_DIR} type tmpfs"`
    if [ "$ramdisk"X == "X" ]; then
        mkdir -p -m 0755 ${RAMDISK_DIR}
        mount -t tmpfs -o size=${RAMDISK_SIZE}m tmpfs ${RAMDISK_DIR}
        mkdir -p -m 0755 ${RAMDISK_DIR}/checkresults
        chown -h -R $NagiosUser:$NagiosGroup ${RAMDISK_DIR}
    fi
fi
check_config ()
{
    rm -f "$NagiosCfgtestFile";
    if test -e "$NagiosCfgtestFile"; then
        echo "ERROR: Could not delete '$NagiosCfgtestFile'"
        exit 8
    fi
    if ! su $NagiosUser -c "touch $NagiosCfgtestFile"; then
        echo "ERROR: Could not create or update '$NagiosCfgtestFile'"
        exit 8
    fi
    TMPFILE=$(mktemp /tmp/.configtest.XXXXXXXX)
    $NagiosBin -vp $NagiosCfgFile > "$TMPFILE"
    WARN=`grep ^"Total Warnings:" "$TMPFILE" |awk -F: '{print \$2}' |sed s/' '//g`
    ERR=`grep ^"Total Errors:" "$TMPFILE" |awk -F: '{print \$2}' |sed s/' '//g`
    if test "$WARN" = "0" && test "${ERR}" = "0"; then
        echo "OK - Configuration check verified" > $NagiosCfgtestFile
        /bin/rm "$TMPFILE"
        return 0
    elif test "${ERR}" = "0"; then
        # Write the errors to a file we can have a script watching for.
        echo "WARNING: Warnings in config files - see log for details: $NagiosCfgtestFile" > $NagiosCfgtestFile
        egrep -i "(^warning|^error)" "$TMPFILE" >> $NagiosCfgtestFile
        /bin/rm "$TMPFILE"
        return 0
    else
        # Write the errors to a file we can have a script watching for.
        echo "ERROR: Errors in config files - see log for details: $NagiosCfgtestFile" > $NagiosCfgtestFile
        egrep -i "(^warning|^error)" "$TMPFILE" >> $NagiosCfgtestFile
        cat "$TMPFILE"
        exit 8
    fi
}
status_nagios ()
{
    if test -x $NagiosCGI/daemonchk.cgi; then
        if $NagiosCGI/daemonchk.cgi -l $NagiosRunFile > /dev/null 2>&1; then return 0; fi
    else
        if ps -p $NagiosPID > /dev/null 2>&1; then return 0; fi
    fi
    return 1
}
printstatus_nagios ()
{
    if status_nagios; then
        echo "nagios (pid $NagiosPID) is running..."
    else
        echo "nagios is not running"
    fi
}
killproc_nagios ()
{
    kill -s "$1" $NagiosPID
}
pid_nagios ()
{
    if test ! -f $NagiosRunFile; then
        echo "No lock file found in $NagiosRunFile"
        exit 1
    fi
    NagiosPID=`head -n 1 $NagiosRunFile`
}
remove_commandfile ()
{
    # Removing a stalled command file, while there are processes trying/waiting to write into it,
    # will deadlock those processes in a blocking open() system call. To allow such processes to
    # die on a broken pipe, the pipe must be opened for reading without actually reading from it,
    # which is what dd does here. To avoid a chicken-egg problem, the pipe is renamed beforehand.
    # In order for the dd to not deadlock when there is no writing process, it is executed in the
    # background in a subshell together with an empty echo to have at least one writing process.
    # see http://unix.stackexchange.com/questions/335406/opening-named-pipe-blocks-forever-if-pipe-is-deleted-without-being-connected
    if [ -p $NagiosCommandFile ]; then
        mv -f $NagiosCommandFile $NagiosCommandFile~
        (dd if=$NagiosCommandFile~ count=0 2>/dev/null & echo -n "" >$NagiosCommandFile~)
    fi
    rm -f $NagiosCommandFile $NagiosCommandFile~
}
# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found. Exiting."
    exit 1
fi
# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found. Exiting."
    exit 1
fi
# See how we were called.
case "$1" in
    start)
        echo -n "Starting nagios:"
        if test "$checkconfig" = "true"; then
            check_config
            # check_config exits on configuration errors.
        fi
        if test -f $NagiosRunFile; then
            NagiosPID=`head -n 1 $NagiosRunFile`
            if status_nagios; then
                echo " another instance of nagios is already running."
                exit 0
            fi
        fi
        su $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
        remove_commandfile
        touch $NagiosRunFile
        $NagiosBin -d $NagiosCfgFile
        if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
        echo " done."
        ;;
    stop)
        echo -n "Stopping nagios:"
        pid_nagios
        killproc_nagios TERM
        # now we have to wait for nagios to exit and remove its
        # own NagiosRunFile, otherwise a following "start" could
        # happen, and then the exiting nagios will remove the
        # new NagiosRunFile, allowing multiple nagios daemons
        # to (sooner or later) run - John Sellens
        #echo -n 'Waiting for nagios to exit .'
        for i in {1..90}; do
            if status_nagios > /dev/null; then
                echo -n '.'
                sleep 1
            else
                break
            fi
        done
        if status_nagios > /dev/null; then
            echo ''
            echo 'Warning - nagios did not exit in a timely manner - Killing it!'
            killproc_nagios KILL
        else
            echo ' done.'
        fi
        remove_commandfile
        rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile
        ;;
    status)
        pid_nagios
        printstatus_nagios
        ;;
    checkconfig)
        if test "$checkconfig" = "true"; then
            printf "Running configuration check...\n"
            check_config
        fi
        if [ $? -eq 0 ]; then
            echo " OK."
        else
            echo " CONFIG ERROR! Check your Nagios configuration."
            exit 1
        fi
        ;;
    restart)
        if test "$checkconfig" = "true"; then
            printf "Running configuration check...\n"
            check_config
        fi
        $0 stop
        $0 start
        ;;
    reload|force-reload)
        if test "$checkconfig" = "true"; then
            printf "Running configuration check...\n"
            check_config
        fi
        if test ! -f $NagiosRunFile; then
            $0 start
        else
            pid_nagios
            if status_nagios > /dev/null; then
                printf "Reloading nagios configuration...\n"
                killproc_nagios HUP
                echo "done"
            else
                $0 stop
                $0 start
            fi
        fi
        ;;
    configtest)
        $NagiosBin -vp $NagiosCfgFile
        ;;
    *)
        echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig|configtest}"
        exit 1
        ;;
esac
# End of this script
Re: service start nagios: he control process exited with err
Posted: Wed Sep 06, 2017 8:18 am
by tgriep
It may be a permission problem with the files or folders. Can you run the following as root and post the output so we can check the permissions?
Code:
ls -l /usr/local/nagios/
ls -l /usr/local/nagios/var/
ls -l /
ls -al /tmp
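Since the file in question sits several directories deep, it can also help to list the owner and mode of every directory on the way down to it in one go. A minimal sketch of that idea follows; show_path_perms is a hypothetical helper name, and it only relies on ls -ld and dirname.

```shell
#!/bin/sh
# Sketch only: print owner and mode for a path and each parent directory.
# show_path_perms is a hypothetical helper name.
show_path_perms () {
    p=$1
    while :; do
        ls -ld "$p"
        # Stop at the filesystem root, or at "." for relative paths.
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Example:
#   show_path_perms /usr/local/nagios/var
```

Each directory on the path needs execute (search) permission for the nagios user, which is why the parents matter as well as the var directory itself.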
Re: service start nagios: he control process exited with err
Posted: Wed Sep 06, 2017 3:51 pm
by misja
Hi, thanks for your help so far. Here are the results:
Code:
root@debian-d510:~# ls -l /usr/local/nagios/
total 28
drwxrwsr-x 2 nagios nagios 4096 Aug 29 22:44 bin
drwxrwsr-x 3 nagios nagios 4096 Aug 29 23:31 etc
drwxr-sr-x 2 root staff 4096 Aug 13 23:16 include
drwxrwsr-x 3 nagios nagios 4096 Aug 13 23:16 libexec
drwxrwsr-x 2 nagios nagios 4096 Aug 29 22:44 sbin
drwxrwsr-x 15 nagios nagios 4096 Aug 29 22:44 share
drwxrwsr-x 5 nagios nagios 4096 Sep 1 23:31 var
Code:
root@debian-d510:~# ls -l /usr/local/nagios/var/
total 104
drwxrwsr-x 2 nagios nagios 4096 Aug 25 00:00 archives
-rw-r--r-- 1 nagios nagios 13415 Aug 29 22:36 nagios.log
-rw-r--r-- 1 nagios nagios 20674 Aug 29 22:28 objects.cache
-rw-r--r-- 1 nagios nagios 20674 Aug 29 22:28 objects.precache
-rw------- 1 nagios nagios 26071 Aug 29 22:36 retention.dat
drwxrwsr-x 2 nagios www-data 4096 Aug 29 22:36 rw
drwxr-sr-x 3 root nagios 4096 Aug 13 23:08 spool
Code:
root@debian-d510:~# ls -l /
total 72
drwxr-xr-x 2 root root 4096 Aug 13 22:55 bin
drwxr-xr-x 3 root root 4096 Aug 13 22:49 boot
drwxr-xr-x 17 root root 3120 Sep 6 22:33 dev
drwxr-xr-x 106 root root 4096 Aug 25 17:20 etc
drwxr-xr-x 3 root root 4096 Aug 10 21:46 home
lrwxrwxrwx 1 root root 31 Aug 10 21:28 initrd.img -> boot/initrd.img-4.9.0-3-686-pae
lrwxrwxrwx 1 root root 31 Aug 10 21:28 initrd.img.old -> boot/initrd.img-4.9.0-3-686-pae
drwxr-xr-x 16 root root 4096 Aug 13 22:49 lib
drwx------ 2 root root 16384 Aug 10 21:23 lost+found
drwxr-xr-x 4 root root 4096 Aug 11 17:14 media
drwxr-xr-x 3 root root 4096 Sep 1 11:11 mnt
drwxr-xr-x 2 root root 4096 Aug 10 21:24 opt
dr-xr-xr-x 86 root root 0 Sep 6 22:33 proc
drwx------ 4 root root 4096 Sep 2 00:42 root
drwxr-xr-x 20 root root 640 Sep 6 22:42 run
drwxr-xr-x 2 root root 4096 Aug 13 22:49 sbin
drwxr-xr-x 2 root root 4096 Aug 10 21:24 srv
dr-xr-xr-x 13 root root 0 Sep 6 22:45 sys
drwxrwxrwt 8 root root 4096 Sep 6 22:43 tmp
drwxr-xr-x 10 root root 4096 Aug 10 21:24 usr
drwxr-xr-x 12 root root 4096 Aug 10 21:33 var
lrwxrwxrwx 1 root root 28 Aug 10 21:28 vmlinuz -> boot/vmlinuz-4.9.0-3-686-pae
lrwxrwxrwx 1 root root 28 Aug 10 21:28 vmlinuz.old -> boot/vmlinuz-4.9.0-3-686-pae
Code:
root@debian-d510:~# ls -al /tmp
total 32
drwxrwxrwt 8 root root 4096 Sep 6 22:43 .
drwxr-xr-x 21 root root 4096 Aug 28 17:57 ..
drwxrwxrwt 2 root root 4096 Sep 6 22:33 .font-unix
drwxrwxrwt 2 root root 4096 Sep 6 22:33 .ICE-unix
drwx------ 3 root root 4096 Sep 6 22:34 systemd-private-8fd4a4c2e68841799c8d6d28cd263583-apache2.service-sYvbkS
drwxrwxrwt 2 root root 4096 Sep 6 22:33 .Test-unix
drwxrwxrwt 2 root root 4096 Sep 6 22:33 .X11-unix
drwxrwxrwt 2 root root 4096 Sep 6 22:33 .XIM-unix