Nagios suddenly refuses to start

Mortus
Posts: 27
Joined: Tue Nov 15, 2016 10:34 am

Re: Nagios suddenly refuses to start

Post by Mortus »

/etc/rc.d/init.d/nagios start provides the same output that I linked earlier. The systemctl command isn't found.

And SciTE was set up to allow me to edit the config and save it directly to the server.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Nagios suddenly refuses to start

Post by dwhitfield »

Can you post your /etc/rc.d/init.d/nagios? Also, can you post the output of ls -la /etc/rc.d/init.d/nagios?
Mortus

Re: Nagios suddenly refuses to start

Post by Mortus »

[root@devnagios ~]# /etc/rc.d/init.d/nagios
Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}
[root@devnagios ~]# ls -la /etc/rc.d/init.d/nagios
-rwxr-xr-x 1 root root 5373 Jun 19 2014 /etc/rc.d/init.d/nagios
[root@devnagios ~]#
dwhitfield

Re: Nagios suddenly refuses to start

Post by dwhitfield »

I wanted to see the actual script, /etc/rc.d/init.d/nagios, although that modification date suggests it won't be useful.

If the output of echo $0 is bash, please run the following and then post the output:

Code:

cd /root
cat .bashrc

We'll probably want to look at /etc/bashrc as well, although the contents of .bashrc should tell you whether that's the file we want.

If echo $0 does not print bash, then we'll need to look in different locations for startup scripts, but bash is the default shell.

Basically, I'm trying to figure out what is calling scite.
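
One way to hunt for whatever is invoking scite is to search the shell startup files for the name directly. This is only a sketch: find_in_startup_files is a hypothetical helper, and the file list in the example call is an assumption for a bash root login on a Red Hat-style system.

```shell
# Hypothetical helper: print the names of startup files that mention a pattern.
# Usage: find_in_startup_files PATTERN FILE...
find_in_startup_files() {
    pattern=$1
    shift
    for f in "$@"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$f" ] || continue
        # -l prints only the name of a matching file; -i ignores case.
        grep -il -- "$pattern" "$f"
    done
    return 0
}

# Example call (file list is an assumption; adjust for the shell in use):
# find_in_startup_files scite /root/.bashrc /root/.bash_profile \
#     /etc/bashrc /etc/profile /etc/profile.d/*.sh
```

Any file name it prints is a candidate for closer inspection; no output means none of the listed files mention the pattern.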
Mortus

Re: Nagios suddenly refuses to start

Post by Mortus »

Code:

# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

alias preflight='/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg'
alias diskspace="du -h / | grep -P '^[0-9\.]+G'"
alias clearmailqueues='for f in /var/spool/mqueue/* ; do rm -f "$f"; done && for f in /var/spool/clientmqueue/* ; do rm -f "$f"; done'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

Code:

# /etc/bashrc

# System wide functions and aliases
# Environment stuff goes in /etc/profile

# are we an interactive shell?
if [ "$PS1" ]; then
  if [ -z "$PROMPT_COMMAND" ]; then
    case $TERM in
        xterm*)
                if [ -e /etc/sysconfig/bash-prompt-xterm ]; then
                        PROMPT_COMMAND=/etc/sysconfig/bash-prompt-xterm
                else
            PROMPT_COMMAND='printf "\033]0;%s@%s:%s\007" "${USER}" "${HOSTNAME%%.*}" "${PWD/#$HOME/~}"'
                fi
                ;;
        screen)
                if [ -e /etc/sysconfig/bash-prompt-screen ]; then
                        PROMPT_COMMAND=/etc/sysconfig/bash-prompt-screen
                else
            PROMPT_COMMAND='printf "\033]0;%s@%s:%s\033\\" "${USER}" "${HOSTNAME%%.*}" "${PWD/#$HOME/~}"'
                fi
                ;;
        *)
                [ -e /etc/sysconfig/bash-prompt-default ] && PROMPT_COMMAND=/etc/sysconfig/bash-prompt-default
            ;;
    esac
  fi
  # Turn on checkwinsize
  shopt -s checkwinsize
  [ "$PS1" = "\\s-\\v\\\$ " ] && PS1="[\u@\h \W]\\$ "
fi

if ! shopt -q login_shell ; then # We're not a login shell
        # Need to redefine pathmunge, it get's undefined at the end of /etc/profile
    pathmunge () {
                if ! echo $PATH | /bin/egrep -q "(^|:)$1($|:)" ; then
                        if [ "$2" = "after" ] ; then
                                PATH=$PATH:$1
                        else
                                PATH=$1:$PATH
                        fi
                fi
        }

    # By default, we want umask to get set. This sets it for non-login shell.
    # You could check uidgid reservation validity in
    # /usr/share/doc/setup-*/uidgid file
    if [ $UID -gt 99 ] && [ "`id -gn`" = "`id -un`" ]; then
       umask 002
    else
       umask 022
    fi

        # Only display echos from profile.d scripts if we are no login shell
    # and interactive - otherwise just process them to set envvars
    for i in /etc/profile.d/*.sh; do
        if [ -r "$i" ]; then
            if [ "$PS1" ]; then
                . $i
            else
                . $i >/dev/null 2>&1
            fi
        fi
    done

        unset i
        unset pathmunge
fi
# vim:ts=4:sw=4

If it's helpful, here's the nagios script:

Code:

#!/bin/sh
# 
# chkconfig: 345 99 01
# description: Nagios network monitor
#
# File : nagios
#
# Author : Jorge Sanchez Aymar (jsanchez@lanchile.cl)
# 
# Changelog :
#
# 1999-07-09 Karl DeBisschop <kdebisschop@infoplease.com>
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad <egalstad@nagios.org>
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <jimpop@rocketship.com>
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <egalstad@nagios.org>
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <kdebisschop@users.sourceforge.net>
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <egalstad@nagios.org>
#  - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
#              used to provide network services status.
#

mkdir -p -m 775 /dev/shm/nagios
mkdir -p -m 775 /dev/shm/nagios/tmp
mkdir -p -m 775 /dev/shm/nagios/spool
mkdir -p -m 775 /dev/shm/nagios/spool/checkresults
chown -R nagios.nagios /dev/shm/nagios

status_nagios ()
{

	if test -x $NagiosCGI/daemonchk.cgi; then
		if $NagiosCGI/daemonchk.cgi -l $NagiosRunFile; then
		        return 0
		else
			return 1
		fi
	else
		if ps -p $NagiosPID > /dev/null 2>&1; then
		        return 0
		else
			return 1
		fi
	fi

	return 1
}


printstatus_nagios()
{

	if status_nagios $1 $2; then
		echo "nagios (pid $NagiosPID) is running..."
	else
		echo "nagios is not running"
	fi
}


killproc_nagios ()
{

	kill $2 $NagiosPID

}


pid_nagios ()
{

	if test ! -f $NagiosRunFile; then
		echo "No lock file found in $NagiosRunFile"
		exit 1
	fi

	NagiosPID=`head -n 1 $NagiosRunFile`
}


# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
	. /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
	. /etc/init.d/functions
fi

prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=${prefix}/var/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagios
          

# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found.  Exiting."
    exit 1
fi

# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found.  Exiting."
    exit 1
fi
          
# See how we were called.
case "$1" in

	start)
		echo -n "Starting nagios:"
		$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
		if [ $? -eq 0 ]; then
			su - $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
			rm -f $NagiosCommandFile
			touch $NagiosRunFile
			chown $NagiosUser:$NagiosGroup $NagiosRunFile
			$NagiosBin -d $NagiosCfgFile
			if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
			echo " done."
			exit 0
		else
			echo "CONFIG ERROR!  Start aborted.  Check your Nagios configuration."
			exit 1
		fi
		;;

	stop)
		echo -n "Stopping nagios: "

		pid_nagios
		killproc_nagios nagios

 		# now we have to wait for nagios to exit and remove its
 		# own NagiosRunFile, otherwise a following "start" could
 		# happen, and then the exiting nagios will remove the
 		# new NagiosRunFile, allowing multiple nagios daemons
 		# to (sooner or later) run - John Sellens
		#echo -n 'Waiting for nagios to exit .'
 		for i in 1 2 3 4 5 6 7 8 9 10 ; do
 		    if status_nagios > /dev/null; then
 			echo -n '.'
 			sleep 1
 		    else
 			break
 		    fi
 		done
 		if status_nagios > /dev/null; then
 		    echo ''
 		    echo 'Warning - nagios did not exit in a timely manner'
 		else
 		    echo 'done.'
 		fi

		rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
		;;

	status)
		pid_nagios
		printstatus_nagios nagios
		;;

	checkconfig)
		printf "Running configuration check..."
		$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
		if [ $? -eq 0 ]; then
			echo " OK."
		else
			echo " CONFIG ERROR!  Check your Nagios configuration."
			exit 1
		fi
		;;

	restart)
		printf "Running configuration check..."
		$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
		if [ $? -eq 0 ]; then
			echo "done."
			$0 stop
			$0 start
		else
			echo " CONFIG ERROR!  Restart aborted.  Check your Nagios configuration."
			exit 1
		fi
		;;

	reload|force-reload)
		printf "Running configuration check..."
		$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
		if [ $? -eq 0 ]; then
			echo "done."
			if test ! -f $NagiosRunFile; then
				$0 start
			else
				pid_nagios
				if status_nagios > /dev/null; then
					printf "Reloading nagios configuration..."
					killproc_nagios nagios -HUP
					echo "done"
				else
					$0 stop
					$0 start
				fi
			fi
		else
			echo " CONFIG ERROR!  Reload aborted.  Check your Nagios configuration."
			exit 1
		fi
		;;

	*)
		echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}"
		exit 1
		;;

esac
  
# End of this script
dwhitfield

Re: Nagios suddenly refuses to start

Post by dwhitfield »

I'm still digging. Can you post the output of sh -x /etc/init.d/nagios start?
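
Since sh -x writes its trace to stderr, it can be easier to capture both streams in one file and post that. A minimal sketch; trace_to_file and the /tmp path in the example call are made up for illustration.

```shell
# Hypothetical helper: run a script under "sh -x" and save everything it prints.
# Usage: trace_to_file OUTPUT_FILE SCRIPT [ARGS...]
trace_to_file() {
    out=$1
    shift
    # Normal output (stdout) and the -x trace (stderr) both go to the file.
    sh -x "$@" > "$out" 2>&1
}

# Example call (output path is an assumption):
# trace_to_file /tmp/nagios-start.trace /etc/init.d/nagios start
```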
Mortus

Re: Nagios suddenly refuses to start

Post by Mortus »

Code:

[root@devnagios etc]# sh -x /etc/init.d/nagios start
+ mkdir -p -m 775 /dev/shm/nagios
+ mkdir -p -m 775 /dev/shm/nagios/tmp
+ mkdir -p -m 775 /dev/shm/nagios/spool
+ mkdir -p -m 775 /dev/shm/nagios/spool/checkresults
+ chown -R nagios.nagios /dev/shm/nagios
+ '[' -f /etc/rc.d/init.d/functions ']'
+ . /etc/rc.d/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
++ . /etc/profile.d/lang.sh
+++ sourced=0
+++ '[' -z '' -a -n en_US.UTF-8 ']'
+++ sourced=1
+++ '[' -n '' ']'
+++ '[' 1 = 1 ']'
+++ '[' -n en_US.UTF-8 ']'
+++ export LANG
+++ '[' -n '' ']'
+++ unset LC_ADDRESS
+++ '[' -n '' ']'
+++ unset LC_CTYPE
+++ '[' -n '' ']'
+++ unset LC_COLLATE
+++ '[' -n '' ']'
+++ unset LC_IDENTIFICATION
+++ '[' -n '' ']'
+++ unset LC_MEASUREMENT
+++ '[' -n '' ']'
+++ unset LC_MESSAGES
+++ '[' -n '' ']'
+++ unset LC_MONETARY
+++ '[' -n '' ']'
+++ unset LC_NAME
+++ '[' -n '' ']'
+++ unset LC_NUMERIC
+++ '[' -n '' ']'
+++ unset LC_PAPER
+++ '[' -n '' ']'
+++ unset LC_TELEPHONE
+++ '[' -n '' ']'
+++ unset LC_TIME
+++ '[' -n '' ']'
+++ unset LC_ALL
+++ '[' -n '' ']'
+++ unset LANGUAGE
+++ '[' -n '' ']'
+++ unset LINGUAS
+++ '[' -n '' ']'
+++ unset _XKB_CHARSET
+++ consoletype=pty
+++ '[' -z pty ']'
+++ '[' -n '' ']'
+++ '[' -n '' ']'
+++ '[' -n en_US.UTF-8 ']'
+++ case $LANG in
+++ '[' xterm = linux ']'
+++ unset SYSFONTACM SYSFONT
+++ unset sourced
+++ unset langfile
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ GRAPHICAL=yes
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ LOGLEVEL=3
+++ PROMPT=yes
+++ AUTOSWAP=no
++ '[' pty = serial ']'
++ '[' color '!=' verbose ']'
++ INITLOG_ARGS=-q
++ __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
+ prefix=/usr/local/nagios
+ exec_prefix=/usr/local/nagios
+ NagiosBin=/usr/local/nagios/bin/nagios
+ NagiosCfgFile=/usr/local/nagios/etc/nagios.cfg
+ NagiosStatusFile=/usr/local/nagios/var/status.dat
+ NagiosRetentionFile=/usr/local/nagios/var/retention.dat
+ NagiosCommandFile=/usr/local/nagios/var/rw/nagios.cmd
+ NagiosVarDir=/usr/local/nagios/var
+ NagiosRunFile=/usr/local/nagios/var/nagios.lock
+ NagiosLockDir=/var/lock/subsys
+ NagiosLockFile=nagios
+ NagiosCGIDir=/usr/local/nagios/sbin
+ NagiosUser=nagios
+ NagiosGroup=nagios
+ '[' '!' -f /usr/local/nagios/bin/nagios ']'
+ '[' '!' -f /usr/local/nagios/etc/nagios.cfg ']'
+ case "$1" in
+ echo -n 'Starting nagios:'
Starting nagios:+ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
+ '[' 0 -eq 0 ']'
+ su - nagios -c 'touch /usr/local/nagios/var/nagios.log /usr/local/nagios/var/retention.dat'
+ rm -f /usr/local/nagios/var/rw/nagios.cmd
+ touch /usr/local/nagios/var/nagios.lock
+ chown nagios:nagios /usr/local/nagios/var/nagios.lock
+ /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
+ '[' -d /var/lock/subsys ']'
+ touch /var/lock/subsys/nagios
+ echo ' done.'
 done.
+ exit 0
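
Note that the start branch of the init script prints " done." and exits 0 without checking that the daemon is still alive after nagios -d returns, so the trace above does not by itself show that Nagios stayed up. A rough way to check afterwards is to test the PID recorded in the lock file. This is a sketch, not part of the init script: pid_is_running is a hypothetical helper, and it assumes it is run as root (kill -0 fails for other users' processes otherwise).

```shell
# Hypothetical helper: succeed only if the PID on the first line of the
# given lock file belongs to a live process.
pid_is_running() {
    lockfile=$1
    # The init script touches the lock file before starting the daemon,
    # so an empty or missing file means no PID was ever written.
    [ -s "$lockfile" ] || return 1
    pid=$(head -n 1 "$lockfile")
    # Reject anything that is not a plain number.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only tests whether the process exists.
    kill -0 "$pid" 2>/dev/null
}

# Example call, using the lock file path from the trace:
# pid_is_running /usr/local/nagios/var/nagios.lock && echo running || echo "not running"
```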
dwhitfield

Re: Nagios suddenly refuses to start

Post by dwhitfield »

What's the output of ls -la /etc/sysconfig/init?

Also, in that /etc/sysconfig/init file, if you have a line that says GRAPHICAL=yes, change it to GRAPHICAL=no.

Whether this line exists is going to depend on a number of things. It shows up as yes on my CentOS 5 install without issue, but changing it should help us bypass the gtk issue.
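
If you'd rather script the GRAPHICAL change than edit by hand, something along these lines keeps a backup first. It is a sketch only: set_graphical_no is a hypothetical helper and the .bak suffix is an arbitrary choice.

```shell
# Hypothetical helper: back up a config file, then flip GRAPHICAL=yes to no.
# Usage: set_graphical_no FILE
set_graphical_no() {
    file=$1
    # Keep a copy with the original timestamps so the change is easy to undo.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup; lines that do not start with GRAPHICAL=yes pass through unchanged.
    sed 's/^GRAPHICAL=yes/GRAPHICAL=no/' "$file.bak" > "$file"
}

# Example call, using the path from this thread:
# set_graphical_no /etc/sysconfig/init
```

If the file has no GRAPHICAL=yes line, the function leaves its contents as they were.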
Mortus

Re: Nagios suddenly refuses to start

Post by Mortus »

-rw-r--r-- 1 root root 1068 Mar 19 2014 /etc/sysconfig/init

And I've changed GRAPHICAL to no.
dwhitfield

Re: Nagios suddenly refuses to start

Post by dwhitfield »

On both dev and prod, can you run rpm -qa nagios and post the output? Also, can you check that all the files we've asked for so far are the same on dev and prod?

We might be getting to the point where you just clone prod to make a new dev. How far you want to go before that is up to you, though.
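
For the dev/prod comparison, one low-effort approach is to checksum the same list of files on each host and diff the two listings. A sketch under assumptions: checksum_files is a hypothetical helper, and the output file names in the example are placeholders.

```shell
# Hypothetical helper: print a checksum line per file, or MISSING if unreadable.
# Usage: checksum_files FILE...
checksum_files() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # cksum prints: checksum, size in bytes, file name.
            cksum "$f"
        else
            echo "MISSING $f"
        fi
    done
}

# Example: run on each host, copy one listing across, then diff them
# (the /tmp file names are placeholders):
# checksum_files /etc/rc.d/init.d/nagios /root/.bashrc /etc/bashrc \
#     /etc/sysconfig/init > /tmp/files.dev
# diff /tmp/files.dev /tmp/files.prod
```

Any line that differs between the two listings points at a file worth comparing in full.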