Pre/Post Install problems with nagios Core on RHEL 5.8

gadikota
Posts: 9
Joined: Sun Apr 15, 2012 1:30 am

Post by gadikota »

Team,

I have installed Nagios on four previous occasions, and every one of those installs succeeded with no issues. The current install, however, is giving me a lot of trouble.

I am using RHEL 5.8 with Nagios Core 3.5 and Nagios Plugins 1.4.16, which are the current stable versions.
bash-3.2# uname -a
Linux nagiosl851 2.6.18-308.16.1.el5 #1 SMP Tue Sep 18 07:21:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
bash-3.2# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
bash-3.2#
Pre Install Issues:
As per the installation steps, the first step was to add a user called nagios and a group called nagcmd. When I ran "useradd nagios", it reported that the user already exists, so I immediately checked /etc/passwd for the user, but grep returned nothing. Rather than keep retrying and failing, I used dpnagios and dpnagcmd as the user and group names instead.
bash-3.2# useradd nagios
useradd: user nagios exists
bash-3.2# cat /etc/passwd | grep nagios
bash-3.2#
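One possible explanation, not confirmed anywhere in this thread: `useradd` resolves names through the system's name service, so an account served from NIS or LDAP would make it report "exists" even though `/etc/passwd` has no matching line. A minimal sketch to tell the cases apart, where `account_source` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: say where an account is defined.
#   local     -> a line in /etc/passwd
#   directory -> only visible through the name service (NIS, LDAP, ...)
#   absent    -> not known to the system at all
account_source() {
    user="$1"
    if grep -q "^${user}:" /etc/passwd 2>/dev/null; then
        echo "local"
    elif getent passwd "$user" >/dev/null 2>&1; then
        echo "directory"
    else
        echo "absent"
    fi
}

account_source nagios
```

If this prints "directory", the nagios account already exists outside the local files, which would matter later if any part of the install still refers to it.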
New user and group details
bash-3.2# cat /etc/passwd | grep dpnagios
dpnagios:x:36135:36135::/home/dpnagios:/bin/bash
bash-3.2# cat /etc/group | grep nag
dpnagios:x:36135:dpnagios,apache
dpnagcmd:x:36136:dpnagios,apache
bash-3.2#
I then followed the remaining steps to install Core, passing the new user and group names to the configure command. Everything went smoothly, and the installation of both Core and Plugins completed. However, I noticed that everything under /usr/local/nagios was owned by user nagios and group nagios, so I ran the two commands below to fix the ownership.
chown -Rv dpnagios nagios
chgrp -Rv dpnagios nagios
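Two things to keep in mind with these commands: the path `nagios` is relative, so they only do what is intended when run from `/usr/local`, and in a default install the recursive `chgrp` also replaces the command group on `var/rw`. A hedged sketch for listing anything a recursive change missed; `list_foreign` is a hypothetical helper and the path assumes the default prefix:

```shell
#!/bin/sh
# Hypothetical helper: print every path under a tree that is NOT owned
# by the given user (name or numeric ID).
list_foreign() {
    owner="$1"
    tree="$2"
    find "$tree" ! -user "$owner" -print
}

# Assumes the default install prefix. Prints nothing when every file
# under the tree is owned by dpnagios.
if [ -d /usr/local/nagios ]; then
    list_foreign dpnagios /usr/local/nagios \
        || echo "scan failed (does the dpnagios account resolve?)"
fi
```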
Post Install Issues:

The site is now up. I verified the nagios.cfg file, and it came back with 0 errors and 0 warnings.

I then started nagios and httpd, and both reported done.

But when I log in to the site, I get an error on every page I try to view. The error page is attached to this thread.
Nagios Web Page error.
When I check whether nagios is running, it reports "not running". I am not sure what the issue is, and any help is much appreciated. Let me know if I left out anything you need in order to help, or if I did something wrong.

cfg file check:
bash-3.2# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 3.5.0
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 03-15-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'...
Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
Checked 8 services.
Checking hosts...
Checked 1 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 24 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
bash-3.2#
Nagios Start:
bash-3.2# /etc/init.d/nagios start
Starting nagios:su: warning: cannot change directory to /home/dpnagios: No such file or directory
done.
bash-3.2# /etc/init.d/nagios status
nagios is not running
bash-3.2#
I repeated the same steps on CentOS 5.8, using nagios and nagcmd as the user and group instead of dpnagios and dpnagcmd, and everything worked without a single issue. Any help with the issues I am seeing is much appreciated.

Thanks
Balu Gadikota
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

gadikota wrote:bash-3.2# /etc/init.d/nagios start
Starting nagios:su: warning: cannot change directory to /home/dpnagios: No such file or directory
This looks like you have an error in your /etc/init.d/nagios script

Can you post it?
gadikota
Posts: 9
Joined: Sun Apr 15, 2012 1:30 am

Post by gadikota »

Scott,

Thank you for your reply.

Home directories in my environment are all on NAS, so I created a local home directory and tried to start nagios again. The results are below.

bash-3.2# cat /etc/passwd | grep nag
dpnagios:x:36135:36135::/dpstore/localhome/dpnagios:/bin/bash
bash-3.2# ls -l /dpstore/localhome/
total 8
drwxr-xr-x 2 root amer 4096 Apr 19 16:21 dpnagios
drwxr-xr-x 2 root amer 4096 Apr 18 00:56 swatcher
bash-3.2# /etc/init.d/nagios start
Starting nagios: done.
bash-3.2# ps -ef | grep nagios
root     19338 17708  0 16:25 pts/1    00:00:00 grep nagios
bash-3.2#
Below is the nagios startup script. The only lines I changed in the script are these two:
NagiosUser=dpnagios
NagiosGroup=dpnagios
Start up script:

bash-3.2# cat /etc/init.d/nagios
#!/bin/sh
#
# chkconfig: 345 99 01
# description: Nagios network monitor
#
# File : nagios
#
# Author : Jorge Sanchez Aymar ([email protected])
#
# Changelog :
#
# 1999-07-09 Karl DeBisschop <[email protected]>
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad <[email protected]>
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <[email protected]>
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <[email protected]>
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <[email protected]>
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <[email protected]>
#  - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
#              used to provide network services status.
#

# Load any extra environment variables for Nagios and its plugins
if test -f /etc/sysconfig/nagios; then
        . /etc/sysconfig/nagios
fi

status_nagios ()
{

        if test -x $NagiosCGI/daemonchk.cgi; then
                if $NagiosCGI/daemonchk.cgi -l $NagiosRunFile; then
                        return 0
                else
                        return 1
                fi
        else
                if ps -p $NagiosPID > /dev/null 2>&1; then
                        return 0
                else
                        return 1
                fi
        fi

        return 1
}


printstatus_nagios()
{

        if status_nagios $1 $2; then
                echo "nagios (pid $NagiosPID) is running..."
        else
                echo "nagios is not running"
        fi
}


killproc_nagios ()
{

        kill $2 $NagiosPID

}


pid_nagios ()
{

        if test ! -f $NagiosRunFile; then
                echo "No lock file found in $NagiosRunFile"
                exit 1
        fi

        NagiosPID=`head -n 1 $NagiosRunFile`
}


# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
        . /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
        . /etc/init.d/functions
fi

prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=${prefix}/var/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=dpnagios
NagiosGroup=dpnagios


# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found.  Exiting."
    exit 1
fi

# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found.  Exiting."
    exit 1
fi

# See how we were called.
case "$1" in

        start)
                echo -n "Starting nagios:"
                $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
                if [ $? -eq 0 ]; then
                        su - $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
                        rm -f $NagiosCommandFile
                        touch $NagiosRunFile
                        chown $NagiosUser:$NagiosGroup $NagiosRunFile
                        $NagiosBin -d $NagiosCfgFile
                        if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
                        echo " done."
                        exit 0
                else
                        echo "CONFIG ERROR!  Start aborted.  Check your Nagios configuration."
                        exit 1
                fi
                ;;

        stop)
                echo -n "Stopping nagios: "

                pid_nagios
                killproc_nagios nagios

                # now we have to wait for nagios to exit and remove its
                # own NagiosRunFile, otherwise a following "start" could
                # happen, and then the exiting nagios will remove the
                # new NagiosRunFile, allowing multiple nagios daemons
                # to (sooner or later) run - John Sellens
                #echo -n 'Waiting for nagios to exit .'
                for i in 1 2 3 4 5 6 7 8 9 10 ; do
                    if status_nagios > /dev/null; then
                        echo -n '.'
                        sleep 1
                    else
                        break
                    fi
                done
                if status_nagios > /dev/null; then
                    echo ''
                    echo 'Warning - nagios did not exit in a timely manner'
                else
                    echo 'done.'
                fi

                rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
                ;;

        status)
                pid_nagios
                printstatus_nagios nagios
                ;;

        checkconfig)
                printf "Running configuration check..."
                $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
                if [ $? -eq 0 ]; then
                        echo " OK."
                else
                        echo " CONFIG ERROR!  Check your Nagios configuration."
                        exit 1
                fi
                ;;

        restart)
                printf "Running configuration check..."
                $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
                if [ $? -eq 0 ]; then
                        echo "done."
                        $0 stop
                        $0 start
                else
                        echo " CONFIG ERROR!  Restart aborted.  Check your Nagios configuration."
                        exit 1
                fi
                ;;

        reload|force-reload)
                printf "Running configuration check..."
                $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
                if [ $? -eq 0 ]; then
                        echo "done."
                        if test ! -f $NagiosRunFile; then
                                $0 start
                        else
                                pid_nagios
                                if status_nagios > /dev/null; then
                                        printf "Reloading nagios configuration..."
                                        killproc_nagios nagios -HUP
                                        echo "done"
                                else
                                        $0 stop
                                        $0 start
                                fi
                        fi
                else
                        echo " CONFIG ERROR!  Reload aborted.  Check your Nagios configuration."
                        exit 1
                fi
                ;;

        *)
                echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}"
                exit 1
                ;;

esac

# End of this script
bash-3.2#
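One thing worth noting in the `start)` branch above: `echo " done."` runs regardless of what `$NagiosBin -d` did, and the script has already created the lock file with `touch`, so "done." does not prove the daemon survived. A minimal sketch of a follow-up check, assuming the lock file holds the daemon PID on its first line (as `pid_nagios` expects); `daemon_alive` is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the lock file names a live process.
# Run as root; for other users, kill -0 also fails on processes they
# do not own.
daemon_alive() {
    lockfile="$1"
    [ -s "$lockfile" ] || return 1          # missing or empty file
    pid=$(head -n 1 "$lockfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;            # not a plain number
    esac
    kill -0 "$pid" 2>/dev/null              # signal 0 only probes the PID
}

if daemon_alive /usr/local/nagios/var/nagios.lock; then
    echo "nagios is running"
else
    echo "nagios is not running; check /var/log/messages"
fi
```

An empty lock file left behind by `touch` is reported as "not running" here, which matches the `status` output seen earlier in the thread.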
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Post by abrist »

It looks like it is starting now that you have created the home directory. Is everything working now, or are the CGIs still broken?
gadikota
Posts: 9
Joined: Sun Apr 15, 2012 1:30 am

Post by gadikota »

It's still not working. I get the same error message when I go to the site.

As shown below, the service still doesn't show as running after I changed the home directory and tried to start it. I was wondering whether anyone has successfully installed Nagios with a different username and group.

bash-3.2# cat /etc/passwd | grep nag
dpnagios:x:36135:36135::/dpstore/localhome/dpnagios:/bin/bash
bash-3.2# ls -l /dpstore/localhome/
total 8
drwxr-xr-x 2 root amer 4096 Apr 19 16:21 dpnagios
drwxr-xr-x 2 root amer 4096 Apr 18 00:56 swatcher
bash-3.2# /etc/init.d/nagios start
Starting nagios: done.
bash-3.2# ps -ef | grep nagios
root     19338 17708  0 16:25 pts/1    00:00:00 grep nagios
bash-3.2#
I am now in a desperate situation and need to get this fixed soon.

Thanks in advance for the help.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Post by abrist »

gadikota wrote:I was wondering if anyone has tried to install Nagios before with different username and group and be successful.
Some have, though I hear it was not the most pleasant experience, as there are quite a number of places where permissions need to be fixed.
After you start the nagios process, what do the logs look like?

tail -25 /usr/local/nagios/var/nagios.log
tail -25 /var/log/messages
gadikota
Posts: 9
Joined: Sun Apr 15, 2012 1:30 am

Post by gadikota »

I restarted the process, and /var/log/messages seems to indicate that the permissions on objects.cache are wrong. Please see the log below.

bash-3.2# /etc/init.d/nagios start
Starting nagios: done.
bash-3.2# ps -ef | grep nagios
root     29370 29326  0 18:04 pts/1    00:00:00 grep nagios
bash-3.2# tail -24 /usr/local/nagios/var/nagios.log
[1366261924] SERVICE ALERT: localhost;HTTP;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366261954] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366261954] HOST ALERT: localhost;DOWN;SOFT;3;(Return code of 127 is out of bounds - plugin may be missing)
[1366261964] Warning: Return code of 127 for check of service 'PING' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366261964] SERVICE ALERT: localhost;PING;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366261994] Warning: Return code of 127 for check of service 'Root Partition' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366261994] SERVICE ALERT: localhost;Root Partition;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366262024] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262024] HOST ALERT: localhost;DOWN;SOFT;4;(Return code of 127 is out of bounds - plugin may be missing)
[1366262034] Warning: Return code of 127 for check of service 'SSH' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262034] SERVICE ALERT: localhost;SSH;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366262074] Warning: Return code of 127 for check of service 'Swap Usage' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262074] SERVICE ALERT: localhost;Swap Usage;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366262094] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262094] HOST ALERT: localhost;DOWN;SOFT;5;(Return code of 127 is out of bounds - plugin may be missing)
[1366262114] Warning: Return code of 127 for check of service 'Total Processes' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262114] SERVICE ALERT: localhost;Total Processes;CRITICAL;HARD;1;(Return code of 127 is out of bounds - plugin may be missing)
[1366262144] Warning: Return code of 127 for check of service 'Current Load' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262164] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262164] HOST ALERT: localhost;DOWN;SOFT;6;(Return code of 127 is out of bounds - plugin may be missing)
[1366262184] Warning: Return code of 127 for check of service 'Current Users' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262224] Warning: Return code of 127 for check of service 'HTTP' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262234] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366262234] HOST ALERT: localhost;DOWN;SOFT;7;(Return code of 127 is out of bounds - plugin may be missing)
bash-3.2# tail -25 /var/log/messages
Apr 22 18:04:43 nagiosl851 nagios: Nagios 3.5.0 starting... (PID=29364)
Apr 22 18:04:43 nagiosl851 nagios: Local time is Mon Apr 22 18:04:43 EDT 2013
Apr 22 18:04:43 nagiosl851 nagios: LOG VERSION: 2.0
Apr 22 18:04:43 nagiosl851 nagios: Warning: Could not open object cache file '/usr/local/nagios/var/objects.cache' for writing!
Apr 22 18:04:43 nagiosl851 nagios: Failed to obtain lock on file /usr/local/nagios/var/nagios.lock: Permission denied
Apr 22 18:04:43 nagiosl851 nagios: Bailing out due to errors encountered while attempting to daemonize... (PID=29364)
bash-3.2#
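The bracketed numbers in `nagios.log` are Unix epoch seconds, so it is worth checking whether the plugin warnings come from the same run as the failed start in `/var/log/messages`. A small sketch that rewrites them as readable UTC times, assuming GNU `date` (for `-d @N`); the function names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: convert epoch seconds to a UTC timestamp.
# Assumes GNU date, which accepts "-d @SECONDS".
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Hypothetical helper: read log lines on stdin and replace a leading
# "[epoch]" with "[readable time]"; other lines pass through unchanged.
stamp_log() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                ts=${line#\[}          # drop the leading "["
                ts=${ts%%\]*}          # keep only the digits before "]"
                printf '[%s]%s\n' "$(epoch_to_utc "$ts")" "${line#*\]}"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Path as used in this thread; skipped if the log is not readable.
if [ -r /usr/local/nagios/var/nagios.log ]; then
    tail -n 5 /usr/local/nagios/var/nagios.log | stamp_log
fi
```

For the first entry in the log above, 1366261924 converts to 2013-04-18 05:12:04 UTC, several days before the Apr 22 start attempt, so the "Return code of 127" lines may predate the current problem rather than describe it.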
Permissions for objects.cache are as below:

bash-3.2# ls -l /usr/local/nagios/var/objects.cache
-rw-r--r-- 1 dpnagios dpnagios 12935 Apr 18 01:10 /usr/local/nagios/var/objects.cache
bash-3.2#
Should it have different permissions?
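The file mode itself looks fine for a daemon running as dpnagios, so one thing to rule out is which account the daemon actually switches to. Nagios drops privileges to the `nagios_user` and `nagios_group` named in `nagios.cfg`, and that setting is separate from the `NagiosUser` variable in the init script. A hedged sketch that compares the two, using the paths from this thread; `cfg_value` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: print the last value assigned to a key in a
# key=value style file (works for nagios.cfg and for the init script).
cfg_value() {
    key="$1"
    file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

cfg=/usr/local/nagios/etc/nagios.cfg
init=/etc/init.d/nagios
if [ -r "$cfg" ] && [ -r "$init" ]; then
    daemon_user=$(cfg_value nagios_user "$cfg")
    script_user=$(cfg_value NagiosUser "$init")
    if [ "$daemon_user" = "$script_user" ]; then
        echo "both use $daemon_user"
    else
        echo "mismatch: nagios.cfg=$daemon_user init script=$script_user"
    fi
fi
```

If `nagios.cfg` still names nagios, the daemon would be trying to lock a file owned by dpnagios, which would be consistent with the "Permission denied" line in `/var/log/messages`. This is a guess to verify, not a confirmed diagnosis.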
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

That looks correct, but the problem could be further up the tree.

ls -ld /usr/local/nagios/var
ls -ld /usr/local/nagios
ls -ld /usr/local
Also, it looks like you don't have the plugins installed, or they are not in the correct directory.
gadikota
Posts: 9
Joined: Sun Apr 15, 2012 1:30 am

Post by gadikota »

Please see the requested data.

bash-3.2# ls -ld /usr/local/nagios/var
drwxrwxr-x 5 dpnagios dpnagios 4096 Apr 18 01:41 /usr/local/nagios/var
bash-3.2# ls -ld /usr/local/nagios
drwxrwxr-x 9 dpnagios dpnagios 4096 Apr 18 01:27 /usr/local/nagios
bash-3.2# ls -ld /usr/local
drwxr-xr-x 15 root root 4096 Apr 18 01:08 /usr/local
bash-3.2#
Also, I am not sure how to show that the Nagios plugins are installed, other than showing that they are present in /usr/local/nagios/libexec. If there is a way to verify that they are installed, please let me know and I will share the output.
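A return code of 127 is what the shell reports when it cannot find the command, so listing the directory is only half the check: the files also need to be executable, and the `$USER1$` macro in `resource.cfg` needs to point at that directory. A rough sketch of the first half; `check_plugins` is a hypothetical helper, and the plugin names are assumed from the services that appear in the log:

```shell
#!/bin/sh
# Hypothetical helper: report plugins that are missing or not executable.
# Prints one line per problem and returns non-zero if any were found.
check_plugins() {
    dir="$1"
    shift
    bad=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            bad=1
        elif [ ! -x "$dir/$name" ]; then
            echo "not executable: $dir/$name"
            bad=1
        fi
    done
    return "$bad"
}

# Plugin names are an assumption based on the default localhost services.
if check_plugins /usr/local/nagios/libexec \
        check_ping check_http check_ssh check_disk check_swap \
        check_procs check_load check_users; then
    echo "all listed plugins are present and executable"
else
    echo "fix the entries listed above"
fi
```

Running one plugin by hand as the account the daemon uses, and looking at its exit status, would be a reasonable next step if this reports nothing wrong.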
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Post by abrist »

Do you have permissions for the lock file?

ls -la /usr/local/nagios/var/nagios.lock