After upgrade from 5.3.4 to 5.4.0, Database Backend error

SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by SteveBeauchemin »

I have seen this while playing 'what if' with the ndoutils upgrade. If you truly have a cosmetic-only GUI issue, as I did, then this is what I found.

In the Nagios XI GUI the "XI System Component Status" applet for the Database Backend will stay red until you make the lock file name match the new setup - the suffix was .lock and is now .pid

Code: Select all

sed -i 's/ndo2db.lock/ndo2db.pid/' /etc/rc.d/init.d/ndo2db
It follows that the configuration file should match the /etc/init.d file.
The file /usr/local/nagios/etc/ndo2db.cfg used to have

Code: Select all

lock_file=/usr/local/nagios/var/ndo2db.lock
The lock file used to end in .lock, but now...

Code: Select all

lock_file=/usr/local/nagios/var/ndo2db.pid
They need to match - pid everywhere - or lock everywhere. Pick one and make both files consistent with each other.
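
If it helps, here is a minimal sketch for checking that the two files agree. The helper names (cfg_lock_file, init_run_file, lock_names_match) are made up for illustration, and it assumes the init script sets a variable called Ndo2dbRunFile, which may not hold for every ndoutils version.

```shell
# Hypothetical helper: print the lock_file value from an ndo2db.cfg-style file.
cfg_lock_file() {
    sed -n 's/^[[:space:]]*lock_file[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Hypothetical helper: print the (unexpanded) Ndo2dbRunFile value from the init script.
# Assumption: the init script defines Ndo2dbRunFile on a line of its own.
init_run_file() {
    sed -n 's/^[[:space:]]*Ndo2dbRunFile=//p' "$1" | tail -n 1
}

# Succeeds only when both files name the same lock/pid file (basename compared).
# $1 = path to ndo2db.cfg, $2 = path to the init script
lock_names_match() {
    cfg_name=$(basename "$(cfg_lock_file "$1")")
    init_name=$(basename "$(init_run_file "$2")")
    [ -n "$cfg_name" ] && [ "$cfg_name" = "$init_name" ]
}
```

On a default install the call would look something like: lock_names_match /usr/local/nagios/etc/ndo2db.cfg /etc/rc.d/init.d/ndo2db && echo consistent || echo mismatch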

This is a subtle issue I chased around for a bit. I prefer Green to Red in my GUI.

Just something I noticed before... It may not be your issue.

Thanks

Steve B
XI 5.7.3 / Core 4.4.6 / NagVis 1.9.8 / LiveStatus 1.5.0p11 / RRDCached 1.7.0 / Redis 3.2.8 /
SNMPTT / Gearman 0.33-7 / Mod_Gearman 3.0.7 / NLS 2.0.8 / NNA 2.3.1 /
NSClient 0.5.0 / NRPE Solaris 3.2.1 Linux 3.2.1 HPUX 3.2.1
jwelch
Posts: 225
Joined: Wed Sep 05, 2012 12:49 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by jwelch »

Thanks for the info. Unfortunately, my /usr/local/nagios/etc/ndo2db.cfg contains:

lock_file=/usr/local/nagios/var/ndo2db.lock

Which matches /usr/local/nagios/var/ndo2db.lock, which has the correct pid number:
# cat ndo2db.lock
8400

# ps -eaf | grep 8400
nagios 8400 1 0 04:29 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 8476 8400 0 04:29 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg

I'll poke around the filesystem when I get in and see if I can figure out how XI determines the backend database status.
Probably buried in php or javascript somewhere...

Just fyi, that ndo2db.cfg file hasn't changed in a long time:
-rw-rw-r-- 1 apache nagios 2229 Aug 15 2012 ndo2db.cfg
so I'm guessing it must be something in the XI 5.4.0 code that changed.
SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by SteveBeauchemin »

Did you look at /etc/rc.d/init.d/ndo2db to see whether it uses lock or pid?

I think the GUI uses the output of "/etc/rc.d/init.d/ndo2db status"

Steve B
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by dwhitfield »

@jwelch, was Steve's message useful for you? Any new information?
jwelch
Posts: 225
Joined: Wed Sep 05, 2012 12:49 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by jwelch »

same place...(symlink)

/etc# /etc/init.d/ndo2db status
ndo2db (pid 8400) is running...
/etc# /etc/rc.d/init.d/ndo2db status
ndo2db (pid 8400) is running...
#

To do any troubleshooting, I need to know how the XI GUI determines the database backend status. I got lost in PHP include files last night and never found anything useful.
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by avandemore »

Please see @jomann's earlier response:

https://support.nagios.com/forum/viewto ... 10#p207514

Code: Select all

/etc# /etc/init.d/ndo2db status
ndo2db (pid 8400) is running...
/etc# /etc/rc.d/init.d/ndo2db status
ndo2db (pid 8400) is running...
This doesn't tell us anything useful, because we need to know whether the nagios user can get the output through the sudo configuration that Nagios installs. Please try:

Code: Select all

# su - nagios
$ sudo /etc/init.d/ndo2db status
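
It can also help to capture the exit code along with the text, since it is not confirmed in this thread which of the two the GUI relies on. A small sketch; report_status is a hypothetical helper name, not part of Nagios XI:

```shell
# Run any command and print both its exit code and its combined output.
# report_status is an illustrative name, not something Nagios XI ships.
report_status() {
    if output=$("$@" 2>&1); then
        rc=0
    else
        rc=$?
    fi
    printf 'exit=%s output=%s\n' "$rc" "$output"
    return "$rc"
}
```

As the nagios user, that would be run as: report_status sudo /etc/init.d/ndo2db status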
Previous Nagios employee
jwelch
Posts: 225
Joined: Wed Sep 05, 2012 12:49 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by jwelch »

# su - nagios
nagios ~]$ /etc/init.d/ndo2db status
ndo2db (pid 8400) is running...
nagios ~]$ sudo /etc/init.d/ndo2db status
ndo2db (pid 8400) is running...
nagios ~]$
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by dwhitfield »

Could you verify your /etc/init.d/ndo2db looks like the following? (if you prefer, you can post your ndo2db and I can run a diff)

Code: Select all

#!/bin/sh
#
### BEGIN INIT INFO
# Provides:          ndo2db
# Required-Start:
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Nagios NDO2DB Initscript
# Description:       Nagios Data Out Daemon
### END INIT INFO

# chkconfig: 345 97 01
#
# File : ndo2db
#
# Author : Jorge Sanchez Aymar ([email protected])
#
# Changelog :
#
# 1999-07-09 Karl DeBisschop <[email protected]>
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad <[email protected]>
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch <[email protected]>
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad <[email protected]>
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop <[email protected]>
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad <[email protected]>
#  - Updated su syntax (Gary Miller)
# 2009-07-11 Hendrik Bäcker <[email protected]>
#  - Rewrite ndo2db init script, inspired by Sascha Runschke
#
#

status_ndo2db ()
{

        pid_ndo2db

        if ps -p $Ndo2dbPID > /dev/null 2>&1; then
                return 0
        else
                if test -f $Ndo2dbLockDir/$Ndo2dbLockFile; then
                        return 2
                else
                        return 1
                fi
        fi

        return 1
}

printstatus_ndo2db()
{
        if status_ndo2db $1 $2; then
                echo "$servicename (pid $Ndo2dbPID) is running..."
                exit 0
        elif test $? == 2; then
                echo "$servicename is not running but subsystem locked"
                exit 1
        else
                echo "$servicename is not running"
                exit 1
        fi
}


killproc_ndo2db ()
{

        kill $2 $Ndo2dbPID

}


pid_ndo2db ()
{

        if test ! -f $Ndo2dbRunFile; then
                return 1
        fi

        Ndo2dbPID=`head -n 1 $Ndo2dbRunFile 2> /dev/null`
        return 0
}


# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
        . /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
        . /etc/init.d/functions
fi

servicename=ndo2db
prefix=/usr/local/nagios
exec_prefix=/usr/local/nagios
Ndo2dbBin=/usr/local/nagios/bin/ndo2db
Ndo2dbCfgFile=/usr/local/nagios/etc/ndo2db.cfg
Ndo2dbVarDir=/usr/local/nagios/var
Ndo2dbRunFile=$Ndo2dbVarDir/ndo2db.lock
#Ndo2dbLockDir=/var/lock/subsys
Ndo2dbLockDir=/usr/local/nagiosxi/var/subsys
Ndo2dbLockFile=ndo2db
Ndo2dbUser=nagios
Ndo2dbGroup=nagios


# Check that ndo2db exists.
if [ ! -f $Ndo2dbBin ]; then
    echo "Executable file $Ndo2dbBin not found.  Exiting."
    exit 1
fi

# Check that ndo2db.cfg exists.
if [ ! -f $Ndo2dbCfgFile ]; then
    echo "Configuration file $Ndo2dbCfgFile not found.  Exiting."
    exit 1
fi

# See how we were called.
case "$1" in

        start)
                status_ndo2db
                if [ $? -eq 0 ]; then
                        echo "$servicename already started..."
                        exit 1
                fi
                echo -n "Starting $servicename:"

                rm -f $Ndo2dbLockDir/$Ndo2dbLockFile
                rm -f $Ndo2dbVarDir/ndo.sock

                touch $Ndo2dbRunFile
                chown $Ndo2dbUser:$Ndo2dbGroup $Ndo2dbRunFile
                $Ndo2dbBin -c $Ndo2dbCfgFile
                if [ -d $Ndo2dbLockDir ]; then
                    touch $Ndo2dbLockDir/$Ndo2dbLockFile;
                    chown $Ndo2dbUser:$Ndo2dbGroup $Ndo2dbLockDir/$Ndo2dbLockFile;
                fi
                echo " done."
                exit 0
                ;;

        stop)
                status_ndo2db
                if ! [ $? -eq 0 ]; then
                        echo "$servicename was not running... could not stop"
                        exit 1
                fi
                echo -n "Stopping $servicename: "

                pid_ndo2db
                killproc_ndo2db ndo2db

                # now we have to wait for ndo2db to exit and remove its
                # own Ndo2dbRunFile, otherwise a following "start" could
                # happen, and then the exiting ndo2db will remove the
                # new Ndo2dbRunFile, allowing multiple ndo2db daemons
                # to (sooner or later) run - John Sellens
                #echo -n 'Waiting for ndo2db to exit .'
                for i in 1 2 3 4 5 6 7 8 9 10 ; do
                    if status_ndo2db > /dev/null; then
                        echo -n '.'
                        sleep 1
                    else
                        break
                    fi
                done
                if status_ndo2db > /dev/null; then
                    echo ''
                    echo 'Warning - $servicename did not exit in a timely manner'
                else
                    echo 'done.'
                fi

                rm -f $Ndo2dbStatusFile $Ndo2dbRunFile $Ndo2dbLockDir/$Ndo2dbLockFile $Ndo2dbCommandFile
                ;;

        status)
                printstatus_ndo2db
                ;;

        restart)
                $0 stop
                $0 start
                ;;

        *)
                echo "Usage: $servicename {start|stop|restart|status}"
                exit 1
                ;;

esac

# End of this script
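
To see why the file names matter, here is a simplified restatement of the status logic in the script above. It is a sketch, not the shipped code: ndo2db_state is a hypothetical name, both paths are passed in as arguments, and it uses kill -0 instead of ps -p to test for the process (kill -0 only succeeds for processes the caller is allowed to signal, so run it as root or as the daemon's user).

```shell
# Simplified restatement of the init script's status decision.
# $1 = run (PID) file, $2 = subsystem lock file; both supplied by the caller.
# Swap noted above: kill -0 stands in for the script's "ps -p".
ndo2db_state() {
    pid=""
    if [ -f "$1" ]; then
        pid=$(head -n 1 "$1" 2>/dev/null) || pid=""
    fi
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    elif [ -f "$2" ]; then
        echo "not running but subsystem locked"
    else
        echo "not running"
    fi
}
```

If the init script reads ndo2db.lock while the daemon writes ndo2db.pid (or the reverse), the first argument points at a file that is missing or stale, and the result is "not running" even though ndo2db is up.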
jwelch
Posts: 225
Joined: Wed Sep 05, 2012 12:49 pm

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by jwelch »

Probably easier for you to pull it and do a diff. When I cut and pasted from your post, it included a bunch of whitespace.
See the attached /etc/init.d/ndo2db file (I had to rename it to ndo2db.txt so I could attach it).
ndo2db.txt
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: After upgrade from 5.3.4 to 5.4.0, Database Backend erro

Post by dwhitfield »

Yeah, a lot of whitespace differences, but I think that's it.
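
For the whitespace problem, diff can be told to ignore whitespace so that only real changes show up. A minimal sketch, assuming a diff that supports -w (GNU diffutils does); same_ignoring_whitespace is a hypothetical wrapper name:

```shell
# Succeeds when two files differ only in whitespace (or not at all).
# Assumes a diff implementation that supports -w (ignore all whitespace).
same_ignoring_whitespace() {
    diff -w "$1" "$2" >/dev/null 2>&1
}
```

Running diff -w directly on the two files prints only the lines that differ in something other than whitespace.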

What's your output of ll /usr/lib/systemd?

Also, this does appear to be only a cosmetic issue. We're happy to work through it so you aren't getting a false positive, but since it's just cosmetic, you'll need to decide how much time you want to spend on it.