Restarting Nagios

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Restarting Nagios

Post by bosecorp »

Whenever we apply a new config, the Nagios service gets restarted.

But we have seen in the past that when the nagios service is restarted, it does not always restart cleanly.
This leaves extra nagios processes running, and the Nagios daemon eventually crashes.

In the init script /etc/init.d/nagios, the function that kills nagios is as follows:

killproc_nagios ()
{
        kill -s "$1" $NagiosPID
}

Ideally we should see only two nagios processes running (plus the grep itself) when we run the command below:

# ps -ef | grep nagios.cfg
root 4367 3975 0 10:24 pts/0 00:00:00 grep nagios.cfg
nagios 18181 1 35 08:50 ? 00:33:25 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 18351 18181 0 08:50 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

We tried updating the init script to use "killall -9 nagios", but it did not help.

So whenever the nagios service is restarted, we need to make sure that all the PIDs shown above (18181, 18351) are killed first, and only then should the start script kick in.

Nagios version : 5.2.7
bwallace
Posts: 1145
Joined: Tue Nov 17, 2015 1:57 pm

Re: Restarting Nagios

Post by bwallace »

What OS and version is your Nagios server running?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Restarting Nagios

Post by ssax »

Note: If you kill the nagios process while it's in the middle of some DB manipulation, you may end up with crashed tables, so keep an eye on it.

The killall -9 should work:

for i in 1 2 3 4 5 6 7 8 9 10 ; do
        if status_nagios > /dev/null; then
                echo -n '.'
                sleep 1
        else
                break
        fi
done
if status_nagios > /dev/null; then
        echo ''
        echo 'Warning - nagios did not exit in a timely manner, killing.'
        killall -9q nagios
else
        echo ' done.'
fi

I would increase the timeout, though, just in case:

for i in {1..30} ; do
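
As a rough sketch of the same idea, the wait-then-kill logic could be pulled into a small helper with a configurable timeout. The name wait_for_exit is hypothetical and not part of the stock init script; status_nagios and the killall fallback are the ones from the snippet above. A plain counter loop is used here because {1..30} is a bash brace expansion and is not expanded by every /bin/sh.

```shell
# Hypothetical helper (not in the stock init script).
# Usage: wait_for_exit SECONDS STATUS_COMMAND [ARGS...]
# Polls STATUS_COMMAND once per second. Returns 0 as soon as the status
# command fails (meaning the service is gone), or 1 if it is still
# succeeding after SECONDS polls.
wait_for_exit () {
        wfe_timeout="$1"
        shift
        wfe_i=0
        while [ "$wfe_i" -lt "$wfe_timeout" ]; do
                # A failing status command means the process has exited.
                "$@" > /dev/null 2>&1 || return 0
                sleep 1
                wfe_i=$((wfe_i + 1))
        done
        # Still running after the timeout.
        return 1
}
```

In the stop branch of the init script it might then be called as: wait_for_exit 30 status_nagios || killall -9q nagios. That keeps the hard kill as a last resort, which matters given the note above about crashed tables.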