
Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Wed Apr 11, 2018 11:00 am
by emartine
Profile sent.
This is RHEL 6.9

At this time I see these:

nagios 1926 1 0 10:46 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 2080 1926 0 10:46 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 11928 1 0 Apr10 ? 00:00:10 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12140 11928 0 Apr10 ? 00:00:03 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29181 1 0 Apr10 ? 00:00:09 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29559 29181 0 Apr10 ? 00:00:03 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

I'll tar up another set of logs in a few.
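
A quick way to check how many top-level nagios daemons are running is to count the lines in a listing like the one above whose parent PID is 1. This is only a sketch: `count_nagios_parents` is a made-up helper name, and it assumes `ps -ef` column order (UID, PID, PPID, ...) and the `/usr/local/nagios/bin/nagios -d` command line shown in this post.

```shell
# Count top-level nagios daemons (PPID 1) in `ps -ef`-style input on stdin.
# count_nagios_parents is a hypothetical helper, not part of Nagios.
count_nagios_parents() {
    # $3 is the PPID column in `ps -ef` output; "n + 0" prints 0 when nothing matched.
    awk '$3 == 1 && /nagios\/bin\/nagios -d/ { n++ } END { print n + 0 }'
}

# Typical use on a live system:
#   ps -ef | count_nagios_parents
```

Anything higher than 1 means more than one parent process is running. Fed the six lines above, it would report 3.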

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Wed Apr 11, 2018 11:19 am
by emartine
New tar of logs sent.

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Wed Apr 11, 2018 1:22 pm
by scottwilkerson
I have looked over everything you sent, but nothing in it lets me say conclusively why the processes aren't getting killed off like they should be.

What I can say for a fact is that there should not be 2 parent processes, and having them can adversely affect the system, so I am going to recommend the following change to make sure it doesn't continue.


Edit /etc/init.d/nagios on the server and look for this line:

Code:

echo 'Warning - nagios did not exit in a timely manner'
Just below that line I would like you to add the following:

Code:

/usr/bin/killall -9 nagios
Then restart nagios:

Code:

service nagios restart

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Thu Apr 12, 2018 11:33 am
by emartine
I just implemented that. I made a change, hit Apply, and noticed that checks were not happening. I logged onto the server only to find that nagios was not running.

if status_nagios > /dev/null; then
    echo ''
    echo 'Warning - nagios did not exit in a timely manner'
    /usr/bin/killall -9 nagios
else
    echo ' done.'
fi

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Thu Apr 12, 2018 11:44 am
by emartine
Shortly after removing that line from the stanza, I started nagios manually on the command line. I then went into the web interface, made a test service inactive, applied the configuration, and now I see this:

ps -ef | grep nagios.cfg
nagios 29128 1 0 11:34 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29279 29128 0 11:34 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29551 1 0 11:35 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 29814 29551 0 11:36 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Thu Apr 12, 2018 4:33 pm
by scottwilkerson
Something is really strange in the setup: it adds multiple parent processes every time you apply configuration.

Can you PM me if you are available between 9am and 2pm CDT tomorrow, so we can schedule a time for me to take a look at the system?

Thanks.

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Thu Apr 12, 2018 5:16 pm
by emartine
Will do. I'll PM you around 10AM.

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Fri Apr 13, 2018 8:27 am
by scottwilkerson
Sounds good. I await hearing from you.

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Fri Apr 13, 2018 10:04 am
by emartine
I'm available.

Re: Slowness troubleshooting --> 5.4.11 to 5.4.13.

Posted: Fri Apr 13, 2018 10:40 am
by scottwilkerson
Resolved in remote session. Added
killproc_nagios KILL

to the init script, to run if the process didn't terminate in a timely manner.
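
For anyone reading later, the general pattern behind that fix (ask the process to stop, wait, then force-kill that one PID if it is still there) can be sketched roughly as below. This is not the actual init script code: `stop_with_escalation` is a hypothetical helper, the default timeout is an assumption, and `killproc_nagios` is the init script's own function, which is not reproduced here. The difference from the earlier `killall -9 nagios` attempt is that `killall` matches every process with that name, while this signals a single PID.

```shell
# stop_with_escalation PID [TIMEOUT_SECONDS]
# Hypothetical helper: send TERM, wait up to TIMEOUT seconds, then send KILL.
stop_with_escalation() {
    pid=$1
    timeout=${2:-10}                               # assumed default, in seconds
    kill -s TERM "$pid" 2>/dev/null || return 0    # process already gone
    i=0
    while [ "$i" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0     # exited after TERM
        sleep 1
        i=$((i + 1))
    done
    echo 'Warning - process did not exit in a timely manner' >&2
    kill -s KILL "$pid" 2>/dev/null                # last resort, this PID only
    return 0
}

# Example call with a placeholder PID and a 10 second timeout:
#   stop_with_escalation 12345 10
```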