
Re: Nagios process at 100% CPU, system is crawling

Posted: Mon Jul 30, 2018 3:28 pm
by scottwilkerson
consulvation wrote: I had top running when I clicked on the Nagios Service Problems link on the left. It took 1m 15s to load the page.
Can you do this again and after a few seconds of waiting take a screen capture of the top window?

Re: Nagios process at 100% CPU, system is crawling

Posted: Mon Jul 30, 2018 4:39 pm
by consulvation
Here you go...
Top

Re: Nagios process at 100% CPU, system is crawling

Posted: Tue Jul 31, 2018 8:09 am
by scottwilkerson
Looks like the load has come down considerably, and the nagios process is not in the list of top processes.

The one item listed is a call to statusjson.cgi, which can be intensive if the call is for historical data.

Re: Nagios process at 100% CPU, system is crawling

Posted: Tue Jul 31, 2018 2:42 pm
by consulvation
Well, yes, the nagios process doesn't appear in top at all anymore. I suppose making it a daemon helped with that. But every call to a CGI starts by taking 100% CPU. It still takes over 1 minute to load any page on the site, so it's still unusable. We also stopped getting email notifications. I am wondering what else is wrong, because if I stop the nagios process there is a clear increase in performance, even from the command line. Do you have any ideas? This did not happen in v3, by the way; it is strictly new to v4. Thanks for all your time on this, I do appreciate it, but this doesn't seem to be solved.

Re: Nagios process at 100% CPU, system is crawling

Posted: Tue Jul 31, 2018 3:28 pm
by consulvation
Also, I should add that it looks like Nagios is stuck in the past. Looking at the status, the last checks were from 4 days ago. Looking at the scheduling queue, it's from 4 days ago.
Attachment: 2018-07-31_16-22-23.png (15.1 KiB)
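
For anyone who wants to confirm the "stuck in the past" symptom without the web UI, here is a minimal sketch that reads status.dat and reports how old the most recent check is. It assumes that status.dat stores each check time as a `last_check=<epoch seconds>` line; the function names are made up for illustration, and the path in the usage note is only the usual default for a source install, so adjust it for your system.

```python
import re
import time

# Assumption: check times appear as "last_check=<epoch seconds>" lines.
LAST_CHECK_RE = re.compile(r"^\s*last_check=(\d+)\s*$")


def newest_last_check(lines):
    """Return the largest last_check epoch found, or None if there is none."""
    newest = None
    for line in lines:
        match = LAST_CHECK_RE.match(line)
        if match:
            value = int(match.group(1))
            if newest is None or value > newest:
                newest = value
    return newest


def staleness_seconds(lines, now=None):
    """Seconds since the most recent check, or None if no check was recorded."""
    newest = newest_last_check(lines)
    if not newest:  # None, or 0 meaning "never checked"
        return None
    if now is None:
        now = time.time()
    return now - newest
```

Usage would look something like `with open("/usr/local/nagios/var/status.dat") as fh: print(staleness_seconds(fh))`. A result of several days, as in the screenshot, would suggest that checks are no longer being run or written out.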

Re: Nagios process at 100% CPU, system is crawling

Posted: Tue Jul 31, 2018 4:02 pm
by consulvation
Looking into this further, I rolled back the changes and took it out of daemon mode for a moment, then renamed the retention.dat and status.dat files to .old and restarted the nagios process. Those files had grown to 2.6GB each. Since doing that, the web site is flying at lightning speed. I presume this will not last and it will eventually slow down as the files accumulate. Should they be getting that large?
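
A rough way to keep an eye on this is a small script that flags the two files once they pass some size limit. This is only a sketch: the helper name, the paths, and the 100 MiB limit are all assumptions chosen for illustration, not values from Nagios itself.

```python
import os


def oversized_files(paths, limit_bytes):
    """Return (path, size) pairs for existing files larger than limit_bytes."""
    result = []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file is missing or unreadable, so skip it
        if size > limit_bytes:
            result.append((path, size))
    return result


if __name__ == "__main__":
    # Assumed locations (typical for a source install); adjust as needed.
    candidates = [
        "/usr/local/nagios/var/status.dat",
        "/usr/local/nagios/var/retention.dat",
    ]
    # Arbitrary example limit of 100 MiB.
    for path, size in oversized_files(candidates, 100 * 1024 * 1024):
        print(f"{path} is {size / (1024 * 1024):.1f} MiB")
```

Run from cron, something like this would give early warning if the files start growing toward the multi-gigabyte range again.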

Re: Nagios process at 100% CPU, system is crawling

Posted: Wed Aug 01, 2018 9:26 am
by scottwilkerson
If this is still running the maint version, you may have fixed the problem.

There was a bug in 4.4.1 that duplicated comments etc., which could have been causing your issue.
consulvation wrote: Also, I should add that it looks like Nagios is stuck in the past. Looking at the status, the last checks were from 4 days ago. Looking at the scheduling queue, it's from 4 days ago.
Is this resolved as well?
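
If you kept the renamed .old files, one way to check whether duplicated comments were what bloated them is to count repeated comment entries. The sketch below assumes comments are stored as `hostcomment { ... }` and `servicecomment { ... }` blocks of `key=value` lines, and that duplicates differ only in `comment_id`; both are assumptions about the file layout, and the function names are hypothetical.

```python
from collections import Counter

# Assumed block names and ignored key; verify against your own file.
COMMENT_BLOCKS = ("hostcomment", "servicecomment")
IGNORED_KEYS = ("comment_id",)


def iter_comment_blocks(lines):
    """Yield (block_type, fields) for each comment block in the file."""
    block_type = None
    fields = {}
    for raw in lines:
        line = raw.strip()
        if block_type is None:
            # Look for the start of a comment block, e.g. "hostcomment {".
            if line.endswith("{"):
                name = line[:-1].strip()
                if name in COMMENT_BLOCKS:
                    block_type = name
                    fields = {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value


def duplicate_comments(lines):
    """Map each repeated comment (ignoring comment_id) to its occurrence count."""
    counts = Counter()
    for block_type, fields in iter_comment_blocks(lines):
        items = sorted((k, v) for k, v in fields.items() if k not in IGNORED_KEYS)
        counts[(block_type,) + tuple(items)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

On a 2.6GB file this reads line by line, but it still holds one entry per distinct comment in memory, so treat it as a diagnostic sketch rather than a tuned tool.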

Re: Nagios process at 100% CPU, system is crawling

Posted: Wed Aug 01, 2018 3:48 pm
by consulvation
It seems to be fine so far. The status.dat and retention.dat files seem to be holding in the 800-900K range, and I will keep watching to see if they grow significantly. I am still running the maint version, just not as a daemon.

Yes, once I renamed the very large status.dat and retention.dat files, it became current: all host checks are being done and notifications are working again as well.

I presume I should be able to install the next stable release without any problems once it becomes available. Thanks again for your help.

Re: Nagios process at 100% CPU, system is crawling

Posted: Wed Aug 01, 2018 4:00 pm
by scottwilkerson
Great! Feel free to open a new issue if this changes.

Re: Nagios process at 100% CPU, system is crawling

Posted: Wed Aug 08, 2018 7:37 am
by scottwilkerson
Unlocking per user request