I'm trying to run a simple availability report against a hostgroup. From the output of top below, you can see avail.cgi has been running for over 7 minutes, and the Nagios interface is now unresponsive.
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
28553 apache    20   0 2084m 1.9g 2384 R 79.3 25.2  7:56.34 avail.cgi
Killing the avail.cgi process will bring things back immediately, but I'm curious as to why this is happening. I'm running Nagios XI 2014R1.5. I don't recall having this issue when I was running the older 2012 version of XI. I have also since migrated from an older CentOS 6.3 32-bit image to the most recent 64-bit CentOS 6.5 image. Aside from this issue, overall performance of Nagios XI was greatly improved after I upgraded.
What steps should I take next in investigating the source of these delays? Is there a known issue with the Availability Report in this version of XI?
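As a starting point for the investigation, here is a minimal Python sketch that parses a line of top output like the one above and flags a long-running avail.cgi. It assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and a TIME+ value in minutes:seconds form. The function names and the 300-second threshold are made up for illustration. Note that TIME+ is accumulated CPU time rather than wall-clock time, so 7:56.34 means the process has burned nearly 8 minutes of CPU.

```python
def cpu_seconds(time_plus: str) -> float:
    """Convert a top TIME+ value such as '7:56.34' into seconds of CPU time.

    Assumes the minutes:seconds.hundredths form shown in the post.
    """
    minutes, seconds = time_plus.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_top_line(line: str) -> dict:
    """Parse one process line from top, assuming the default 12 columns."""
    # Split on whitespace at most 11 times so a command with spaces stays whole.
    fields = line.split(None, 11)
    if len(fields) < 12:
        raise ValueError(f"expected 12 columns, got {len(fields)}: {line!r}")
    return {
        "pid": int(fields[0]),
        "user": fields[1],
        "res": fields[5],            # resident memory, kept as text (e.g. '1.9g')
        "cpu_pct": float(fields[8]),
        "mem_pct": float(fields[9]),
        "cpu_seconds": cpu_seconds(fields[10]),
        "command": fields[11].strip(),
    }


def is_runaway(proc: dict, command: str = "avail.cgi",
               limit_seconds: float = 300.0) -> bool:
    """Return True if proc is the given command and exceeds the CPU-time limit.

    The 300-second default is an arbitrary illustrative threshold.
    """
    return proc["command"] == command and proc["cpu_seconds"] > limit_seconds


if __name__ == "__main__":
    sample = "28553 apache 20 0 2084m 1.9g 2384 R 79.3 25.2 7:56.34 avail.cgi"
    proc = parse_top_line(sample)
    print(proc["pid"], proc["res"], proc["cpu_seconds"], is_runaway(proc))
```

Logging this every minute or so while a report runs would show whether the CGI is steadily consuming CPU and memory (the 1.9g resident size above is already 25% of RAM) or is stuck waiting on something else.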
I am running the same version as you and do not experience this issue. A couple of quick questions:
1. How big is your environment?
2. Do you have Auto-Run of reports shut off?
The reason I ask: if your environment is huge and you go to Reports, the report will auto-run and take forever to come back. Even if you select a hostgroup and hit Update, it still insists on finishing the first run. If this is the case in your environment, go to Admin > Performance Settings, open the auto-run tab, disable auto-running of reports, and try again.
I have 963 hosts and 8751 services, so I guess it's a fairly large environment. I'm trying to run a one-year report for smaller hostgroups (~30 hosts), and while it does eventually return after 10-15 minutes, I was more concerned that the XI interface in my web browser (Chrome) goes completely unresponsive during that time, even if I try loading it in a new tab. I did just try accessing my instance from a different browser (IE) while Chrome was unresponsive, and Nagios does seem to respond, so this could be a browser issue as well. It would also explain why I'm not hearing complaints from other users.
BanditBBS wrote:
2. Do you have Auto-Run of reports shut off?
I didn't have this disabled. I've now disabled it, but I don't think that's causing the issue. The first auto-run report finishes quickly, as its default period is 24 hours. It becomes a problem when I go out past a few months.
This sounds like a native issue between browsers and JavaScript, and how we handle AJAX. I don't do webdev, so I would have to ask or look to be 100% sure, but most likely, when you start the report generation, we begin an AJAX loop looking for the data and/or a PDF, depending on how you choose to generate it. This AJAX loop and the JavaScript put your browser through hell while it waits. Chrome seems to handle this better in most cases than IE and Firefox, but for large reports like that, I would suggest either scheduling them so that they run locally on the XI box and are emailed to you, or opening a separate browser session for those long-running reports. Browsers tend not to like long-running, locking JavaScript processes, and I have seen similar large or long queries lock up my browser on many things outside of XI.
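To illustrate the kind of waiting loop described above, here is a rough Python sketch of polling for a result with a growing delay between checks. This is not XI's actual code, which is JavaScript running in the browser and may work quite differently. `poll_until_ready`, `check`, and all the default values are hypothetical names and numbers chosen for the example. The idea is simply that a client that waits longer between checks, and gives up after a deadline, puts less strain on itself than one that polls in a tight loop.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_ready(
    check: Callable[[], Optional[T]],
    timeout: float = 900.0,       # give up after 15 minutes (illustrative)
    initial_delay: float = 1.0,   # first wait between checks, in seconds
    max_delay: float = 30.0,      # cap on the wait between checks
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call check() until it returns something other than None.

    The wait doubles after every unsuccessful check, up to max_delay.
    Raises TimeoutError if the next wait would run past the deadline.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = check()
        if result is not None:
            return result
        if clock() + delay > deadline:
            raise TimeoutError("report was not ready before the deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    # Hypothetical stand-in for "is the report ready yet?": ready on the 4th check.
    attempts = {"count": 0}

    def fake_check() -> Optional[str]:
        attempts["count"] += 1
        return "report data" if attempts["count"] >= 4 else None

    print(poll_until_ready(fake_check, initial_delay=0.1))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. In practice you would leave them at their defaults.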