Issues with Capacity Planning

Posted: Wed Jun 05, 2019 6:47 am
by bomahony
Hey folks

I haven't done much with the Capacity Planning (CP) side myself, but my boss is having issues with it, and I have reproduced them myself.
Also, we have ~20K checks on ~400 hosts. Loading the reports page is super slow. Is there any way to get it to default to ZERO nodes?

The issue he is seeing:
1. reports -> capacity planning
2. select hostgroup: HOSTIN
3. type CPU in the search box and click Run
4. A 24-page report is generated. When I click Download -> PDF, my account becomes unusable.

Any idea how to speed this up?
Is there a method for running these Capacity Planning reports from the CLI?
Is it possible to collapse all nodes for a hostgroup into a single pane of glass, e.g. "CPU_USER" for all 30 nodes as one metric?

Re: Issues with Capacity Planning

Posted: Wed Jun 05, 2019 2:06 pm
by lmiltchev
Loading the reports page is super slow. Is there any way to get it to default to ZERO nodes?
Yes, go to Admin > System Config > Performance Settings > Auto-Running > Disable reports from automatically running on page load = checked, and click on "Update Settings".
The issue he is seeing:
1. reports -> capacity planning
2. select hostgroup: HOSTIN
3. type CPU in the search box and click Run
4. A 24-page report is generated. When I click Download -> PDF, my account becomes unusable.

Any idea how to speed this up?

We need to find out what is causing the slowness in the first place. I don't have a system that large, but loading a 12-page capacity planning report didn't take very long for me.

Which version of Nagios XI are you currently using? What is the hardware like on this system (CPU, memory, HDD, etc.)? Are you using a ramdisk? Is MySQL offloaded to a remote server?

Do you see any errors in the Apache error log after you try to load the report?

Code:

tail -100 /var/log/httpd/error_log
Do you see any errors in the wkhtmltox.log?

Code:

tail -100 /usr/local/nagiosxi/var/wkhtmltox.log
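When these logs are busy, it can help to filter the tail for lines that look like problems before posting them. The following is only a minimal sketch: `scan_log` is a hypothetical helper name, and the keyword list is a guess rather than a list of messages that Nagios XI or wkhtmltopdf are known to emit, so adjust it to what you actually see.

```shell
#!/bin/sh
# scan_log FILE [LINES]
# Print the lines among the last LINES (default 100) of FILE that match
# a few common problem keywords. The keyword list is an assumption.
scan_log() {
    tail -n "${2:-100}" "$1" | grep -iE 'error|warn|fatal|timeout|memory' || true
}
```

For example, `scan_log /var/log/httpd/error_log` or `scan_log /usr/local/nagiosxi/var/wkhtmltox.log 500`. The `|| true` only keeps the exit status at zero when nothing matches.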
What is the time period that you are using in your report, e.g. 1 week, 2 week, 1 month, etc.?
Is there a method for running these Capacity Planning reports from the CLI?
No, not that I am aware of.
Is it possible to collapse all nodes for a hostgroup into a single pane of glass, e.g. "CPU_USER" for all 30 nodes as one metric?
No, we can't make all nodes metrics into one single metric.

Re: Issues with Capacity Planning

Posted: Wed Jul 10, 2019 9:02 am
by bomahony
Hey folks. I haven't had much chance to revisit this due to time constraints. I will provide the required information later this week.

Basically, we are running into issues with ~400 servers and 20,000 checks per XI instance. The perf graphs are fine for troubleshooting an issue, but they really aren't particularly useful or usable for the dashes we want to provide to other teams and other applications.

Having had a bit of a read, I wanted to try Nagflux/InfluxDB/Grafana, with InfluxDB and Grafana on another VM. However, your documentation says it will break XI.
I then found this:
https://support.nagios.com/forum/viewto ... t=influxdb

I was reading ssax's final post and was unsure whether that setup duplicates the perfdata (sending a copy to InfluxDB) or replaces the perfdata in the XI interface, so that we would have to go to the InfluxDB/Grafana nodes instead.

Re: Issues with Capacity Planning

Posted: Wed Jul 10, 2019 12:02 pm
by bomahony
I had a look and it duplicates the perfdata. I am going to give it a lash in the AM.

Re: Issues with Capacity Planning

Posted: Wed Jul 10, 2019 2:34 pm
by ssax
That is correct, it duplicates the data (still uses the RRDs) because Nagios XI uses the RRDs for its own reporting functionality.
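
A quick way to convince yourself that the RRDs are still being updated after enabling Nagflux is to count recently modified files in the RRD directory. This is a rough sketch, not an official check: `count_recent` is a hypothetical helper, and `/usr/local/nagios/share/perfdata` is assumed to be where your XI install keeps its RRDs.

```shell
#!/bin/sh
# count_recent DIR [MINUTES]
# Count regular files under DIR modified within the last MINUTES (default 10).
# Relies on find's -mmin test, which GNU and BSD find provide.
count_recent() {
    find "$1" -type f -mmin "-${2:-10}" | wc -l | tr -d ' '
}

# Example (the path is an assumption, adjust it to your install):
# count_recent /usr/local/nagios/share/perfdata 10
```

A non-zero count a few minutes after a check interval suggests XI is still writing its own copy of the perfdata.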

Re: Issues with Capacity Planning

Posted: Thu Jul 11, 2019 4:50 am
by bomahony
Thanks SSAX, having a look this AM.

Is it possible to replay all the historical RRD data without f**king up the XI bits?

Are you guys planning on changing the dash/graphing back end at any time soon?

Re: Issues with Capacity Planning

Posted: Thu Jul 11, 2019 10:43 am
by bomahony
I have this working, but not Histou yet. That is tomorrow-me's problem.

It's been a while since I wrote Grafana queries too, so I'll have to dig around in the grey matter for that as well!
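
For the "one metric for all nodes" idea raised earlier, an InfluxQL query that does not group by host should give a single averaged series. This is a sketch under assumptions, not a tested dashboard: it assumes the Nagflux schema as I understand it (measurement `metrics`, tags `host` and `performanceLabel`, field `value`), an InfluxDB 1.x `/query` endpoint, a database called `nagflux`, and a hostname pattern standing in for the hostgroup, since the hostgroup is not assumed to be stored as a tag. `build_avg_query`, `run_query`, `INFLUX_URL` and `INFLUX_DB` are made-up names.

```shell
#!/bin/sh
# build_avg_query LABEL HOST_REGEX
# Print an InfluxQL query that averages LABEL over all hosts matching
# HOST_REGEX, in 5-minute buckets over the last 24 hours.
build_avg_query() {
    printf "SELECT mean(\"value\") FROM \"metrics\" WHERE \"performanceLabel\" = '%s' AND \"host\" =~ /%s/ AND time > now() - 24h GROUP BY time(5m)\n" "$1" "$2"
}

# run_query QUERY
# Send QUERY to the InfluxDB 1.x HTTP API and print the JSON response.
# The URL and database defaults are assumptions.
run_query() {
    curl -sG "${INFLUX_URL:-http://localhost:8086}/query" \
        --data-urlencode "db=${INFLUX_DB:-nagflux}" \
        --data-urlencode "q=$1"
}
```

Something like `run_query "$(build_avg_query CPU_USER '^hostin')"` would then return one series for the whole set of nodes. In a Grafana panel the same query text could be pasted into the raw query editor, with the fixed time condition swapped for `$timeFilter`.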

Re: Issues with Capacity Planning

Posted: Thu Jul 11, 2019 5:02 pm
by ssax
Let us know if you see any issues.

Re: Issues with Capacity Planning

Posted: Fri Jul 12, 2019 10:13 am
by bomahony
I haven't gotten back to this with Histou yet. That will be next week.

Is there a way to replay all the old logged data to the new perf folder?

Re: Issues with Capacity Planning

Posted: Fri Jul 12, 2019 3:33 pm
by ssax
There's not currently a way that I'm aware of to migrate the old perfdata as it's stored differently than the data that's now being written to that new directory.