High Load on NagiosXI server

Dusan.Mandic
Posts: 60
Joined: Mon Apr 06, 2020 2:30 pm

Post by Dusan.Mandic »

Hello,

I am in the process of upgrading the Nagios NRPE agent on all of our monitored hosts to 4.0.2. We recently updated our NagiosXI server to 5.7.1 (NRPE 4.0.3 plugin) and wanted to mitigate all the logging errors. I also recently ran the ramdisk script, as we were getting file bloat from the servicedata file (it ballooned to ~130 GB). I emptied that file by redirecting /dev/null into it to reclaim space on our server, but now I am getting some wild LOAD and MAX SERVICE LATENCY values.

I noticed about 117% CPU utilization from mysqld in top on the server. Does it just take a while to reoptimize?
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: High Load on NagiosXI server

Post by dchurch »

Hi!

Since Nagios XI 5.7.1, we've found and fixed a bug that under-utilized an index, which led to poor MySQL performance. It is especially noticeable on long-running systems and on systems with a lot of service / host checks. If you're not opposed to upgrading yet again, 5.8.2 is out now with some performance improvements.

Nagios XI Change Log:
Nagios XI 5.8.2:
- NDO 3.0.6:
  - Increased performance for queries involving comment history and downtimes on large/long-running systems

Read on only if you don't want to update.

In lieu of updating, we can do some things to fix some of the larger tables to help mitigate the database performance hit. Run this command:

Code:

echo "select table_name as 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' from information_schema.TABLES where table_schema in ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
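
The table listing from the command above can get long on a busy system. Here is a minimal sketch of a filter that keeps only tables at or above a size threshold and sorts them largest first, assuming the usual bordered layout that `mysql --table` prints. The function name `large_tables` and the 100 MB default are illustrative, not part of Nagios XI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `mysql --table` output on stdin and prints
# "<size in MB><TAB><table name>" for tables at or above a threshold.
large_tables() {
    local threshold_mb="${1:-100}"   # assumed default, adjust as needed
    awk -F'|' -v min="$threshold_mb" '
        # Data rows look like: | table_name |  123.45 |
        # Border rows have no "|" and the header row has no digits in column 3.
        NF >= 4 && $3 ~ /[0-9]/ && $3 + 0 >= min {
            gsub(/ /, "", $2)
            gsub(/ /, "", $3)
            print $3 "\t" $2
        }' | sort -rn
}

# Usage (append to the size query shown above):
#   echo "select ..." | mysql -h 127.0.0.1 -uroot -pnagiosxi --table | large_tables 500
```

Filtering after the fact keeps the original query untouched, so the same command can still be pasted as-is when support asks for the full listing.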
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.

Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.

Re: High Load on NagiosXI server

Post by Dusan.Mandic »

Here you are

Re: High Load on NagiosXI server

Post by dchurch »

Your xi_auditlog table is over 2GB in size, which could be leading to some slowdown. This could be due to the database maintenance task not automatically running, so let's check on that.

What is the output of the following commands?

Code:

mysql -unagiosxi -pn@gweb nagiosxi <<< 'select min(log_time) from xi_auditlog;'
mysql -unagiosxi -pn@gweb nagiosxi <<< "select * from xi_sysstat where metric = 'dbmaint'"

Re: High Load on NagiosXI server

Post by Dusan.Mandic »

[xxx@xxx~]$ mysql -unagiosxi -pn@gweb nagiosxi <<< "select * from xi_sysstat where metric = 'dbmaint'"
sysstat_id metric value update_time
1 dbmaint a:1:{s:10:"last_check";i:1616018701;} 2021-03-17 17:05:01

[xxx@xxx~]$ sudo mysql -unagiosxi -pn@gweb nagiosxi <<< 'select min(log_time) from xi_auditlog;'
min(log_time)
2020-09-19 02:00:01
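
The `value` column of the `dbmaint` row looks like a PHP-serialized array whose `last_check` entry is a Unix timestamp. This is a small sketch for turning it into a readable UTC date, assuming GNU `date` and that serialized layout. The function name `dbmaint_last_check` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract last_check from a PHP-serialized dbmaint value
# and print it as a UTC date. Requires GNU date (for -d @epoch).
dbmaint_last_check() {
    local serialized="$1" epoch
    # Pull the integer that follows "last_check";i:
    epoch="$(printf '%s\n' "$serialized" \
        | sed -n 's/.*"last_check";i:\([0-9][0-9]*\);.*/\1/p')"
    [ -n "$epoch" ] || return 1      # no last_check field found
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S'
}

# Usage:
#   dbmaint_last_check 'a:1:{s:10:"last_check";i:1616018701;}'
```

For the value posted above this prints 2021-03-17 22:05:01 (UTC), which lines up with the `update_time` of 17:05:01 if the server clock is five hours behind UTC.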

Re: High Load on NagiosXI server

Post by dchurch »

Open /usr/local/nagiosxi/html/config.inc.php and around line 40, change "max_auditlog_age" => 180, to "max_auditlog_age" => 30,

For example:

Code:

$cfg['db_info'] = array(
    "nagiosxi" => array(
        "dbtype" => 'mysql',
        "dbserver" => '',
        "user" => 'nagiosxi',
        "db" => 'nagiosxi',
        "charset" => "utf8",
        "dbmaint" => array( // variables affecting maintenance of db
            "max_auditlog_age" => 30, // max time (in DAYS) to keep audit log entries
Then save the file.
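
If you would rather script the change than edit by hand, this is a rough sketch, assuming GNU `sed` and that the setting appears on a single line in the `"max_auditlog_age" => N,` form shown above. The function name `set_max_auditlog_age` is hypothetical, and the helper writes a `.bak` copy next to the file before touching it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set max_auditlog_age in a config file, keeping a backup.
# Assumes GNU sed (-i, -E) and one `"max_auditlog_age" => N,` line.
set_max_auditlog_age() {
    local file="$1" days="$2"
    # Refuse anything that is not a plain non-negative integer.
    case "$days" in
        ''|*[!0-9]*) echo "days must be an integer" >&2; return 1 ;;
    esac
    cp -p "$file" "$file.bak" || return 1
    # Only touch the line that names the setting; replace the number after =>.
    sed -i -E "/\"max_auditlog_age\"/ s/=>[[:space:]]*[0-9]+/=> ${days}/" "$file"
}

# Usage (as root):
#   set_max_auditlog_age /usr/local/nagiosxi/html/config.inc.php 30
```

Review the file afterwards and compare it with the `.bak` copy before relying on the change.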

Re: High Load on NagiosXI server

Post by Dusan.Mandic »

Done.

How long will this take to pare down? It still shows around 2GB:

xi_auditlog | 2111.92

Re: High Load on NagiosXI server

Post by dchurch »

It should run daily. You'll know it has run when this command returns a date that is no more than 30 days in the past.

Code:

mysql -unagiosxi -pn@gweb nagiosxi <<< 'select min(log_time) from xi_auditlog;'
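
To make that comparison less error-prone, here is a sketch that checks whether the oldest audit log timestamp falls within the retention window. It assumes GNU `date`, the `YYYY-MM-DD HH:MM:SS` format that `mysql` prints, and one extra day of slack because the cleanup runs daily. The function name `auditlog_trimmed` and the slack value are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the oldest audit log entry is
# within max_age_days (plus one day of slack) of "now".
# Arguments: oldest timestamp, max age in days, optional "now" timestamp.
auditlog_trimmed() {
    local oldest="$1" max_age_days="$2" now="${3:-now}"
    local oldest_s now_s
    oldest_s="$(date -u -d "$oldest" +%s)" || return 2
    now_s="$(date -u -d "$now" +%s)" || return 2
    # One day of slack (assumed) since the maintenance job runs once a day.
    [ $(( now_s - oldest_s )) -le $(( (max_age_days + 1) * 86400 )) ]
}

# Usage:
#   oldest="$(mysql -N -unagiosxi -pn@gweb nagiosxi <<< 'select min(log_time) from xi_auditlog;')"
#   auditlog_trimmed "$oldest" 30 && echo "trimmed" || echo "not trimmed yet"
```

With the values posted earlier in this thread (oldest entry 2020-09-19 02:00:01, maintenance check on 2021-03-17 17:05:01), the check succeeds for a 180-day window and fails for a 30-day one.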

Re: High Load on NagiosXI server

Post by Dusan.Mandic »

[xxx@xxx ~]$ mysql -unagiosxi -pn@gweb nagiosxi <<< 'select min(log_time) from xi_auditlog;'
min(log_time)
2020-09-24 02:00:01

Doesn't look like it's run since last September?

Re: High Load on NagiosXI server

Post by dchurch »

Huh. The automatic process that deletes old entries from that table doesn't seem to be running. It could be that the database can't run, but I won't know that until we do some more investigating.

Try running the database repair script, and let me know if that is successful. Run the following as root from the terminal.

Code:

/usr/local/nagiosxi/scripts/repair_databases.sh
See here for complete instructions: run the database repair

If that doesn't fix it, please PM me a profile. Get one by going to Admin (top menu) => System Profile (in the left menu), then clicking the blue button.

If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:

Code:

rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT
Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.
If the profile script fails, please include the ENTIRE output.