Load average is high on the Nagios server

IT-OPS-SYS
Posts: 184
Joined: Sun Jan 07, 2018 12:56 pm

Re: Load average is high on the Nagios server

Post by IT-OPS-SYS »

I see the errors below in the MariaDB logs:

190714 5:03:40 [ERROR] mysqld: Table './nagiosxi/xi_usermeta' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagiosxi/xi_usermeta'
190714 5:03:40 [ERROR] mysqld: Table './nagiosxi/xi_users' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagiosxi/xi_users'
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagios/nagios_programstatus'
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagios/nagios_hoststatus'
190714 5:03:41 [ERROR] mysqld: Table './nagiosxi/xi_sysstat' is marked as crashed and should be repaired
190714 5:03:41 [Warning] Checking table: './nagiosxi/xi_sysstat'
190714 5:04:03 [ERROR] mysqld: Table './nagiosxi/xi_meta' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] Got an error from unknown thread, /builddir/build/BUILD/mariadb-5.5.60/storage/myisam/ha_myisam.cc:936
190714 5:04:03 [Warning] Recovering table: './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] mysqld: Table './nagios/nagios_conninfo' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagios/nagios_conninfo'
190714 5:04:03 [Note] Found 5018 of 5033 rows when repairing './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] mysqld: Table './nagiosxi/xi_eventqueue' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagiosxi/xi_eventqueue'
190714 5:04:04 [ERROR] mysqld: Table './nagiosxi/xi_events' is marked as crashed and should be repaired
190714 5:04:04 [Warning] Checking table: './nagiosxi/xi_events'
190714 5:04:04 [ERROR] mysqld: Table './nagiosxi/xi_commands' is marked as crashed and should be repaired
190714 5:04:04 [Warning] Checking table: './nagiosxi/xi_commands'
190714 5:04:05 [ERROR] mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
190714 5:04:05 [Warning] Checking table: './nagios/nagios_servicestatus'
190714 5:05:11 [ERROR] mysqld: Table './nagios/nagios_logentries' is marked as crashed and should be repaired
190714 5:05:11 [Warning] Checking table: './nagios/nagios_logentries'
190714 5:05:22 [ERROR] mysqld: Table './nagios/nagios_statehistory' is marked as crashed and should be repaired
190714 5:05:22 [Warning] Checking table: './nagios/nagios_statehistory'
190714 5:05:27 [ERROR] mysqld: Table './nagios/nagios_systemcommands' is marked as crashed and should be repaired
190714 5:05:27 [Warning] Checking table: './nagios/nagios_systemcommands'
190714 5:05:27 [ERROR] mysqld: Table './nagios/nagios_eventhandlers' is marked as crashed and should be repaired
190714 5:05:27 [Warning] Checking table: './nagios/nagios_eventhandlers'
190714 5:11:03 [ERROR] mysqld: Table './nagios/nagios_customvariablestatus' is marked as crashed and should be repaired
190714 5:11:03 [Warning] Checking table: './nagios/nagios_customvariablestatus'
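
With this many repeated entries, a per-table summary is easier to read than the raw log. The following is a minimal sketch, not part of the original post: `crashed_tables` is a hypothetical helper name, and the log path in the example comment is an assumption (check the `log-error` setting in your my.cnf). It only reads text in the format shown above.

```shell
# Summarize which tables the MariaDB error log reports as crashed.
# Reads log text on stdin; prints "count table" pairs, most frequent first.
crashed_tables() {
    # Keep only the "marked as crashed" errors, extract the quoted table
    # path, then count how many times each table appears.
    grep 'is marked as crashed' \
        | sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example (the log path is an assumption):
# crashed_tables < /var/log/mariadb/mariadb.log
```

Tables that show up repeatedly, such as nagios_hoststatus in the excerpt above, are the ones being hit most often while marked as crashed.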
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Load average is high on the Nagios server

Post by tgriep »

Those errors are a couple of weeks old, but have you run a database repair since then?

If not, run this as root to repair all of the databases.

mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
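
To see which tables are currently flagged before or after the repair, a check-only pass can be run. This is a sketch under assumptions: it relies on mysqlcheck's `-c` (check) option and on its usual output format, and `not_ok_tables` is a hypothetical helper name introduced here for illustration.

```shell
# Filter mysqlcheck check-mode output down to entries that did not report OK.
# Assumes the usual format: "db.table   OK" for healthy tables, and the
# table name followed by warning/error lines otherwise.
not_ok_tables() {
    grep -v -E '[[:space:]]OK[[:space:]]*$' || true
}

# Example: check (do not repair) every database; -p prompts for the password.
# mysqlcheck -c -u root -p --all-databases | not_ok_tables
```

An empty result suggests every table reported OK. Note also that the MySQL documentation for REPAIR TABLE advises using USE_FRM only when the regular repair modes cannot be used, so it may be worth trying the repair without `--use-frm` first.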
I also see some stuck cron jobs, so let's stop and restart crond on the server by running this.

systemctl stop crond
killall -9 crond
systemctl start crond
The rest of the logs look OK to me.
Did you increase the max connections for the MySQL database?
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: Load average is high on the Nagios server

Post by IT-OPS-SYS »

If I repair the database, will it affect monitoring in production?

Re: Load average is high on the Nagios server

Post by IT-OPS-SYS »

This command does not exist on our system: killall -9 crond

We have the killall5 command in version 5.6.

Re: Load average is high on the Nagios server

Post by tgriep »

When running the database repair, the system should continue to run and monitor hosts and services.
The killall5 command is different from the killall command. Installing the following package on your system will provide the killall command so you can run it.
psmisc
Otherwise, if you do not want to install it, stop the crond daemon, check that all crond processes have stopped, and manually stop any that remain.
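
For the manual route, the leftover processes can be found without installing anything. The following is a minimal sketch, assuming a Linux system with /proc mounted; `pids_named` is a hypothetical helper name, not an existing command. If pgrep and pkill are available on the system, they are the more usual tools for this.

```shell
# Print the PIDs of processes whose name matches exactly, by reading /proc.
# Needs no extra packages; Linux-specific.
pids_named() {
    for comm in /proc/[0-9]*/comm; do
        # A process may exit between the glob and the read; skip it if so.
        read -r name 2>/dev/null < "$comm" || continue
        if [ "$name" = "$1" ]; then
            pid=${comm#/proc/}
            echo "${pid%/comm}"
        fi
    done
}

# Example, run as root:
# systemctl stop crond
# for pid in $(pids_named crond); do kill "$pid"; done
# systemctl start crond
```

The example uses a plain `kill` (SIGTERM) first; `kill -9` can be reserved for processes that ignore it.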

Re: Load average is high on the Nagios server

Post by IT-OPS-SYS »

After doing all this, we are still seeing high CPU usage on the Nagios XI server.

Re: Load average is high on the Nagios server

Post by tgriep »

Is the high load constant on the server or is it intermittent?

What application / daemon is causing the highest load?

If you run the following command as root, it will run top 60 times with a 60-second delay between iterations and append the output to the /tmp/top.txt file.

top -b -n 60 -d 60 >>/tmp/top.txt
The command takes about an hour to run, so run it and then upload the /tmp/top.txt file here so we can view it.
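
To get a quick answer to the "which daemon" question from the same file, the samples can be totalled per command. This is a rough sketch that assumes top's default column layout (%CPU in column 9, COMMAND in column 12); `top_cpu_by_command` is a hypothetical helper name.

```shell
# Sum %CPU per command name across all iterations in a top batch log.
# Reads the log on stdin; prints "total_cpu command", highest first.
# Optional first argument: how many rows to show (default 10).
top_cpu_by_command() {
    awk '
        $1 == "PID" { in_table = 1; next }   # header row starts a process table
        NF == 0     { in_table = 0; next }   # blank line ends it
        in_table    { cpu[$12] += $9 }       # assumes default top columns
        END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }
    ' | sort -rn | head -n "${1:-10}"
}

# Example:
# top_cpu_by_command 5 < /tmp/top.txt
```

The totals are sums over all samples, so they are only useful for ranking commands against each other, not as absolute CPU figures.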

Could you post your Nagios XI System Profile so we can review it?
To get your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and upload it to the forum post or PM it to me.

Thanks.

Re: Load average is high on the Nagios server

Post by IT-OPS-SYS »

I have already sent the full status report to you before as well.

The process that is taking the most time is the VMware API.

Re: Load average is high on the Nagios server

Post by tgriep »

We don't have access to the number of VMware hosts that you are monitoring, so it is difficult to recreate the issue.
There are so many VMware checks configured on the system that there will always be a few running at any one time.
I see that you have box293_check_vmware among the system's commands. If you can use that plugin, it would run the checks on a remote vMA server, which would decrease the load on the Nagios server. Is that an option?

Another option is to increase the check interval so that fewer checks are running at once.

One more option is to find an alternative plugin that uses fewer resources.
See if you can find one on the Exchange site.
https://exchange.nagios.org/