
Re: Load average is high on the Nagios server

Posted: Thu Aug 01, 2019 6:55 am
by IT-OPS-SYS
I see the errors below in the MariaDB logs:

190714 5:03:40 [ERROR] mysqld: Table './nagiosxi/xi_usermeta' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagiosxi/xi_usermeta'
190714 5:03:40 [ERROR] mysqld: Table './nagiosxi/xi_users' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagiosxi/xi_users'
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_programstatus' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagios/nagios_programstatus'
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [ERROR] mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
190714 5:03:40 [Warning] Checking table: './nagios/nagios_hoststatus'
190714 5:03:41 [ERROR] mysqld: Table './nagiosxi/xi_sysstat' is marked as crashed and should be repaired
190714 5:03:41 [Warning] Checking table: './nagiosxi/xi_sysstat'
190714 5:04:03 [ERROR] mysqld: Table './nagiosxi/xi_meta' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] Got an error from unknown thread, /builddir/build/BUILD/mariadb-5.5.60/storage/myisam/ha_myisam.cc:936
190714 5:04:03 [Warning] Recovering table: './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] mysqld: Table './nagios/nagios_conninfo' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagios/nagios_conninfo'
190714 5:04:03 [Note] Found 5018 of 5033 rows when repairing './nagiosxi/xi_meta'
190714 5:04:03 [ERROR] mysqld: Table './nagiosxi/xi_eventqueue' is marked as crashed and should be repaired
190714 5:04:03 [Warning] Checking table: './nagiosxi/xi_eventqueue'
190714 5:04:04 [ERROR] mysqld: Table './nagiosxi/xi_events' is marked as crashed and should be repaired
190714 5:04:04 [Warning] Checking table: './nagiosxi/xi_events'
190714 5:04:04 [ERROR] mysqld: Table './nagiosxi/xi_commands' is marked as crashed and should be repaired
190714 5:04:04 [Warning] Checking table: './nagiosxi/xi_commands'
190714 5:04:05 [ERROR] mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
190714 5:04:05 [Warning] Checking table: './nagios/nagios_servicestatus'
190714 5:05:11 [ERROR] mysqld: Table './nagios/nagios_logentries' is marked as crashed and should be repaired
190714 5:05:11 [Warning] Checking table: './nagios/nagios_logentries'
190714 5:05:22 [ERROR] mysqld: Table './nagios/nagios_statehistory' is marked as crashed and should be repaired
190714 5:05:22 [Warning] Checking table: './nagios/nagios_statehistory'
190714 5:05:27 [ERROR] mysqld: Table './nagios/nagios_systemcommands' is marked as crashed and should be repaired
190714 5:05:27 [Warning] Checking table: './nagios/nagios_systemcommands'
190714 5:05:27 [ERROR] mysqld: Table './nagios/nagios_eventhandlers' is marked as crashed and should be repaired
190714 5:05:27 [Warning] Checking table: './nagios/nagios_eventhandlers'
190714 5:11:03 [ERROR] mysqld: Table './nagios/nagios_customvariablestatus' is marked as crashed and should be repaired
190714 5:11:03 [Warning] Checking table: './nagios/nagios_customvariablestatus'
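
To see at a glance which tables are affected, the "marked as crashed" errors above can be tallied per table. This is only a minimal sketch: the function name crashed_tables is made up for illustration, and the log path in the comment is an assumption that may differ on your system.

```shell
# Count "marked as crashed" errors per table in a MariaDB error log read from stdin.
crashed_tables() {
    grep -o "Table '[^']*' is marked as crashed" \
        | cut -d"'" -f2 \
        | sort \
        | uniq -c \
        | sort -rn
}

# Example (the log path is an assumption; adjust it for your system):
#   crashed_tables < /var/log/mariadb/mariadb.log
```

Each output line is a count followed by the table path, with the most frequently reported table first.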

Re: Load average is high on the Nagios server

Posted: Thu Aug 01, 2019 3:11 pm
by tgriep
Those errors are a couple of weeks old, but have you run a database repair since then?

If not, do this as root to repair all of the databases.

Code:

mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
I also see some stuck cron jobs, so let's stop and restart crond on the server by running this.

Code:

systemctl stop crond
killall -9 crond
systemctl start crond
The rest of the logs look OK to me.
Did you increase the max connections for the MySQL database?

Re: Load average is high on the Nagios server

Posted: Mon Aug 05, 2019 6:46 am
by IT-OPS-SYS
If I try to repair the database, will it affect any monitoring in production?

Re: Load average is high on the Nagios server

Posted: Mon Aug 05, 2019 6:57 am
by IT-OPS-SYS
There is no such command on our system: killall -9 crond

We have the killall5 command in version 5.6.

Re: Load average is high on the Nagios server

Posted: Mon Aug 05, 2019 8:55 am
by tgriep
While the database repair is running, the system should continue to run and monitor hosts and services.
The killall5 command is different from the killall command. If you install the following package on your system, it will provide the killall command so you can run it.
psmisc
Otherwise, if you do not want to install it, stop the crond daemon, check whether any crond processes are still running, and manually stop any that remain.
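
As a rough sketch of the manual route without killall, leftover crond processes can be found by filtering ps output. The helper name crond_pids is hypothetical, and the example assumes the procps ps and GNU xargs that normally ship with the system.

```shell
# Print the PIDs of processes whose command name is exactly "crond".
# Reads "PID COMMAND" pairs on stdin, as produced by: ps -e -o pid=,comm=
crond_pids() {
    awk '$2 == "crond" { print $1 }'
}

# Example, run as root after stopping the service:
#   systemctl stop crond
#   ps -e -o pid=,comm= | crond_pids                      # list any leftovers
#   ps -e -o pid=,comm= | crond_pids | xargs -r kill -9   # force-stop them
#   systemctl start crond
```

Matching on the exact command name avoids killing unrelated processes whose names merely contain "crond".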

Re: Load average is high on the Nagios server

Posted: Mon Aug 12, 2019 9:16 am
by IT-OPS-SYS
After doing all this, we are still seeing high CPU usage on the Nagios XI server.

Re: Load average is high on the Nagios server

Posted: Mon Aug 12, 2019 2:40 pm
by tgriep
Is the high load constant on the server or is it intermittent?

What application / daemon is causing the highest load?

If you run the following command as root, it will run the top command 60 times with a 60 second delay and append the output to the /tmp/top.txt file.

Code:

top -b -n 60 -d 60 >>/tmp/top.txt
The command will take about an hour to run, so start it and, once it finishes, upload the /tmp/top.txt file here so we can view it.
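
If you want a quick look at the file yourself first, something like the following can total the %CPU column per command across all samples. It is only a sketch: the function name top_cpu_by_command is made up, and it assumes the default top column layout, where %CPU is the 9th field and COMMAND is the 12th.

```shell
# Sum %CPU per command name over every sample in batch-mode top output (stdin).
# Assumes the default top field layout: column 9 is %CPU, column 12 is COMMAND.
top_cpu_by_command() {
    awk '
        $1 == "top" { in_table = 0; next }   # summary header starts a new sample
        $1 == "PID" { in_table = 1; next }   # column header starts the process table
        NF == 0     { in_table = 0; next }   # blank line ends the process table
        in_table    { cpu[$12] += $9 }       # accumulate %CPU per command
        END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }
    ' | sort -rn
}

# Example:
#   top_cpu_by_command < /tmp/top.txt | head
```

The totals are sums over all samples, so divide by the number of samples (60 here) for an average per sample.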

Could you post your Nagios XI System Profile so we can review it?
To get your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and upload it to the forum post or PM it to me.

Thanks.

Re: Load average is high on the Nagios server

Posted: Tue Aug 13, 2019 5:18 am
by IT-OPS-SYS
I have already sent you the full status report before as well.

The process that is taking the most time is the VMware API.

Re: Load average is high on the Nagios server

Posted: Tue Aug 13, 2019 10:33 am
by tgriep
We don't have access to the number of VMware hosts that you are monitoring, so it is difficult to recreate the issue.
There are so many VMware checks running on the system that there will always be a few running at any one time.
I see that you have box293_check_vmware in the system commands. If you can use that plugin, it would run the checks on a remote vMA server, which would decrease the load on the Nagios server. Is that an option?

Another option is to increase the check interval so there will not be so many of the checks running at once.
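
As a back-of-the-envelope illustration of why a longer interval helps, the average number of checks started per minute is the number of checks divided by the interval. The function name and the figures below are hypothetical and are not taken from your system.

```shell
# Average number of checks started per minute when `checks` active checks
# are spread evenly over an interval of `interval_min` minutes (integer math).
checks_per_minute() {
    checks=$1
    interval_min=$2
    echo $(( checks / interval_min ))
}

# Hypothetical figures, not taken from this system:
checks_per_minute 600 5     # prints 120
checks_per_minute 600 10    # prints 60
```

Doubling the interval halves the number of checks that start each minute, so fewer plugin processes run at the same time.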

One more option is to find an alternative plugin that uses fewer resources.
See if you can find one on the Nagios Exchange site.
https://exchange.nagios.org/