High Cpu load

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
op-team
Posts: 50
Joined: Fri Jun 02, 2017 6:19 am

High Cpu load

Post by op-team »

Hi Guys,

Once again we experienced a CPU overload on our Nagios server (4 CPUs and 32 GB RAM). The server was hanging and monitoring was not working as expected:
cpuload.png
totalProcesses.png
The main resource consumer was the nagios daemon.
atop.PNG
We first tried restarting the services as follows:
service nagios stop
service ndo2db restart
service nagios start

In the end we rebooted the server to get back to normal operation.

Could you help us find the root cause?

Please find attached my profile file.
CPU: 4
RAM: 32 GB
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: High Cpu load

Post by dwhitfield »

Profile did not attach. Can you try again?

Also, have you taken a look at https://assets.nagios.com/downloads/nag ... ios-XI.pdf ?

Re: High Cpu load

Post by op-team »

My profile file:
profile.zip

Re: High Cpu load

Post by op-team »

Hi,

Most of the time, the system load remains under control:
server_stats.png
engine_status.png
There was no major unexpected event (such as an issue causing a lot of check retries) when the problem happened.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: High Cpu load

Post by tgriep »

The System Profile only captures a small window of data from the log files, so there weren't any errors in it to help debug the issue.
Please check the log files in the following folder for errors around the time of the incident, and if you need help, post them here.

/var/log/

Also, the archived Nagios log files may contain errors; they can be found here:

/usr/local/nagios/var/archives/
You can post those as well.
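
If the log files are large, a small filter can narrow them down to the incident window before posting. The following is only a sketch: it assumes syslog-style lines of the form "Mon DD HH:MM:SS host ..." and the function name errors_in_window is made up for illustration.

```python
import re

# Syslog-style prefix (assumed format), e.g.
# "Jul 12 13:59:40 nagios ndo2db: Error: ..."
PREFIX_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")


def errors_in_window(lines, month, day, start, end):
    """Return lines mentioning 'error' on the given day between start and end.

    start and end are zero-padded "HH:MM:SS" strings, inclusive.
    """
    hits = []
    for line in lines:
        match = PREFIX_RE.match(line)
        if not match:
            continue  # not a syslog-style line
        if match.group(1) != month or int(match.group(2)) != day:
            continue  # different day
        # Zero-padded HH:MM:SS strings sort correctly as plain strings.
        if start <= match.group(3) <= end and "error" in line.lower():
            hits.append(line.rstrip("\n"))
    return hits
```

For example, passing an open file handle for a log under /var/log/ together with "Jul", 12, "13:00:00", "14:00:00" would return only the error lines from that hour.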
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: High Cpu load

Post by op-team »

Hi,

Thanks for your replies.

There are no relevant errors in the Nagios log, but I found the following in the messages log:

Jul 9 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 9 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 10 05:30:03 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 10 05:30:03 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 11 05:30:03 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 11 05:30:03 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 12 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 12 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 12 13:59:40 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 12 13:59:40 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 13 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 13 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 13 13:44:53 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 13 13:44:53 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 14 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 14 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 18 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 18 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 19 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 19 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!
Jul 20 05:30:04 nagios ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''
Jul 20 05:30:04 nagios ndo2db: Error: Connection to MySQL database has been lost!

The maximum number of MySQL connections allowed:
db_max_connections.PNG
The number of used connections never exceeds the maximum allowed:
max_connections.PNG

Re: High Cpu load

Post by tgriep »

In the messages log, the error occurs most of the time at about 5:30:04 in the morning.
If the MySQL database is not running, that could cause this error. Is the MySQL database being restarted every morning, or is the repair script being run at that time?

Without any other errors, there aren't many clues as to why the issue happened.
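
To confirm the time-of-day pattern, the ndo2db errors can be tallied per minute. This is a rough sketch, not an official tool: it assumes the messages log uses the same line format as the excerpt posted above, and the name lost_connections_by_minute is hypothetical.

```python
import re
from collections import Counter

# Line format as quoted in this thread, e.g.
# "Jul 12 13:59:40 nagios ndo2db: Error: Connection to MySQL database has been lost!"
LOST_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(\d{2}:\d{2}):\d{2}\s+\S+\s+ndo2db: "
    r"Error: Connection to MySQL database has been lost!"
)


def lost_connections_by_minute(lines):
    """Count ndo2db 'connection lost' errors per HH:MM time of day."""
    counts = Counter()
    for line in lines:
        match = LOST_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Run over the lines posted above, this groups nine of the eleven lost-connection errors under 05:30, with one each at 13:59 and 13:44. A daily job scheduled at that time would be the first thing to look for.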

Re: High Cpu load

Post by op-team »

Hi guys,

The problem occurred once more yesterday morning, starting at 2:34 AM.

I noticed in the messages log that the issue began with the following error: "2017-08-06 02:34:02 Warning: A system time change of 9243 seconds (0d 2h 34m 3s forwards in time) has been detected. Compensating... "
messages.PNG
Please have a look at my profile file:
profile.zip

Thanks

Re: High Cpu load

Post by tgriep »

That could be the reason for the load. If the system time changes, Nagios reschedules the checks, which causes more load.
Take a look at the settings for the NTP daemon and see if you can find out why the time changes.

I also see a lot of scripts run out of cron that open FTP connections to various servers.
A time change would reset those as well, causing them to rerun and increasing the load further.
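
To see how often the clock jumps and by how much, the time-change warnings can be pulled out of the logs. The sketch below matches the wording of the warning quoted earlier in this thread; the "backwards" alternative and the function name find_time_changes are assumptions for illustration.

```python
import re

# Wording taken from the warning quoted in this thread; "backwards" is an
# assumed counterpart for jumps in the other direction.
TIME_CHANGE_RE = re.compile(
    r"A system time change of (\d+) seconds \(.*?(forwards|backwards) in time\)"
)


def find_time_changes(lines):
    """Return (seconds, direction, line) for each time-change warning found."""
    changes = []
    for line in lines:
        match = TIME_CHANGE_RE.search(line)
        if match:
            changes.append((int(match.group(1)), match.group(2), line.strip()))
    return changes
```

Feeding it the messages log or the files under /usr/local/nagios/var/archives/ would show whether the jumps coincide with each load spike, which helps when checking the NTP settings.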