Re: ndo2db Hogging ALL the CPU

Posted: Wed Sep 17, 2014 4:49 pm
by abrist
1) Change:

Code:

auto_reschedule_checks=1
auto_rescheduling_interval=45
auto_rescheduling_window=180
To:

Code:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
2) Let's check the size of your tables:

Code:

ls -lahS /var/lib/mysql/nagios | head -15
3) Check the db for crashed tables:

Code:

tail -100 /var/log/mysqld.log | grep crashed
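
For step 1, here is a minimal sketch of scripting the change instead of editing the file by hand. The helper names set_nagios_option and apply_reschedule_settings are made up for illustration and are not part of Nagios, and the config path in the usage line below is an assumption (the usual default location), so adjust it for your install.

```shell
#!/bin/sh
# Sketch only: set_nagios_option is a hypothetical helper, not a Nagios tool.
set_nagios_option() {
    # $1 = config file, $2 = option name, $3 = new value
    if grep -q "^$2=" "$1"; then
        # Option already present: rewrite that line in place.
        # sed leaves a .bak copy of the file as it was before this call.
        sed -i.bak "s|^$2=.*|$2=$3|" "$1"
    else
        # Option not present yet: append it to the end of the file.
        printf '%s=%s\n' "$2" "$3" >> "$1"
    fi
}

apply_reschedule_settings() {
    # $1 = path to the main Nagios config file (assumed; pass your own path)
    set_nagios_option "$1" auto_reschedule_checks 1
    set_nagios_option "$1" auto_rescheduling_interval 30
    set_nagios_option "$1" auto_rescheduling_window 45
}
```

Usage would look something like `apply_reschedule_settings /usr/local/nagios/etc/nagios.cfg`, followed by a restart of the nagios service so the new values are read.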

Re: ndo2db Hogging ALL the CPU

Posted: Wed Sep 17, 2014 5:23 pm
by mikew
I made the changes to auto_rescheduling_window and it looks like that did it!

Here is a look at top:

Code:

top - 16:21:12 up 14 days,  4:20,  1 user,  load average: 0.64, 0.47, 0.40
Tasks: 237 total,   1 running, 236 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.1%us,  1.6%sy,  0.0%ni, 90.2%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  16325748k total, 14014816k used,  2310932k free,   208848k buffers
Swap:  4194296k total,        0k used,  4194296k free, 12739392k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
17498 apache    20   0  339m  22m 3748 S  8.3  0.1   0:00.25 httpd              
15360 apache    20   0  437m  26m 4348 S  6.0  0.2   0:01.44 httpd              
15033 root      20   0  139m  46m 3084 S  1.7  0.3   0:06.06 mrtg               
15366 root      20   0 15164 1348  944 R  0.7  0.0   0:00.12 top   

Thanks for sticking with the problem...I will watch for several days and report back as others may want to see the results.


Here are the table stats:

Code:

 ls -lahs /var/lib/mysql/nagios | head -15
total 1.1G
 12K drwx------ 2 mysql mysql  12K Sep 17 15:15 .
4.0K drwxr-xr-x 6 mysql mysql 4.0K Sep  3 12:01 ..
4.0K -rw-rw---- 1 mysql mysql   61 Jul 17 18:01 db.opt
 12K -rw-rw---- 1 mysql mysql 8.9K Jul 17 18:01 nagios_acknowledgements.frm
4.0K -rw-rw---- 1 mysql mysql  212 Sep 12 14:29 nagios_acknowledgements.MYD
4.0K -rw-rw---- 1 mysql mysql 3.0K Sep 17 15:15 nagios_acknowledgements.MYI
 12K -rw-rw---- 1 mysql mysql 8.6K Jul 17 18:01 nagios_commands.frm
 12K -rw-rw---- 1 mysql mysql  12K Sep 17 08:50 nagios_commands.MYD
8.0K -rw-rw---- 1 mysql mysql 7.0K Sep 17 15:15 nagios_commands.MYI
 12K -rw-rw---- 1 mysql mysql 9.2K Jul 17 18:01 nagios_commenthistory.frm
176K -rw-rw---- 1 mysql mysql 173K Sep 17 08:50 nagios_commenthistory.MYD
 20K -rw-rw---- 1 mysql mysql  20K Sep 17 15:15 nagios_commenthistory.MYI
 12K -rw-rw---- 1 mysql mysql 9.1K Jul 17 18:01 nagios_comments.frm
4.0K -rw-rw---- 1 mysql mysql  516 Sep 17 08:50 nagios_comments.MYD


tail -100 /var/log/mysqld.log |grep crashed
You have new mail in /var/spool/mail/root

Re: ndo2db Hogging ALL the CPU

Posted: Wed Sep 17, 2014 5:28 pm
by BanditBBS
That's really cool that you've hopefully traced the issue to one setting. Now hopefully it can be determined why :)

Re: ndo2db Hogging ALL the CPU

Posted: Wed Sep 17, 2014 5:33 pm
by mrochelle
From Marcus: what would be the recommended maximum sizes for the tables, and the recommended corrective maintenance?
Following are my tables, which are becoming quite large.

Code:

[root@nagprod01 mrochelle]# ls -lahS /var/lib/mysql/nagios | head -15
total 2.1G
-rw-rw----  1 mysql mysql  978M Sep 17 17:23 nagios_statehistory.MYD
-rw-rw----  1 mysql mysql  589M Sep 17 17:22 nagios_logentries.MYD
-rw-rw----  1 mysql mysql  310M Sep 17 17:22 nagios_logentries.MYI
-rw-rw----  1 mysql mysql  159M Sep 17 17:23 nagios_statehistory.MYI
-rw-rw----  1 mysql mysql  4.4M Sep 11 14:05 nagios_flappinghistory.MYD
-rw-rw----  1 mysql mysql  3.9M Sep 17 17:23 nagios_servicestatus.MYD
-rw-rw----  1 mysql mysql  3.1M Sep 17 16:50 nagios_commenthistory.MYD
-rw-rw----  1 mysql mysql  2.0M Sep 17 17:23 nagios_servicestatus.MYI
-rw-rw----  1 mysql mysql  1.9M Sep 17 14:45 nagios_services.MYD
-rw-rw----  1 mysql mysql  1.5M Sep 17 14:02 nagios_objects.MYD
-rw-rw----  1 mysql mysql  1.3M Sep 17 16:50 nagios_objects.MYI
-rw-rw----  1 mysql mysql  1.2M Sep 17 17:18 nagios_contactnotifications.MYI
-rw-rw----  1 mysql mysql  1.1M Sep 17 17:23 nagios_hoststatus.MYD
-rw-rw----  1 mysql mysql 1001K Sep 17 17:18 nagios_contactnotificationmethods.MYI
:o

Re: ndo2db Hogging ALL the CPU

Posted: Thu Sep 18, 2014 4:26 pm
by sreinhardt
We don't really have a suggested maximum. It depends on the system, the environment, and the requirements of the users. Some organizations, such as banks, need to keep data for many years, which can easily result in gigabytes of log tables. On other systems the history might not be important at all, and the tables are cleaned up weekly for the performance benefits. Personally, I would clean those up a bit, likely through truncation, but keep in mind that any data you truncate is gone for good.
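
As a rough illustration of the cleanup: TRUNCATE TABLE empties a table completely, so if only older history can go, a dated DELETE is a gentler alternative. The sketch below is an assumption-heavy example, not an official procedure. prune_sql is a made-up helper that only prints SQL, and the column names in the usage line (logentry_time, state_time) should be checked against your own schema with DESCRIBE before running anything. Take a backup first.

```shell
#!/bin/sh
# Sketch only: prune_sql is a hypothetical helper. It prints SQL and runs nothing.
prune_sql() {
    # $1 = table name, $2 = timestamp column, $3 = days of history to keep
    case "$3" in
        # Reject empty or non-numeric values before building the statement.
        ''|*[!0-9]*) echo "days must be a whole number" >&2; return 1 ;;
    esac
    # Delete rows older than the cutoff.
    printf 'DELETE FROM %s WHERE %s < NOW() - INTERVAL %s DAY;\n' "$1" "$2" "$3"
    # Rebuild the table so the space freed by the delete is reclaimed.
    printf 'OPTIMIZE TABLE %s;\n' "$1"
}
```

Review the output first, for example `prune_sql nagios_logentries logentry_time 90`, and only then pipe it to the client, e.g. `prune_sql nagios_statehistory state_time 90 | mysql -u root -p nagios`. The 90-day retention is just a placeholder value.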

Re: ndo2db Hogging ALL the CPU

Posted: Tue Oct 07, 2014 11:45 am
by mikew
Unfortunately I have now seen this on another server. This server was working fine on 2014R1.3, but after updating to 2014R1.4, ndo2db is running at 100% CPU. The event queue is all bunched into the first minute. We are using Gearman and the workers are doing their job. The killer on the system is ndo2db.

CentOS 6.x
20 vCPU with 16 GB of RAM

Re: ndo2db Hogging ALL the CPU

Posted: Tue Oct 07, 2014 12:18 pm
by mrochelle
You are not alone in your experience. I have a couple of machines that continue to have the same problem. I can usually clear the problem for several days with these commands:
service nagios stop
service nagios start
wait a minute, then
service ndo2db stop
service ndo2db start

I also updated to 2014R1.5 with no change.
I'm waiting patiently for a permanent fix from the nagios team. Marcus :geek:

Re: ndo2db Hogging ALL the CPU

Posted: Tue Oct 07, 2014 4:51 pm
by lmiltchev
mrochelle wrote: You are not alone in your experience. I have a couple of machines that continue to have the same problem.
Are you also using ModGearman on the machine you are having troubles with?

Re: ndo2db Hogging ALL the CPU

Posted: Tue Oct 07, 2014 6:22 pm
by mrochelle
No, I'm not using ModGearman.

Re: ndo2db Hogging ALL the CPU

Posted: Wed Oct 08, 2014 10:15 am
by lmiltchev
Thanks! This has been a difficult issue to troubleshoot. At least we can exclude ModGearman as one of the possible causes/factors.