What is interesting is that your CPU is 87% idle, which indicates to me
that it's waiting for something, or not scheduling the checks correctly.
Have you tried running in debug mode to see if that indicates anything?
Running just about any of the plugins in debug mode can cause this too,
in case you have logging turned up on things like nsca, nrpe,
pnp4nagios, etc.
Dan
From: Rodney Ramos [mailto:[email protected]]
Sent: Friday, August 19, 2011 4:44 PM
To: Nagios Developers List
Subject: Re: [Nagios-devel] Nagios and Gearman - huge environment performance problem
Thanks, Daniel, but I don't think my problem is a hardware one. I
created the ramdisk and the problem is the same:
- nagios eating 100% of CPU all the time;
- nagios does not distribute the active checks smoothly. It waits a long
time and then runs the active checks in a burst. I can see this with
gearman_top. The gearmand job queue is empty almost all the time, but
sometimes there is a burst of jobs in the queue. I can't understand this
behavior.
Any help would be great. Thanks everybody.
=========
Top result
=========
top - 18:40:59 up 106 days, 16:56,  4 users,  load average: 8.52, 6.09, 5.42
Tasks: 215 total,   2 running, 213 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.5%us,  0.1%sy,  0.0%ni, 87.1%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4916356k total,  1974976k used,  2941380k free,   163240k buffers
Swap:  4194296k total,    22092k used,  4172204k free,   745100k cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+  COMMAND
 2189 nagios    25   0  492m 255m 1668 R 100.1  5.3  66:54.59  nagios
24658 nagios    15   0  561m 116m  676 S   0.7  2.4  62:00.96  gearmand
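
The two readings in this top output (nagios at 100.1 %CPU while the Cpu(s) line shows 87.1% idle) do not have to contradict each other. Assuming top is in its default mode, where a process's %CPU is relative to a single core and the Cpu(s) line is averaged over all cores, one single-threaded process saturating one core of a multi-core machine would look like this. The core count is not stated anywhere in the thread, so the sketch below only estimates it from the figures above; the function names are illustrative.

```python
import re

# Values copied from the top output in this message.
CPU_SUMMARY = ("Cpu(s): 12.5%us,  0.1%sy,  0.0%ni, 87.1%id,  "
               "0.3%wa,  0.0%hi,  0.0%si,  0.0%st")
NAGIOS_PROCESS_CPU = 100.1  # %CPU column for PID 2189


def parse_cpu_summary(line):
    """Turn a top 'Cpu(s):' line into a dict such as {'us': 12.5, 'id': 87.1}."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


def estimate_core_count(summary, process_cpu):
    """Estimate the number of cores.

    Assumption: the one process accounts for essentially all non-idle
    time, so per-core usage divided by machine-wide usage gives the
    number of cores the summary line was averaged over.
    """
    busy_machine_wide = 100.0 - summary["id"]
    return process_cpu / busy_machine_wide


summary = parse_cpu_summary(CPU_SUMMARY)
estimate = estimate_core_count(summary, NAGIOS_PROCESS_CPU)
print(f"machine-wide busy: {100.0 - summary['id']:.1f}%")
print(f"estimated cores:   {estimate:.1f}")
```

With these numbers the estimate comes out just under 8, which would be consistent with the nagios process being CPU-bound on one core rather than the whole machine being short of CPU. That is an inference from the output, not something confirmed in the thread.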
On Fri, Aug 19, 2011 at 1:31 PM, Daniel Wittenberg wrote:
Well, but look at your bi and bo, and then the wa column. It looks like
you have some I/O wait, which probably means it's waiting on disk
activity to get things done, with lots of writing to disk. Have you
looked at adding a ramdisk for your checkresults, status.dat, and
temp_file? That should help eliminate most of the heavy disk I/O from
the nagios perspective. Since it doesn't look like you are swapping
memory, you should be able to throw some at a ramdisk. You can probably
start with 64MB and watch it; you might have to go higher depending on
your workload.
Dan
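
One way to confirm that the ramdisk suggestion above actually took effect is to check which filesystem each of the three locations lives on. The sketch below is a rough illustration, not a tested tool: it assumes the locations are set through the nagios.cfg directives `check_result_path`, `status_file` and `temp_file`, and that a mount table in /proc/mounts format is available. The sample paths and the mount point `/var/nagios/ramdisk` are hypothetical.

```python
# nagios.cfg directives for the three locations mentioned above.
RAMDISK_KEYS = ("check_result_path", "status_file", "temp_file")

# Hypothetical sample data; on a real host, read nagios.cfg and /proc/mounts.
SAMPLE_CFG = """
# excerpt from a nagios.cfg (paths are made up)
check_result_path=/var/nagios/ramdisk/checkresults
status_file=/var/nagios/ramdisk/status.dat
temp_file=/var/nagios/nagios.tmp
"""
SAMPLE_MOUNTS = """
/dev/sda1 / ext3 rw 0 0
tmpfs /var/nagios/ramdisk tmpfs rw,size=65536k 0 0
"""


def parse_cfg(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            pairs.append((fields[1], fields[2]))
    return pairs


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    best_point, best_type = "", None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        if inside and len(mount_point) > len(best_point):
            best_point, best_type = mount_point, fs_type
    return best_type


cfg = parse_cfg(SAMPLE_CFG)
mounts = parse_mounts(SAMPLE_MOUNTS)
report = {key: fs_type_for(cfg[key], mounts) for key in RAMDISK_KEYS}
for key, fs_type in report.items():
    print(f"{key:18} {cfg[key]:40} {fs_type}")
```

In the made-up sample, `temp_file` is still on the ext3 root filesystem, which is the kind of leftover this check is meant to catch.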
From: Rodney Ramos [mailto:[email protected]]
Sent: Friday, August 19, 2011 11:27 AM
To: Nagios Developers List
Subject: Re: [Nagios-devel] Nagios and Gearman - huge environment performance problem
Hi, Daniel,
As we can see below, I think it is not a hardware problem. The idle CPU
is between 60 and 80%, which is very good.
Thank you very much.
$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 1  2  22092 3046788 189640  890940    0    0   295  1053    0    0  4  3 83 10  0
 1  2  22092 3032992 189664  904600    0    0  2733  7550 3498 7477 12  1 69 18  0
 1  2  22092 3018240 189668  918632    0    0  2720  4070 2484 5114 13  1 72 15  0
 1  0  22092 3008312 189668  930336    0    0  2332  1534 1932 3825 13  1 73 14  0
 1 18  22092 2979292 189724  945780    0    0  1486 13974 2460 8446 16  2 72 10  0
 1  2  22092 2965244 189736  959228    0    0  2570  9094 3290 7204 13  1 67 19  0
 1  2  22092 2949064 189748  973100    0    0  2820  3040 2798 6639 13  2 68 17  0
 1  6  22092 2936060 189768  987788    0    0  2894  3620 2474 5443 13  1 70 16  0
 1  1  22092 2923320 189780  999708    0    0  2377  2618 2285 4794 13  1 70 16  0
 1  0  22092 2923428 189780  999964    0    0     0  4575 1732 2317 12  1 86  1  0
 1  9  22092 2912192 189784 1005260    0    0   402  4544 1541 3889 14  1 8
...[email truncated]...
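
To put numbers on the bi, bo and wa columns discussed in the reply to this message, the complete rows of the vmstat output can be summarized as follows. This is only a sketch over the samples that survived in the archive. It skips the first data row on the assumption that, as vmstat normally does, it reports averages since boot rather than a 5-second interval, and it drops the final row because it is cut off.

```python
# Rows copied from the vmstat output above; the last one is truncated.
VMSTAT = """\
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 2 22092 3046788 189640 890940 0 0 295 1053 0 0 4 3 83 10 0
1 2 22092 3032992 189664 904600 0 0 2733 7550 3498 7477 12 1 69 18 0
1 2 22092 3018240 189668 918632 0 0 2720 4070 2484 5114 13 1 72 15 0
1 0 22092 3008312 189668 930336 0 0 2332 1534 1932 3825 13 1 73 14 0
1 18 22092 2979292 189724 945780 0 0 1486 13974 2460 8446 16 2 72 10 0
1 2 22092 2965244 189736 959228 0 0 2570 9094 3290 7204 13 1 67 19 0
1 2 22092 2949064 189748 973100 0 0 2820 3040 2798 6639 13 2 68 17 0
1 6 22092 2936060 189768 987788 0 0 2894 3620 2474 5443 13 1 70 16 0
1 1 22092 2923320 189780 999708 0 0 2377 2618 2285 4794 13 1 70 16 0
1 0 22092 2923428 189780 999964 0 0 0 4575 1732 2317 12 1 86 1 0
1 9 22092 2912192 189784 1005260 0 0 402 4544 1541 3889 14 1 8
"""


def parse_vmstat(text):
    """Return one dict per complete data row, keyed by the column header."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    # Rows with missing fields (the truncated last line) are dropped.
    return [dict(zip(header, map(int, row)))
            for row in rows if len(row) == len(header)]


def mean(values):
    return sum(values) / len(values)


# Skip the first row (assumed to be the since-boot average).
samples = parse_vmstat(VMSTAT)[1:]

wa_mean = mean([s["wa"] for s in samples])
id_mean = mean([s["id"] for s in samples])
bo_mean = mean([s["bo"] for s in samples])
bo_peak = max(s["bo"] for s in samples)

print(f"samples used:   {len(samples)}")
print(f"mean iowait:    {wa_mean:.1f}%")
print(f"mean idle:      {id_mean:.1f}%")
print(f"mean / peak bo: {bo_mean:.0f} / {bo_peak} blocks/s")
```

Over these nine samples the I/O wait averages 14% with write bursts up to 13974 blocks per second, which is the pattern the ramdisk suggestion is aimed at.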
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: mailto:[email protected]