Re: [Nagios-devel] Nagios and Gearman - huge environment


Post by Guest »


Thanks, Daniel, but I don't think my problem is a hardware one. I created
the ramdisk and the problem is the same:
- nagios uses 100% of CPU all the time;
- nagios does not distribute the active checks smoothly. It waits a long
time and then runs the active checks in a burst. I can see this with
gearman_top: the gearmand job queue is empty almost all the time, but
every so often a burst of jobs appears in it. I can't understand this
behavior.

Any help would be great. Thanks everybody.

=========
Top result
=========

top - 18:40:59 up 106 days, 16:56, 4 users, load average: 8.52, 6.09, 5.42
Tasks: 215 total, 2 running, 213 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.5%us, 0.1%sy, 0.0%ni, 87.1%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4916356k total, 1974976k used, 2941380k free, 163240k buffers
Swap: 4194296k total, 22092k used, 4172204k free, 745100k cached

  PID USER    PR NI VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
 2189 nagios  25  0 492m 255m 1668 R 100.1  5.3 66:54.59 nagios
24658 nagios  15  0 561m 116m  676 S   0.7  2.4 62:00.96 gearmand
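
A quick way to read the two CPU figures in the top output together. The `Cpu(s)` summary line is averaged over all CPUs, while the per-process `%CPU` column is, in top's default mode, relative to a single CPU. The core count of this machine is not stated anywhere in the thread, so the sketch below only infers it, under the assumption that the nagios process accounts for nearly all user time in that sample. The function name `estimate_cores` is made up for illustration.

```python
def estimate_cores(total_user_pct: float, process_cpu_pct: float) -> float:
    """Infer the CPU count from top's summary line and one dominant process.

    Assumption: the process is responsible for (almost) all of the user
    time shown in the Cpu(s) line, and both figures cover the same interval.
    """
    return process_cpu_pct / total_user_pct


# Figures copied from the top output above.
TOTAL_USER_PCT = 12.5    # "12.5%us" on the Cpu(s) line
NAGIOS_CPU_PCT = 100.1   # %CPU of PID 2189 (nagios)

cores = estimate_cores(TOTAL_USER_PCT, NAGIOS_CPU_PCT)
print(f"inferred CPU count: about {cores:.0f}")
```

If that inference holds, 87% idle overall and one nagios process pinned at 100% of a single CPU are not in conflict: the machine as a whole has spare capacity, but that one process may have none.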




On Fri, Aug 19, 2011 at 1:31 PM, Daniel Wittenberg wrote:

> Well, but look at your bi and bo, and then the wa column. It looks like
> you have some IO wait, which probably means it's waiting on disk activity
> to get things done, with lots of writing to disk. Have you looked at
> adding a ramdisk for your checkresults, status.dat, and temp_file? That
> should help eliminate most of the heavy disk I/O from the nagios
> perspective. Since it doesn't look like you are swapping memory, you
> should be able to throw some at a ramdisk. You can probably start with
> 64MB and watch it; you might have to go higher depending on your workload.
>
> Dan
>
> From: Rodney Ramos [mailto:[email protected]]
> Sent: Friday, August 19, 2011 11:27 AM
> To: Nagios Developers List
> Subject: Re: [Nagios-devel] Nagios and Gearman - huge environment performance problem
>
> Hi, Daniel,
>
> As we can see below, I don't think it is a hardware problem. The idle CPU
> is between 60 and 80%, which is very good.
>
> Thank you very much.
>
>
> $ vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>  r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
>  1  2  22092 3046788 189640  890940    0    0   295  1053    0    0  4  3 83 10  0
>  1  2  22092 3032992 189664  904600    0    0  2733  7550 3498 7477 12  1 69 18  0
>  1  2  22092 3018240 189668  918632    0    0  2720  4070 2484 5114 13  1 72 15  0
>  1  0  22092 3008312 189668  930336    0    0  2332  1534 1932 3825 13  1 73 14  0
>  1 18  22092 2979292 189724  945780    0    0  1486 13974 2460 8446 16  2 72 10  0
>  1  2  22092 2965244 189736  959228    0    0  2570  9094 3290 7204 13  1 67 19  0
>  1  2  22092 2949064 189748  973100    0    0  2820  3040 2798 6639 13  2 68 17  0
>  1  6  22092 2936060 189768  987788    0    0  2894  3620 2474 5443 13  1 70 16  0
>  1  1  22092 2923320 189780  999708    0    0  2377  2618 2285 4794 13  1 70 16  0
>  1  0  22092 2923428 189780  999964    0    0     0  4575 1732 2317 12  1 86  1  0
>  1  9  22092 2912192 189784 1005260    0    0   402  4544 1541 3889 14  1 82  3  0
>  1  7  22092 2891692 189808 1023020    0    0  2534 13969 3232 9421 14  2 66 17  0
>  3  2  22092 2868908 189836 1037064    0    0  2797  4115 3002 7055 30  2 54 14  0
>  2  2  22092 2860712 189860 1050376    0    0  2646  3352 2448 5416 16  1 67 17  0
>  1  8  22092 2847052 189872 1064036    0    0  2748  3970 2616 5487 13  1 69 17  0
>  1  0  22092 3469576 189876  462624    0    0   825  1245 1379 2098 12  1 83  5  0
>  1  0  22092 3469248 189884  462720    0    0     4  2631 1552 2599 13  0 86  0  0
> 1 2

...[email truncated]...
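
For anyone who wants to put numbers on the "idle vs. IO wait" discussion in the quoted vmstat output, here is a minimal sketch that averages selected columns over a few samples. The column order is the one shown in the vmstat header in this thread; the helper names (`parse_vmstat_rows`, `column_mean`) and the choice of four sample rows are only for illustration.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_rows(text):
    """Return one dict per numeric vmstat row; headers and partial rows are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only complete rows made up entirely of integers.
        if len(fields) != len(VMSTAT_COLUMNS):
            continue
        if not all(field.isdigit() for field in fields):
            continue
        rows.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return rows


def column_mean(rows, name):
    """Arithmetic mean of one column across the parsed rows."""
    return sum(row[name] for row in rows) / len(rows)


# Four consecutive samples copied from the vmstat output quoted above.
SAMPLE = """\
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 1  2  22092 3032992 189664  904600    0    0  2733  7550 3498 7477 12  1 69 18  0
 1  2  22092 3018240 189668  918632    0    0  2720  4070 2484 5114 13  1 72 15  0
 1  0  22092 3008312 189668  930336    0    0  2332  1534 1932 3825 13  1 73 14  0
 1 18  22092 2979292 189724  945780    0    0  1486 13974 2460 8446 16  2 72 10  0
"""

rows = parse_vmstat_rows(SAMPLE)
for name in ("id", "wa", "bo"):
    print(f"mean {name}: {column_mean(rows, name):.2f}")
```

Running the same helper over samples taken before and after moving files to a ramdisk would show whether `wa` and `bo` actually dropped.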


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]