Re: [Nagios-devel] Nagios and Gearman - huge environment

Post by **Guest** » Sat Aug 20, 2011 11:13 am

Rodney Ramos wrote:
> Thanks, Daniel, but I don't think that my problem is of hardware. I create
> the ramdisk and the problem is the same:
> - nagios eating 100% of CPU all the time;
[...]
> top - 18:40:59 up 106 days, 16:56, 4 users, load average: 8.52, 6.09, 5.42
> Tasks: 215 total, 2 running, 213 sleeping, 0 stopped, 0 zombie
> Cpu(s): 12.5%us, 0.1%sy, 0.0%ni, 87.1%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 4916356k total, 1974976k used, 2941380k free, 163240k buffers
> Swap: 4194296k total, 22092k used, 4172204k free, 745100k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2189 nagios 25 0 492m 255m 1668 R 100.1 5.3 66:54.59 nagios
> 24658 nagios 15 0 561m 116m 676 S 0.7 2.4 62:00.96 gearmand

1. By default, Nagios writes the temporary check result files to
.../var/spool (which is where you mounted the ramdisk, I suppose?), but
leaves status.dat in .../var. You might want to change the
status_file setting to point to the ramdisk, too.
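
To see where these files currently live, something like the following could be run against the main config file. This is only a sketch: the function name show_nagios_paths and the example path in the usage comment are my assumptions, not taken from your setup. check_result_path, status_file and temp_file are the regular nagios.cfg directives.

```shell
# Print the file-location directives from a Nagios main config file,
# so you can see which ones already point at the ramdisk.
# Commented-out directives are skipped.
show_nagios_paths() {
    # $1: path to the main config file (nagios.cfg)
    grep -E '^[[:space:]]*(check_result_path|status_file|temp_file)[[:space:]]*=' "$1"
}

# Usage (the path is an assumption, adjust it to your installation):
#   show_nagios_paths /usr/local/nagios/etc/nagios.cfg
```

Any directive in the output that does not point below the ramdisk mount point is still being written to the regular disk.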

2. The "top" output shows a nagios process using 100% of *one* CPU/core,
while the overall user CPU time is only 12.5%, which suggests that you
have a total of 8 cores available (100 / 12.5 = 8; look into
/proc/cpuinfo for confirmation). It seems to be the master process,
which is limited to one core, as Sven pointed out. (Look at the output
of "ps auwwwx | grep the-PID-of-the-CPU-hogging-Nagios-process" for
confirmation; the master process should also be session leader etc.,
showing a process state of "Rsl"/"Ssl" or similar, instead of just "R"/"S".)
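
The arithmetic and the process-state check from this point can be sketched as two small shell helpers. The function names estimate_cores and is_session_leader are made up for illustration. The estimate is rough, since the overall "us" figure also includes whatever else is running on the box.

```shell
# Estimate the core count from top's numbers: one process pinned at
# ~100% of a single core shows up as roughly 100/N percent in the
# overall "us" figure on an N-core machine.
estimate_cores() {
    # $1: %CPU of the process (e.g. 100.1)
    # $2: overall %us from the Cpu(s) line (e.g. 12.5)
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%d\n", p / t + 0.5 }'
}

# Succeed if a ps STAT string (e.g. "Rsl") marks a session leader,
# which ps flags with a lowercase "s".
is_session_leader() {
    case "$1" in
        *s*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage, with the numbers and the PID from the quoted top output:
#   estimate_cores 100.1 12.5
#   grep -c '^processor' /proc/cpuinfo   # logical CPUs seen by the kernel
#   ps -o pid,stat,args -p 2189          # STAT column for that PID
```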

3. Use "nagiostats", which should sit right next to your "nagios"
executable, to look at Nagios' internal stats in a way you can automate
and send to a logfile. Unless stated otherwise, "x / y / z" readings are
min, max, average.
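
One way to automate this is nagiostats' MRTG-style output, which prints only the variables you ask for. A rough sketch follows: the default paths and the function name log_nagiostats are assumptions, and I picked the active service check latency variables purely as an example.

```shell
# Both defaults are assumptions; override them to match your installation.
NAGIOSTATS=${NAGIOSTATS:-/usr/local/nagios/bin/nagiostats}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Append one timestamped line of nagiostats values to a logfile.
log_nagiostats() {
    # $1: logfile to append to
    # --mrtg with --data prints just the requested variables, one per
    # line; here: min, max and average active service check latency (ms).
    values=$("$NAGIOSTATS" --config="$NAGIOS_CFG" --mrtg \
        --data=MINACTSVCLAT,MAXACTSVCLAT,AVGACTSVCLAT) || return 1
    # Join the values onto one line and prefix a timestamp.
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(printf '%s' "$values" | tr '\n' ' ')" >> "$1"
}

# Usage (the logfile name is hypothetical), e.g. once a minute from cron:
#   log_nagiostats /var/log/nagiostats.log
```

Each run appends a line of the form "date time min max avg", so rising latency over time is easy to spot with grep or a spreadsheet.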

Regards,
J. Bern

This post was automatically imported from historical nagios-devel mailing list archives