
This upcoming weekend's data from the rebuild node (w/o any gluster/HA) may be very interesting.


dnelson wrote: Does anybody know of anybody that is running Nagios 3.x w/ RHEL/OEL 6 with a daemon uptime greater than 33 hours that can report on service/host latencies?

I have a few different core systems running 3.5.1 and I have not noticed this behavior. The checks/5min are low on those servers though (around 1,000). In the past, this behavior has been caused by:
1) Latency or lack of resources (RAM, disk I/O, load)
2) System ulimits (open files is usually the culprit here)
3) Improper configuration (checks running at too short an interval, or timeouts that are too long)
4) Specific checks that are load- or disk-intensive (VMware, Oracle, SQL queries, etc.)

Hi abrist,
Code:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             63844                63844                processes
Max open files            1024                 1024                 files
Max locked memory         32768                32768                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63844                63844                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Code:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1024                 127056               processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       127056               127056               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
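
Both listings above show a soft open-files limit of 1024, which is the ulimit most often suspected in cases like this. For anyone who wants to compare such listings across servers, here is a rough Python sketch that parses text in this layout. It assumes the input matches what Linux exposes in /proc/<pid>/limits; the function names (parse_limits, read_limits, raisable) are made up for illustration and are not part of Nagios.

```python
def parse_limits(text):
    """Parse /proc/<pid>/limits-style text into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("Max "):
            continue  # skip the header row and blank lines
        tokens = line.split()
        # The Units column is empty on the nice and realtime priority rows,
        # so the last token there is a value rather than a unit name.
        has_units = not (tokens[-1].isdigit() or tokens[-1] == "unlimited")
        units = tokens.pop() if has_units else ""
        hard = tokens.pop()
        soft = tokens.pop()
        limits[" ".join(tokens)] = (soft, hard, units)
    return limits


def read_limits(pid):
    """Read the limits of a running process (Linux only; assumes /proc is mounted)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_limits(fh.read())


def raisable(limits):
    """Names of limits whose soft value differs from (is below) the hard value."""
    return sorted(name for name, (soft, hard, _) in limits.items() if soft != hard)
```

Calling read_limits() with the PID of the running nagios daemon (or "self" for the current process) and looking up "Max open files" gives the values that actually apply to that process, which can differ from what `ulimit -n` reports in a login shell.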

dnelson wrote: What's going on when large_installation_tweaks=1 and child_processes_fork_twice=1?

What is going on indeed. You may have enough information for the core devs to take a look. Open a ticket at http://tracker.nagios.org