Could be right, but the fact that mod_gearman itself has stopped the 25+/25+/25+ load spikes is a pretty good workaround until the main load-spike problem is resolved. Perl scripts, like everything else, add more processing and more time waiting for and on the CPU. I've just moved all my checks from the native Nagios queue to mod_gearman (except localhost; _WORKER=local)
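For anyone wanting to do the same, the relevant piece is the NEB module config. This is a sketch from memory, so double-check the option names against the mod_gearman docs; the path and port are just the common defaults, not necessarily yours:

```ini
# /etc/mod_gearman/mod_gearman_neb.conf -- sketch, not my exact file
server=localhost:4730
hosts=yes
services=yes
eventhandler=yes
# route checks to queues chosen by the _WORKER custom variable;
# a host/service defined with _WORKER=local is executed by the
# Nagios core itself instead of going through gearman
queue_custom_variable=WORKER
```

Then localhost's host definition just carries `_WORKER local` as a custom variable.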
I also re-read this thread in full and changed the nom_checkpoint_interval parameter from 1440 to 90 as instructed.
I've lowered the httpd.conf defaults (RHEL 6.5): 1.5G at most available, with 20 children
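Something along these lines in httpd.conf (a sketch; the exact numbers depend on how much RAM each Apache child actually uses on your box, so size MaxClients to fit your spare memory):

```apache
# prefork MPM scaled down for ~1.5G of spare RAM (Apache 2.2 on RHEL 6)
<IfModule prefork.c>
    StartServers         5
    MinSpareServers      5
    MaxSpareServers     10
    ServerLimit         20
    MaxClients          20
    MaxRequestsPerChild 4000
</IfModule>
```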
I've enabled memcached (128 MB cache with a 5-second retention)
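On RHEL 6 the cache size is set in /etc/sysconfig/memcached (the 5-second retention is set on the application side as the key TTL, not here; these values are mine, adjust to taste):

```sh
# /etc/sysconfig/memcached -- sketch
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="128"
OPTIONS="-l 127.0.0.1"
```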
I've run a full repair on both the nagios and nagiosql databases (online, and once offline when I implemented mod_gearman during the weekend's change window)
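The online pass was just mysqlcheck (credentials and database names are whatever yours are; on MyISAM tables this is safe to run live):

```shell
# repair both databases while MySQL is running
mysqlcheck --repair -u root -p nagios
mysqlcheck --repair -u root -p nagiosql
```

For the offline pass, stop nagios and mysqld first and run the repair against the stopped instance.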
I've cleaned up all configuration files, including adding a dummy contact that emails no one on certain info checks, so that when Nagios restarts it produces no errors or warnings of any type. No dups either; clean output
We run 664 host checks and 3349 service checks, using just about every script available, including some that I have written or modified to fit my environment. I have WAN packet dropping (another issue the network team has been working on with IPS for months now), for which I've written wrapper scripts to cope with the SNMP no-data returns... there is a lot going on in this single system, and this upgrade-to-spikes was not in my planning, or I would have waited, as some would today (if they knew)
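The wrapper idea is simple enough to sketch. This is a hypothetical reconstruction, not my actual script: retry the real check a few times and return UNKNOWN instead of a hard failure when SNMP just comes back empty on the lossy WAN link. `wrap_check` is my name for it; any check command goes after it:

```shell
# wrap_check: run a check command, retry on empty output,
# and fall back to UNKNOWN (exit 3) instead of flapping CRITICAL
wrap_check() {
    RETRIES=3
    for i in $(seq 1 $RETRIES); do
        OUT=$("$@" 2>&1)
        RC=$?
        # any real output -> trust the plugin, pass it straight through
        if [ -n "$OUT" ]; then
            echo "$OUT"
            return $RC
        fi
        sleep 1
    done
    echo "UNKNOWN - no data after $RETRIES tries (WAN packet loss?)"
    return 3
}
```

The return codes follow the standard Nagios plugin convention (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN), so Nagios treats a silent SNMP agent as UNKNOWN rather than paging anyone.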
We also have the mk_livestatus broker_module for NagVis, check_iftraff.pl (modified) for bandwidth measurements, and now mod_gearman for Nagios 4 (which has yet another bug, too: "\n")
The main system resides in an HA vSphere/vCenter 5.5 environment, with 2 vCPU (although it has never used more than 1) and 4G RAM... vmstat reports 3.3G in use... I also pushed the 400 MB of swap back into memory over the weekend and saw some I/O improvements (swapoff -a && swapon -a)
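For anyone repeating this: it only works if you actually have the free RAM to absorb the swapped pages, so check first (a sketch; needs root):

```shell
free -m                  # make sure free memory exceeds swap used
swapoff -a && swapon -a  # pull pages back into RAM, then re-enable swap
vmstat 1 5               # watch the si/so columns settle back to 0
```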
One other item to point out... when I upgraded, I upgraded the RH packages and kernel first, rebooted, then upgraded XI... the system has remained online since, and in the past it has been flawless without a second reboot
kernel = 2.6.32-431.17.1.el6.x86_64 #1 SMP
and of course the errata for this kernel:
https://rhn.redhat.com/errata/RHSA-2014-0475.html
Not sure anymore... I just need a Nagios XI system that can run at a level pace, and mod_gearman has at least reduced the really high spikes.
ADDED:
The real kicker is the fact that I run a personal free XI on my home hypervisor for my development and some external checks of client sites I also work for (and run XI internally for them)... nothing spikes there on CentOS 6.5. It's 1 vCPU, 2G RAM, with 7 hosts and 350 service checks.
kernel 2.6.32-431.11.2.el6.x86_64 #1 SMP
Anyone out there running 2014 on CentOS with this type of environment setup and no load spikes? Or does this spike headache hit everyone on 2014 (RH or CentOS)?