Yessir. I'm not discounting the possibility that something is breaking within mod_gearman, but the configs are almost the same, save for the gearmand server settings.
Ah, thank you swilkerson, I think I found a clue. Can you turn off distributing event handlers in your gearman config? If you're using XI's notification handler, it won't be able to connect to the local database and submit any notifications.
We disabled the event handling portion of gearman entirely by removing it from both mod_gearman_neb.conf and nagios.cfg, restarted (per below), and within ten minutes noticed the same issues with processes hanging.
Could this issue be caused by certain check plugins timing out? Are you having multiple *parent* processes spawn, or just forks of the Nagios process? Nagios forks itself to run checks, so for longer-running checks you'll see many child instances of it running.
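If it helps to tell the two cases apart, here is a rough Python sketch that splits processes named nagios into top-level parents and forked children, based on the output of `ps -eo pid,ppid,comm`. The function name `classify_nagios_processes` and the sample PIDs are made up for illustration, and it assumes the process name shows up as plain `nagios` on your system.

```python
def classify_nagios_processes(ps_output, name="nagios"):
    """Split processes called `name` into top-level parents and forked children.

    `ps_output` is the text printed by `ps -eo pid,ppid,comm`.
    A process counts as a child when its parent is also called `name`.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # ignore malformed lines
        pid, ppid, comm = parts
        rows.append((int(pid), int(ppid), comm.strip()))

    nagios_pids = {pid for pid, _, comm in rows if comm == name}
    parents = [pid for pid, ppid, comm in rows
               if comm == name and ppid not in nagios_pids]
    children = [pid for pid, ppid, comm in rows
                if comm == name and ppid in nagios_pids]
    return parents, children


# Hypothetical sample output, not taken from a real system.
SAMPLE = """\
  PID  PPID COMMAND
    1     0 systemd
 2100     1 nagios
 2101  2100 nagios
 2102  2100 nagios
 3300     1 nagios
"""

parents, children = classify_nagios_processes(SAMPLE)
print("parents:", parents)
print("children:", children)
```

More than one entry in `parents` would point at multiple Nagios instances being started, while a long `children` list with a single parent is more consistent with checks that are slow or timing out.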
Aha! Yes indeed, we do have a lot of WMI plugins timing out due to multiple remote side rules. I was in the process of attempting to clear those up, and this gives me additional ammo to do so. Thanks, I will see if getting that cleared out helps and let you know.
This was a triumph. I'm making a note here: HUGE SUCCESS. It's hard to overstate my satisfaction.
That did it! Thanks for the help guys. We still have doubling in the /var/log/messages logfile, but that's not critical. I can open a separate post for that later. This can be closed. Thanks again!