Mod Gearman Upgrade screws up nagios monitoring
Posted: Mon Feb 29, 2016 11:30 pm
by rajasegar
Please help, production is down.
It seems ndomod.o is causing the problem.
Code:
[1456806210] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1456806210] Error: Module loading failed. Aborting.
Update: this was caused by the gearman upgrade.
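Since the log above shows ndomod.o initializing before the "Module loading failed" line, one thing worth checking is whether every module listed in nagios.cfg actually exists on disk. A minimal sketch, assuming the config path used later in this thread; the function name check_broker_modules is made up for illustration:

```shell
#!/bin/sh
# Print OK or MISSING for each active broker_module path in a Nagios config.
# check_broker_modules is a hypothetical helper, not part of Nagios.
check_broker_modules() {
    cfg="$1"
    # Take the first token after "broker_module=" as the module path.
    # Commented-out lines (starting with #) are skipped by the ^ anchor.
    sed -n 's/^broker_module=\([^[:space:]]*\).*/\1/p' "$cfg" |
    while IFS= read -r mod; do
        if [ -f "$mod" ]; then
            echo "OK $mod"
        else
            echo "MISSING $mod"
        fi
    done
}

# Config path as used elsewhere in this thread; skipped if it is not present.
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg
if [ -f "$NAGIOS_CFG" ]; then
    check_broker_modules "$NAGIOS_CFG"
fi
```

A MISSING line only says the file is not there. A module that exists but fails to initialize would need the nagios log to diagnose.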
Re: Nagios died after upgrade from 5.2.3 to 5.2.5
Posted: Mon Feb 29, 2016 11:38 pm
by rajasegar
Last few lines of nagios.debug:
Code:
[1456807053.953708] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953715] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953721] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953727] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953733] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953740] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953746] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953752] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953759] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953766] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953772] [064.0] [pid=3532] Module '/usr/local/nagios/bin/ndomod.o' loaded with return code of '0'
[1456807053.953775] [064.0] [pid=3532] nebmodule_deinit() found
[1456807053.953781] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953788] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
Re: Nagios died after upgrade from 5.2.3 to 5.2.5
Posted: Mon Feb 29, 2016 11:55 pm
by rajasegar
The problem was caused by following the instructions to update gearmand to fix the resource leak problem:
Code:
cd /tmp
yum remove libgearman-devel libgearman gearmand mod_gearman
mkdir gearman_install
cd gearman_install/
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-devel-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-server-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/mod_gearman2-2.1.1-1.rhel6.x86_64.rpm
yum --nogpgcheck localinstall *
sed -i 's/\(^broker_module=.*mod_gearman.*\)/#\1/' /usr/local/nagios/etc/nagios.cfg
echo "broker_module=/usr/lib64/mod_gearman2/mod_gearman2.o config=/etc/mod_gearman/mod_gearman_neb.conf eventhandler=no" >> /usr/local/nagios/etc/nagios.cfg
service nagios stop
service mod_gearman_worker stop
service gearmand stop
service gearmand start
service mod_gearman_worker start
service nagios start
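To see what the sed line in those instructions does before running it against the real nagios.cfg, it can be tried on a throwaway file. This is only a sketch: the function name is made up, the sample file contents are invented for illustration, and the expression is an equivalent form of the one above with the ^ anchor moved outside the group.

```shell
#!/bin/sh
# Comment out every active broker_module line that mentions mod_gearman.
# disable_mod_gearman_broker is a hypothetical helper name.
disable_mod_gearman_broker() {
    cfg="$1"
    tmp="$(mktemp)" || return 1
    # Write to a temp file first, then copy back, instead of relying on sed -i.
    sed 's/^\(broker_module=.*mod_gearman.*\)/#\1/' "$cfg" > "$tmp" &&
        cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Sample config with made-up contents, just to show the effect.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
broker_module=/usr/local/nagios/bin/ndomod.o config=/usr/local/nagios/etc/ndomod.cfg
broker_module=/usr/lib64/mod_gearman/mod_gearman.o config=/etc/mod_gearman/mod_gearman_neb.conf eventhandler=no
EOF
disable_mod_gearman_broker "$sample"
cat "$sample"
```

Note that the pattern mod_gearman also matches mod_gearman2, so running the same expression again after the new broker_module line has been appended would comment out that line as well.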
After the upgrade, the workers run as the naemon user, and most of the monitoring stopped working because of permission issues.
There is no longer a mod_gearman_worker service; it is now called mod-gearman2-worker.
The configuration files are no longer in /etc/mod_gearman but in /etc/mod_gearman2.
The configuration files are now named module.conf and worker.conf; mod_gearman_neb.conf and mod_gearman_worker.conf are gone.
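A quick way to tell which of the two layouts a box ended up with is to look for the directories named above. A rough sketch, assuming only the names mentioned in this post; the function name and the optional root argument are made up so it can be tried against a scratch directory:

```shell
#!/bin/sh
# Report which mod_gearman layout is present under an optional root prefix.
# mod_gearman_layout is a hypothetical helper; service and file names are
# the ones quoted in this thread.
mod_gearman_layout() {
    root="${1:-}"
    if [ -d "$root/etc/mod_gearman2" ]; then
        # If both directories exist, the newer layout is reported.
        echo "new: service=mod-gearman2-worker conf=$root/etc/mod_gearman2/worker.conf"
    elif [ -d "$root/etc/mod_gearman" ]; then
        echo "old: service=mod_gearman_worker conf=$root/etc/mod_gearman/mod_gearman_worker.conf"
    else
        echo "none"
    fi
}

# With no argument this inspects the live /etc.
mod_gearman_layout
```

Any restart commands or broker_module lines can then be checked against whichever layout is reported, rather than assuming the old names.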
I would appreciate it if Nagios could update the instructions above so that others can avoid the issue I ran into.
Thanks
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Tue Mar 01, 2016 3:26 pm
by lmiltchev
rajasegar wrote: Problem was due to instructions to update gearmand to fix the resource leak problem.
Can you provide us with a URL for these instructions? Thank you!
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Tue Mar 01, 2016 7:53 pm
by rajasegar
lmiltchev wrote: Can you provide us with a URL link to these instructions? Thank you!
Here you go:
https://support.nagios.com/forum/viewto ... 20#p167366
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Wed Mar 02, 2016 12:32 pm
by tmcdonald
Those instructions were from Jan 11, and the same developer posted updated instructions on Jan 28:
https://support.nagios.com/forum/viewto ... 30#p169710
Did you see and/or follow those instructions? I don't believe they have changed since then.
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Wed Mar 02, 2016 6:02 pm
by Box293
FYI, all of this has been incorporated into an automated script and is part of our official documentation:
https://support.nagios.com/kb/article.php?id=225
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Wed Mar 02, 2016 9:52 pm
by rajasegar
Does the script take care of the user account that the mod gearman worker runs as?
It is supposed to be nagios; mine ended up being naemon.
Anyway, I will try the updated script the next time I rebuild an instance.
You can close the case for now.
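For anyone wanting to check which account their workers run as, something along these lines might help. It is a sketch under assumptions: it assumes the worker's command path contains both "mod_gearman" and "worker", the function name is made up, and the sample process lines are invented for illustration.

```shell
#!/bin/sh
# Read "USER COMMAND ARGS..." lines on stdin and print the distinct users
# that own a mod_gearman worker process.
# worker_users is a hypothetical helper; the match on the command name is
# an assumption about how the worker binary is called.
worker_users() {
    awk '$2 ~ /mod_gearman.*worker/ { print $1 }' | sort -u
}

# Demonstration with made-up process lines. On a live system the input
# would come from something like: ps -eo user=,args= | worker_users
printf '%s\n' \
    'nagios /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg' \
    'naemon /usr/bin/mod_gearman2_worker -d --config=/etc/mod_gearman2/worker.conf' |
    worker_users
```

If the user printed is not nagios, checks that depend on files or sudo rules set up for the nagios user are likely to fail in the way described above.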
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Wed Mar 02, 2016 10:02 pm
by Box293
I will follow this up with the devs; I think you've identified something here.
Re: Mod Gearman Upgrade screws up nagios monitoring
Posted: Thu Mar 03, 2016 3:32 pm
by bheden
rajasegar wrote: Does the script take care of the user credential that the mod gearman worker is running as?
It is supposed to be nagios; mine ended up being naemon.
It does now!
The ModGearmanInstall.sh script (https://assets.nagios.com/downloads/nag ... Install.sh) has been updated to reflect the proper user, and gives you a user flag to override the default if necessary.
Other than that, the instructions located at https://assets.nagios.com/downloads/nag ... ios_XI.pdf are mostly relevant. I also changed the installation type flag to --type=worker or --type=server.
We'll be updating the documentation shortly to reflect those changes.
Let us know how it goes. Thanks.