Mod Gearman Upgrade screws up nagios monitoring

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Mod Gearman Upgrade screws up nagios monitoring

Post by rajasegar »

Please help, production is down.
It seems ndomod.o is causing the problem:

Code:

[1456806210] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1456806210] Error: Module loading failed. Aborting.

Update: this was caused by the Gearman upgrade.
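
For anyone hitting the same "Module loading failed" message: a quick first check is to list the broker modules that nagios.cfg actually tries to load. The snippet below is only a minimal sketch. The default path is the one used elsewhere in this thread, and the function name list_broker_modules is made up for illustration.

```shell
#!/bin/sh
# List active (uncommented) broker_module lines in a Nagios config file,
# prefixed with their line numbers.
# The default path comes from this thread; pass another file to override it.
# list_broker_modules is a hypothetical helper name, not a Nagios tool.
list_broker_modules() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    grep -n '^[[:space:]]*broker_module=' "$cfg"
}

# Usage (not run here): list_broker_modules
```

Each line it prints names a module file and, usually, a config= argument. Both are worth checking against what is really on disk after an upgrade.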
Last edited by rajasegar on Tue Mar 01, 2016 1:33 am, edited 2 times in total.
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Nagios died after upgrade from 5.2.3 to 5.2.5

Post by rajasegar »

Last few lines of nagios.debug:

Code:

[1456807053.953708] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953715] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953721] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953727] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953733] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953740] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953746] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953752] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953759] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953766] [064.2] [pid=3532] Callback #1 (type 2) return code = 0
[1456807053.953772] [064.0] [pid=3532] Module '/usr/local/nagios/bin/ndomod.o' loaded with return code of '0'
[1456807053.953775] [064.0] [pid=3532] nebmodule_deinit() found
[1456807053.953781] [064.1] [pid=3532] Making callbacks (type 2)...
[1456807053.953788] [064.2] [pid=3532] Callback #1 (type 2) return code = 0

rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Nagios died after upgrade from 5.2.3 to 5.2.5

Post by rajasegar »

The problem was caused by the instructions for updating gearmand to fix the resource leak problem.

Code:

cd /tmp
yum remove libgearman-devel libgearman gearmand mod_gearman
mkdir gearman_install
cd gearman_install/
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-devel-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-server-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/mod_gearman2-2.1.1-1.rhel6.x86_64.rpm
yum --nogpgcheck localinstall *
sed -i 's/\(^broker_module=.*mod_gearman.*\)/#\1/' /usr/local/nagios/etc/nagios.cfg
echo "broker_module=/usr/lib64/mod_gearman2/mod_gearman2.o config=/etc/mod_gearman/mod_gearman_neb.conf eventhandler=no" >> /usr/local/nagios/etc/nagios.cfg
service nagios stop
service mod_gearman_worker stop
service gearmand stop
service gearmand start
service mod_gearman_worker start
service nagios start

After the upgrade, the workers run as the naemon user, and most of the monitoring stopped working due to permission issues.

The mod_gearman_worker service no longer exists; it is now called mod-gearman2-worker.

The configuration files have also moved from /etc/mod_gearman to /etc/mod_gearman2.

The configuration files are now named module.conf and worker.conf, replacing mod_gearman_neb.conf and mod_gearman_worker.conf.
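
To see which of the two layouts a box actually has, something like the following could be used. It is a rough sketch that only checks for the paths and file names mentioned in this post. The function name report_gearman_layout and its optional root-prefix argument are hypothetical, added so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Report which mod_gearman configuration files are present.
# The paths and file names are the ones mentioned in this post.
# The optional first argument is a hypothetical root prefix (empty = real /),
# which makes the check easy to try out in a sandbox directory.
report_gearman_layout() {
    root="${1:-}"
    for f in \
        etc/mod_gearman/mod_gearman_neb.conf \
        etc/mod_gearman/mod_gearman_worker.conf \
        etc/mod_gearman2/module.conf \
        etc/mod_gearman2/worker.conf
    do
        if [ -f "$root/$f" ]; then
            echo "present: /$f"
        else
            echo "missing: /$f"
        fi
    done
}

# Usage (not run here): report_gearman_layout
```

If only the /etc/mod_gearman2 files show up as present, a broker_module line that still passes config=/etc/mod_gearman/mod_gearman_neb.conf would be pointing at a file that is not there, which may be worth ruling out.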

I would appreciate it if Nagios could update the instructions above so that others avoid the issue I ran into.
Thanks
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by lmiltchev »

rajasegar wrote:
The problem was caused by the instructions for updating gearmand to fix the resource leak problem.
Can you provide us with a link to these instructions? Thank you!
Be sure to check out our Knowledgebase for helpful articles and solutions!
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by rajasegar »

lmiltchev wrote:
The problem was caused by the instructions for updating gearmand to fix the resource leak problem.
Can you provide us with a link to these instructions? Thank you!
Here you go:

https://support.nagios.com/forum/viewto ... 20#p167366
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by tmcdonald »

Those instructions were from Jan 11, and the same developer posted updated instructions on Jan 28:

https://support.nagios.com/forum/viewto ... 30#p169710

Did you see and/or follow those instructions? I don't believe they have changed since then.
Former Nagios employee
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by Box293 »

FYI all of this has been incorporated into an automated script and is part of our official documentation:

https://support.nagios.com/kb/article.php?id=225
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by rajasegar »

Box293 wrote:FYI all of this has been incorporated into an automated script and is part of our official documentation:

https://support.nagios.com/kb/article.php?id=225
Does the script take care of the user account that the mod_gearman worker runs as?
It is supposed to be nagios; mine ended up being naemon.
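
For reference, this is roughly how the worker's user can be checked. It is a minimal sketch: the function name filter_worker_users is made up, and the default pattern mod_gearman is an assumption about how the worker shows up in the process list.

```shell
#!/bin/sh
# Print the distinct users owning processes whose command line matches a pattern.
# Reads "user args..." lines on stdin so it can be fed straight from ps.
# The default pattern 'mod_gearman' is an assumption about the worker's
# command line; filter_worker_users is a hypothetical helper name.
filter_worker_users() {
    pattern="${1:-mod_gearman}"
    # Skip the awk process itself, whose arguments also contain the pattern.
    awk -v pat="$pattern" '$0 ~ pat && $0 !~ /awk/ { print $1 }' | sort -u
}

# Usage (not run here): ps -e -o user= -o args= | filter_worker_users
```

If this prints naemon rather than nagios, the workers are running under the wrong account for a setup like mine.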

Anyway, I will try the updated script the next time I rebuild an instance.

You can close the case for now.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by Box293 »

I will follow this up with the devs; I think you've identified something here.
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Mod Gearman Upgrade screws up nagios monitoring

Post by bheden »

Does the script take care of the user account that the mod_gearman worker runs as?
It is supposed to be nagios; mine ended up being naemon.
It does now!

The ModGearmanInstall.sh script (https://assets.nagios.com/downloads/nag ... Install.sh) has been updated to reflect the proper user, and gives you a user flag to override the default if necessary.

Other than that, the instructions located at https://assets.nagios.com/downloads/nag ... ios_XI.pdf are still mostly relevant. I also changed the installation type flag to --type=worker or --type=server.
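
As a rough illustration of the new type flag, the sketch below builds the command line for a given role and prints it rather than running it. The helper name build_install_cmd is hypothetical, and it assumes the script is invoked from the current directory. The user override flag is left out because its exact name is not given here.

```shell
#!/bin/sh
# Build (but do not run) the install command for a given role.
# Only the --type=worker / --type=server values mentioned in this post are accepted.
# build_install_cmd is a hypothetical helper; running the script from the
# current directory as ./ModGearmanInstall.sh is an assumption.
build_install_cmd() {
    case "$1" in
        worker|server)
            echo "./ModGearmanInstall.sh --type=$1"
            ;;
        *)
            echo "usage: build_install_cmd worker|server" >&2
            return 1
            ;;
    esac
}

# Usage (not run here): build_install_cmd worker
```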

We'll be updating the documentation shortly to reflect those changes.

Let us know how it goes. Thanks.

Nagios Enterprises
Senior Developer