mod-gearman issues

fkroeger
Posts: 38
Joined: Mon Jan 18, 2010 10:45 pm
Location: Perth, Western Australia

mod-gearman issues

Post by fkroeger »

I've just installed mod-gearman as per the Nagios doc - "Integrating Mod-Gearman With Nagios XI"
I have downloaded & installed the rpm for 64-bit CentOS/RHEL 6 Nagios servers (NagiosXI 2014) - no problems with the yum install.

The gearmand service starts but then exits.

Code: Select all

# service gearmand status
gearmand dead but pid file exists
The log file only shows messages for the worker

Code: Select all

Jul 23 15:13:34 bdcvngs006 mod_gearman_worker: *** glibc detected *** /usr/bin/mod_gearman_worker: double free or corruption (!prev): 0x0000000000f64770 ***
Jul 23 15:13:34 bdcvngs006 mod_gearman_worker: *** glibc detected *** /usr/bin/mod_gearman_worker: double free or corruption (!prev): 0x0000000000f64770 ***
Jul 23 15:13:34 bdcvngs006 mod_gearman_worker: *** glibc detected *** /usr/bin/mod_gearman_worker: double free or corruption (!prev): 0x0000000000f64770 ***
Jul 23 15:13:34 bdcvngs006 mod_gearman_worker: *** glibc detected *** /usr/bin/mod_gearman_worker: double free or corruption (!prev): 0x0000000000f64770 ***
Jul 23 15:13:34 bdcvngs006 mod_gearman_worker: *** glibc detected *** /usr/bin/mod_gearman_worker: double free or corruption (!prev): 0x0000000000f64770 ***
Jul 23 15:13:34 bdcvngs006 kernel: Pid 9490(mod_gearman_wor) over core_pipe_limit
Jul 23 15:13:34 bdcvngs006 kernel: Skipping core dump
Jul 23 15:13:34 bdcvngs006 abrt[10649]: Not dumping repeating crash in '/usr/bin/mod_gearman_worker'
Jul 23 15:13:34 bdcvngs006 kernel: abrt-hook-ccpp[10648]: segfault at 60 ip 0000003300d3357f sp 00007fff89602ea8 error 4 in libc-2.12.so[3300c00000+18a000]
Jul 23 15:13:34 bdcvngs006 kernel: Process 10648(abrt-hook-ccpp) has RLIMIT_CORE set to 1
Jul 23 15:13:34 bdcvngs006 kernel: Aborting core
gearman_top shows

Code: Select all

15-07-23 15:15:01  -  localhost:4730

 failed to connect to localhost:4730 - Connection refused
ps shows the worker running, even though errors are listed in messages, but there is no gearmand process

Code: Select all

# ps -ef | grep gear
nagios    1134     1  0 14:57 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12618  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12619  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12620  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12621  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12622  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   12870  1134  0 15:17 ?        00:00:00 /usr/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid

I am running NagiosXI 2014R2.7 on a Nagios supplied VM - CentOS 6.3

There really is nothing else in the doco to explain what to do/check if there is a problem.
Can you advise what I need to check next ?

regards.... Fred
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: mod-gearman issues

Post by jdalrymple »

Try this:

Code: Select all

rm /var/run/gearmand/gearmand.pid
service gearmand start
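
Before deleting the pid file it can help to confirm that it really is stale. The following is a minimal sketch, not part of the original reply: the function name pidfile_state is made up for illustration, and it assumes a Linux /proc filesystem (as on CentOS 6).

```shell
# Sketch: classify a pid file as missing, empty, running or stale.
# ASSUMPTION: Linux, where /proc/<pid> exists for every live process.
pidfile_state() {
    local pidfile="$1" pid=""
    if [ ! -e "$pidfile" ]; then
        echo "missing"
        return 0
    fi
    # Read the first line; an empty file leaves pid empty.
    read -r pid < "$pidfile" || true
    case "$pid" in
        ''|*[!0-9]*) echo "empty"; return 0 ;;   # no usable pid recorded
    esac
    if [ -d "/proc/$pid" ]; then
        echo "running"
    else
        echo "stale"
    fi
}

# Example call, using the path from the reply above:
#   pidfile_state /var/run/gearmand/gearmand.pid
```

A result of "stale" or "empty" matches the "dead but pid file exists" status, in which case removing the file as suggested is harmless.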
fkroeger
Posts: 38
Joined: Mon Jan 18, 2010 10:45 pm
Location: Perth, Western Australia

Re: mod-gearman issues

Post by fkroeger »

Did as suggested.
A new pid file is created (and updated) when the service is started, but gearmand still fails to start and the pid file is empty.
Here are the versions that were installed:

Code: Select all

# rpm -qa | grep gearman
gearmand-1.1.8-2.el6.x86_64
libgearman-1.1.8-2.el6.x86_64
mod_gearman-1.5.0b1-1.el6.x86_64
fkroeger
Posts: 38
Joined: Mon Jan 18, 2010 10:45 pm
Location: Perth, Western Australia

Re: mod-gearman issues

Post by fkroeger »

According to my good friend "Google" it appears that if gearmand can't bind to an IPv6 socket it won't start.
https://bugs.launchpad.net/gearmand/+bug/1134534

The workaround is to add the -L <X.X.X.X> option, which forces gearmand to bind to an IPv4 address.
With this in place the gearmand service starts OK and gearman_top displays OK, but only if I pass the IP address, e.g. gearman_top -H X.X.X.X
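
The post does not say where the -L option was added. One possibility, shown here only as a sketch, is an OPTIONS line in a sysconfig-style file read by the init script. The file location /etc/sysconfig/gearmand, the OPTIONS variable, the function name set_listen_option and the address 192.0.2.10 are all assumptions for illustration; check the init script on your own system before relying on any of them.

```shell
# Sketch: make a sysconfig-style file carry OPTIONS="-L <addr>".
# ASSUMPTION: the gearmand init script sources this file and passes
# $OPTIONS to gearmand. Any existing OPTIONS line is replaced, so
# re-add other options by hand if you had any.
set_listen_option() {
    local file="$1" addr="$2" tmp
    [ -f "$file" ] || return 1
    tmp="$(mktemp)" || return 1
    # Keep every line except an existing OPTIONS= assignment.
    grep -v '^[[:space:]]*OPTIONS=' "$file" > "$tmp" || true
    printf 'OPTIONS="-L %s"\n' "$addr" >> "$tmp"
    # Write back in place so the file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example call (path and address are placeholders):
#   set_listen_option /etc/sysconfig/gearmand 192.0.2.10
```

After a change like this, restarting with service gearmand start and checking with gearman_top -H followed by the same address should show whether the daemon is now listening.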

This was a problem with gearmand 1.0.3 - gearmand 1.1.8 is installed and it's still a problem?

Considering I am using a standard Nagios supplied VM, why is there no mention of this in the Integrating Mod-Gearman With Nagios XI document?

BTW - I also just found out the hard way that you also need to replace 'localhost' with the server IP address in the mod_gearman_neb.conf and mod_gearman_worker.conf files
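
A rough sketch of that last change, assuming the config files contain a line of the form server=localhost:4730. The function name use_server_ip and the address 192.0.2.10 are placeholders; the worker config path comes from the ps output earlier in the thread, while the neb config path is a guess.

```shell
# Sketch: point mod_gearman config files at an explicit IPv4 address.
# ASSUMPTION: the files use "server=localhost:<port>" lines.
use_server_ip() {
    local ip="$1" file
    shift
    for file in "$@"; do
        # Only lines starting with server=localhost are changed;
        # the original file is kept alongside as <file>.bak.
        sed -i.bak "s/^server=localhost/server=${ip}/" "$file"
    done
}

# Example call (the neb.conf location is assumed):
#   use_server_ip 192.0.2.10 \
#       /etc/mod_gearman/mod_gearman_worker.conf \
#       /etc/mod_gearman/mod_gearman_neb.conf
```

The worker and Nagios itself would most likely need a restart afterwards to pick up the new address.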

regards... Fred
eloyd
Cool Title Here
Posts: 2173
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: mod-gearman issues

Post by eloyd »

mod_gearman is not a Nagios supported product and is maintained by a third-party. It would be unreasonable to expect that Nagios Enterprises staff can maintain every document that says "how to integrate Nagios with XXX" on a continual basis for third-party maintained products.

I believe that your system is actually at fault, not mod_gearman, in that you have IPv6 enabled on your server, therefore mod_gearman thinks that it should be using it. I cannot tell for sure without access, but that is typical of IPv6 enabled software.
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic! • Join our public Nagios Discord Server!
fkroeger
Posts: 38
Joined: Mon Jan 18, 2010 10:45 pm
Location: Perth, Western Australia

Re: mod-gearman issues

Post by fkroeger »

Not sure I agree with you.
I understand that it is a 3rd party product, however it is a Nagios document and the rpm is the one available from the Nagios site.
The rpm is quite old compared to the latest stable version from the 3rd party site - however I installed the version recommended in the Nagios document.

"My system" is the standard Nagios supplied VM and IPv6 is not enabled. Read the link I supplied, which states that gearmand fails to start if it can't bind to IPv6.
eloyd
Cool Title Here
Posts: 2173
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: mod-gearman issues

Post by eloyd »

I did read your link. My understanding is that mod_gearman only tries to bind if your system is IPv6 enabled. But I suppose that's neither here nor there if you have a working system.

My point was more that I do think that you cannot fault Nagios for not having a 100% up-to-date, completely accurate document for how to integrate a new version of Nagios with an old version of third-party software. If anything, I would expect that the mod_gearman people would provide an updated document on how to integrate with Nagios, not the other way around.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: mod-gearman issues

Post by jdalrymple »

fkroeger,

You're right: if you ran into these problems while integrating with our vanilla VM, then our documentation is lacking.

I will create a documentation task for myself to get the problems you encountered, and their resolutions, noted. Thank you very much for sharing your findings; we'll be sure to get our documentation updated accordingly. It's very important to us that your installation and everyone else's works smoothly. We apologize that this became an issue in the first place.

OK to lock the topic?
fkroeger
Posts: 38
Joined: Mon Jan 18, 2010 10:45 pm
Location: Perth, Western Australia

Re: mod-gearman issues

Post by fkroeger »

Yes you can close and no need to apologise as I understand that environments change and it's almost impossible to keep all those "balls in the air".
And to be fair, the VM I'm running is over a year old now, so the latest NagiosXI VM may not even have a problem compared to the one I'm currently running.
I've been digging a bit deeper and I've discovered some differences in the VMs that I downloaded a few months apart. They're both CentOS 6.3 but they had different sysctl.conf files.
One of them had an entry for fs.file-max=4097 which I only discovered when I started getting messages in the log file saying that I had reached the limit of open files.
I know we have never changed this file, so I can only presume that this is how it was originally configured.
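
For comparing the two VMs, a small sketch like the following pulls the configured value out of a sysctl.conf-style file. The function name file_max_from_conf is made up for illustration, and it only handles simple key=value lines with "#" comments.

```shell
# Sketch: print the last fs.file-max value set in a sysctl.conf-style file.
# Prints nothing if the key is not set in the file.
file_max_from_conf() {
    awk -F= '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { key = $1; gsub(/[[:space:]]/, "", key) }     # normalise the key
        key == "fs.file-max" { val = $2; gsub(/[[:space:]]/, "", val) }
        END { if (val != "") print val }
    ' "$1"
}

# Example call:
#   file_max_from_conf /etc/sysctl.conf
```

The value actually in effect can be read with sysctl -n fs.file-max, and /proc/sys/fs/file-nr shows how many file handles are currently allocated against that limit.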

Regards... Fred