Checks stop running randomly

KevinD
Posts: 26
Joined: Thu Mar 29, 2012 10:26 am

Re: Checks stop running randomly

Post by KevinD »

Alas... no change...

Ran the DB upgrade. I had to make some minor tweaks because the upgrade changed primary key columns from varchar to text (which doesn't work in MySQL), but nothing major.

The count on the frozen checks continues to rise until we trigger the script to force a check.

I would LOVE to do a remote session.
Feel free to email, PM, USPS, FedEx, smoke signals, or telepathically transmit that and we can get something arranged for 2 minutes from now if you're available.
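
For other readers: the script used here to force a check isn't shown in this thread, so the following is only a minimal sketch of one way it could be done. It assumes Nagios Core's standard SCHEDULE_FORCED_SVC_CHECK external command; the function name, host name, service name, and command file path are all hypothetical.

```shell
# Hypothetical helper: print an external command line that asks Nagios to
# run a service check immediately, regardless of its normal schedule.
force_check_line() {
    host="$1"
    service="$2"
    now=$(date +%s)   # current time as a Unix timestamp
    # Format: [entry_time] SCHEDULE_FORCED_SVC_CHECK;host;service;check_time
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$now" "$host" "$service" "$now"
}

# Example usage (assumed path; check the command_file setting in nagios.cfg):
# force_check_line web01 "HTTP" > /usr/local/nagios/var/rw/nagios.cmd
```

This only works around the symptom; it doesn't explain why the checks freeze in the first place.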
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Checks stop running randomly

Post by mguthrie »

Just a follow-up for other users who may read this. This issue appears to be caused by using DNX with the Nagios Core 3.4.1 engine. Not sure of the exact cause yet. More detail will be posted as we find out.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Checks stop running randomly

Post by mguthrie »

Hey, what version of Nagios Core was running on the DNX slave machine(s)?
gwakem
Posts: 238
Joined: Mon Jan 23, 2012 2:02 pm
Location: Asheville, NC

Re: Checks stop running randomly

Post by gwakem »

Those were upgraded from r2.3 to r3.1 also, so Nagios Core 3.4.1. They have no configuration, so their databases are local and we didn't need to modify the upgrade scripts.
--
Griffin Wakem
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Checks stop running randomly

Post by mguthrie »

Did my best to try and hunt this one down, but tracing it through the event timing loop is pretty brutal. I actually passed the question on to the nagios-devel mailing list on SourceForge to see if they have any ideas.
gwakem
Posts: 238
Joined: Mon Jan 23, 2012 2:02 pm
Location: Asheville, NC

Re: Checks stop running randomly

Post by gwakem »

We appreciate your work and help on tracking it down. I'm going to start investigating mod_gearman on Monday, since it seems to be highly recommended. Any pointers or docs that have proven helpful for gotchas would be appreciated, but I'll be scouring the page as well as the documents section next week. Have a great weekend, guys.
--
Griffin Wakem
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Checks stop running randomly

Post by scottwilkerson »

Mod-Gearman has the required items in RPM form now at http://mod-gearman.org/download/ so the install looks something like this:

Code:

cd /tmp
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/gearmand-0.25-1.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/gearmand-devel-0.25-1.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/mod_gearman-1.3.0-1.e.rhel6.x86_64.rpm
yum --nogpgcheck -y install gearmand-0.25-1.rhel6.x86_64.rpm
yum --nogpgcheck -y install gearmand-devel-0.25-1.rhel6.x86_64.rpm
yum --nogpgcheck -y install mod_gearman-1.3.0-1.e.rhel6.x86_64.rpm

You will need to add the NEB module to nagios.cfg, something like this:

Code:

broker_module=/usr/local/share/nagios/mod_gearman.o keyfile=/usr/local/share/nagios/secret.txt server=localhost eventhandler=yes hosts=yes services=yes

On each of the workers, adjust the following values in /etc/mod_gearman/mod_gearman_worker.conf.

The server address is that of your gearmand server (usually the Nagios server):
server=localhost:4730

The shared key or keyfile needs to be the same on both the workers and the gearmand server:
key=should_be_changed
or
keyfile=
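
If you have several workers to update, the edits above could be scripted. This is only a rough sketch, not something shipped with Mod-Gearman: the helper name set_worker_option is made up for illustration, it assumes GNU sed (as on RHEL), and it assumes values contain no `|` or `&` characters.

```shell
# Hypothetical helper: set key=value in a mod_gearman worker config file.
# Replaces an existing uncommented "key=" line, or appends one if none exists.
set_worker_option() {
    conf="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$conf"; then
        # '|' is used as the sed delimiter so values containing '/' work.
        sed -i "s|^${key}=.*|${key}=${value}|" "$conf"
    else
        printf '%s=%s\n' "$key" "$value" >> "$conf"
    fi
}

# Example usage (the server name is a placeholder for your gearmand host):
# set_worker_option /etc/mod_gearman/mod_gearman_worker.conf server nagios.example.com:4730
```

Restart the worker afterwards so it picks up the new values.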
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Checks stop running randomly

Post by mguthrie »

We were finally able to locate and correct the check issue with DNX. The fix will be available in the upcoming 3.3 release, but if you'd like a patch sooner than that, let me know and I'll get that to you. (Not sure if you moved to Mod-Gearman or not.)
gwakem
Posts: 238
Joined: Mon Jan 23, 2012 2:02 pm
Location: Asheville, NC

Re: Checks stop running randomly

Post by gwakem »

We did move to mod_gearman, and it is actually filling our needs a bit better. Thank you for letting us know though!
--
Griffin Wakem