Alas... no change...
Ran the DB upgrade (had to make some minor tweaks because primary key columns were being changed from varchar to text, which doesn't work in MySQL), but nothing major.
The count of frozen checks continues to rise until we trigger the script to force a check.
I would LOVE to do a remote session.
Feel free to email, PM, USPS, FedEx, smoke signals, or telepathically transmit that and we can get something arranged for 2 minutes from now if you're available.
Checks stop running randomly
Re: Checks stop running randomly
Just a follow-up for other users who may read this. The issue appears to be caused by using DNX with the Nagios Core 3.4.1 engine. Not sure of the exact cause yet. More detail will be posted as we find out.
Re: Checks stop running randomly
Hey, what version of Nagios Core was running on the DNX slave machine(s)?
Re: Checks stop running randomly
Those were upgraded from r2.3 to r3.1 also, so Nagios Core 3.4.1. They have no configuration, so their databases are local and we didn't need to modify the upgrade scripts.
--
Griffin Wakem
Re: Checks stop running randomly
Did my best to hunt this one down, but tracing it through the event timing loop is pretty brutal. I actually passed the question on to the nagios-devel mailing list on SourceForge to see if they have any ideas.
Re: Checks stop running randomly
We appreciate your work and help on tracking it down. I'm going to start investigating mod_gearman on Monday, since it seems to be highly recommended. Any pointers or docs that have proven helpful for gotchas would be appreciated, but I'll be scouring the page as well as the documents section next week. Have a great weekend, guys.
--
Griffin Wakem
scottwilkerson (DevOps Engineer, Nagios Enterprises)
Re: Checks stop running randomly
Mod-gearman now has the required items in RPM form at http://mod-gearman.org/download/ so the install looks something like this:
Code: Select all
cd /tmp
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/gearmand-0.25-1.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/gearmand-devel-0.25-1.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v1.3.0/rhel6/x86_64/mod_gearman-1.3.0-1.e.rhel6.x86_64.rpm
yum --nogpgcheck -y install gearmand-0.25-1.rhel6.x86_64.rpm
yum --nogpgcheck -y install gearmand-devel-0.25-1.rhel6.x86_64.rpm
yum --nogpgcheck -y install mod_gearman-1.3.0-1.e.rhel6.x86_64.rpm
You will need to add the NEB module to nagios.cfg, something like this:
Code: Select all
broker_module=/usr/local/share/nagios/mod_gearman.o keyfile=/usr/local/share/nagios/secret.txt server=localhost eventhandler=yes hosts=yes services=yes
On each of the workers we need to adjust the following values in the config at /etc/mod_gearman/mod_gearman_worker.conf.
The server address is that of your gearmand server (usually the Nagios server):
server=localhost:4730
The shared key or keyfile needs to be the same on both the workers and the gearmand server:
key=should_be_changed
or
keyfile=
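Since a mismatched key between a worker and the gearmand server is an easy mistake to make, here is a rough sketch of a sanity check. The helper names conf_value and same_key are made up for illustration, and the sketch assumes both sides keep their settings as plain key=value lines in a file (in the example above the server-side options are passed inline on the broker_module line instead, so adapt as needed).

```shell
#!/bin/sh
# Hypothetical helper (not part of mod_gearman): print the last value
# assigned to a key in a key=value config file. Commented-out lines are
# skipped because the key must be the first thing on the line.
conf_value() {
    # $1 = config file, $2 = key name
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical helper: succeed (exit status 0) only when both files
# define the same non-empty "key" value.
same_key() {
    # $1 = worker config, $2 = server-side config (assumed key=value format)
    worker_key=$(conf_value "$1" key)
    server_key=$(conf_value "$2" key)
    [ -n "$worker_key" ] && [ "$worker_key" = "$server_key" ]
}
```

On a worker, something like conf_value /etc/mod_gearman/mod_gearman_worker.conf server shows which gearmand address it will connect to, and same_key can be pointed at a copy of the server-side file to confirm the keys agree before restarting anything.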
Re: Checks stop running randomly
We were finally able to locate and correct the check issue with DNX. The fix will be available in the upcoming 3.3 release, but if you'd like a patch sooner than that, let me know and I'll get that to you. (Not sure if you moved to mod_gearman or not.)
Re: Checks stop running randomly
We did move to mod_gearman, and it is actually filling our needs a bit better. Thank you for letting us know though!
--
Griffin Wakem