ssax wrote: If you are running Core 4.4.3 (which you are), you are REQUIRED to upgrade gearman server on XI server and gearman workers.
https://assets.nagios.com/downloads/nag ... ios_XI.pdf
Is it a requirement to run this upgrade script after every Nagios XI update, or just updates that contain newer versions of Core? We last upgraded gearman per those instructions after updating to 5.5.9, which included Nagios Core 4.4.3.
Hosts and services temporarily unavailable
Re: Hosts and services temporarily unavailable
Core 4.4.3 was incompatible with Gearman, and you had to downgrade Core to 4.4.2 for it to work if you were running gearman. Now that you're on 4.4.3 AND you run gearman, you NEED to upgrade gearman. This is not usually the case: it was an incompatibility that was resolved in the latest gearman, hence the required upgrade.
Let me know if you have any questions or if I can clarify anything.
Re: Hosts and services temporarily unavailable
The changelog within *ModGearmanInstall.sh* hasn't been updated since we last ran the update. Even so, I attempted the upgrade today during a maintenance window with
Code: Select all
./ModGearmanInstall.sh --type=server --upgrade
and received the following error:
Code: Select all
*******************************************************************************
ERROR
Package gearmand was detected prior to first-time installation. Did you remove your Mod Gearman 2 packages?
Please remove the following packages: gearmand gearmand-devel mod_gearman mod_gearman-debuginfo gearmand-server
*******************************************************************************
This is a bit confusing since I'm running an upgrade (not an installation). Do these packages really have to be removed first?
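Before removing anything, it may help to see which of the packages named in that error are actually present. The following is only a sketch, assuming an RPM-based system (the thread uses yum) where `rpm -q` is available. The `PKG_QUERY` variable is a hypothetical override added here so the loop can be exercised without rpm; it is not part of ModGearmanInstall.sh.

```shell
#!/bin/sh
# Package names copied from the installer's error message.
PACKAGES="gearmand gearmand-devel mod_gearman mod_gearman-debuginfo gearmand-server"

# check_packages: print one "installed:" or "absent:" line per package.
# PKG_QUERY is a hypothetical hook for testing; by default `rpm -q` is used.
check_packages() {
    for pkg in "$@"; do
        # Left unquoted on purpose so "rpm -q" splits into command and flag.
        if ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1; then
            echo "installed: $pkg"
        else
            echo "absent: $pkg"
        fi
    done
}

check_packages $PACKAGES
```

On a host without rpm every package is reported as absent, so the result is only meaningful on the XI server itself.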
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Hosts and services temporarily unavailable
Hi,
Did you follow the upgrade instructions (see: Server Installation – Upgrade (2 => 3))? You'll need to remove Mod Gearman 2 and then install Mod Gearman 3.
Remove Gearman 2
Code: Select all
# Remove Mod Gearman 2
cp /etc/mod_gearman2/* /tmp/
yum remove gearmand gearmand-server gearmand-debuginfo gearmand-devel mod_gearman2 -y
sed -i 's/^broker\(.*\)gearman2\(.*\)/#broker\1gearman2\2/' /usr/local/nagios/etc/nagios.cfg
Install Gearman 3
Code: Select all
# Download and install Mod Gearman 3
cd /tmp
wget https://assets.nagios.com/downloads/nagiosxi/scripts/ModGearmanInstall.sh
chmod +x ModGearmanInstall.sh
./ModGearmanInstall.sh --type=server
See: Integrating Mod-Gearman With Nagios XI
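For anyone wondering what the sed line in the removal step does to nagios.cfg, here is a small dry run on sample text instead of the real file. The two sample lines are made up for illustration (the module paths are assumptions, not copied from an actual nagios.cfg), and `comment_gearman2` is a hypothetical wrapper around the same sed expression without `-i`.

```shell
#!/bin/sh
# Illustrative lines only; real paths in nagios.cfg may differ.
sample='broker_module=/usr/lib64/mod_gearman2/mod_gearman2.o config=/etc/mod_gearman2/module.conf'
other_line='broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg'

# Same expression as in the removal step, reading stdin instead of editing a file.
comment_gearman2() {
    sed 's/^broker\(.*\)gearman2\(.*\)/#broker\1gearman2\2/'
}

# A broker line that mentions gearman2 gets a leading "#".
printf '%s\n' "$sample" | comment_gearman2
# A broker line without gearman2 passes through unchanged.
printf '%s\n' "$other_line" | comment_gearman2
```

Because the pattern is anchored on `^broker`, a line that is already commented out no longer matches, so running the command twice should not stack up extra `#` characters.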
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Hosts and services temporarily unavailable
We upgraded to Gearman 3 some time ago. There are no installations of (or configurations that reference) version 2 on this server.
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Hosts and services temporarily unavailable
@drug I'm the original author of that script - and I'll be taking a look at it in depth. Currently, I'm reading through this post attempting to find something maybe/hopefully obvious that wasn't already pointed out.
What are your XI and db and modgearman workers system specifications? CPU count, memory size. What type of disks? Is it virtual? Which hypervisor, if so?
How many modgearman workers do you have? What do those configurations look like?
Can you enable ndo2db debugging and perhaps supply us with that output?
My apologies if any of this has been repeated to you, I'm a bit late to the game
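If it saves some typing, the basic numbers asked for above (CPU count, memory, hypervisor) can be pulled with a few standard Linux tools. This is just a sketch: `collect_specs` is a hypothetical helper name, it assumes a Linux host with `/proc/meminfo`, and it falls back to "unknown" where a tool is missing. Disk type is not covered here.

```shell
#!/bin/sh
# collect_specs: print cpu_count, mem_total_kb and virtualization, one per line.
collect_specs() {
    # CPU count (nproc ships with GNU coreutils).
    if command -v nproc >/dev/null 2>&1; then
        echo "cpu_count=$(nproc)"
    else
        echo "cpu_count=unknown"
    fi
    # Total memory in kB as reported by the kernel.
    if [ -r /proc/meminfo ]; then
        echo "mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
    else
        echo "mem_total_kb=unknown"
    fi
    # Hypervisor, if systemd-detect-virt is installed; it prints "none" on bare metal.
    if command -v systemd-detect-virt >/dev/null 2>&1; then
        echo "virtualization=$(systemd-detect-virt 2>/dev/null || true)"
    else
        echo "virtualization=unknown"
    fi
}

collect_specs
```

Running the same script on the XI server, the DB server and one or two workers should give comparable output to paste into a reply.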
Nagios Enterprises
Senior Developer
Re: Hosts and services temporarily unavailable
Is mod_gearman currently processing checks as it should?
What is the gearman_top output now? Do you see things being processed if you watch it?
If gearman is working as normal, you likely don't have a problem with gearman.
If it's only 30 seconds after an apply configuration, that's normal according to the devs:
The devs said that after an Apply Configuration it can take up to 3 minutes for the NDOUtils rebuild/update process to complete. At that point it starts the CCM permission building process (unrelated here but good to know), as the CCM permissions are built off of the NDOUtils permissions and those need to be ready.
Answer bheden's stuff from his post above as well.
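As a rough, scriptable alternative to watching gearman_top, the gearmand text admin protocol has a `status` command that returns one tab-separated line per queue (name, total jobs, running jobs, available workers) followed by a line containing only a dot. The sketch below parses that format. `summarize_status` is a hypothetical helper, the sample numbers are invented, and treating "total minus running" as the waiting count is my reading of the protocol, so treat the figures with some caution.

```shell
#!/bin/sh
# summarize_status: read gearmand "status" lines on stdin and print
# waiting/running/workers per queue. Waiting is assumed to be total - running.
summarize_status() {
    awk -F'\t' '
        $0 == "." { next }      # end-of-response marker
        NF == 4 {
            printf "%s waiting=%d running=%d workers=%d\n", $1, $2 - $3, $3, $4
        }'
}

# Invented sample data in the status format, for illustration only.
printf 'service\t12\t10\t150\nhost\t0\t0\t150\n.\n' | summarize_status
# prints:
#   service waiting=2 running=10 workers=150
#   host waiting=0 running=0 workers=150

# Against a live server (assumes nc is installed and gearmand listens on the
# default port 4730; some nc builds need an extra flag to exit after the reply):
#   printf 'status\n' | nc 127.0.0.1 4730 | summarize_status
```

A waiting count that stays near zero while running jobs keep changing would match the "gearman is working as normal" case described above.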
Re: Hosts and services temporarily unavailable
bheden wrote: What are your XI and db and modgearman workers system specifications? CPU count, memory size. What type of disks? Is it virtual? Which hypervisor, if so?
XI: 8-core CPU, 8GB RAM (VMware VM)
Workers: It varies, some use ARM and have only 512M of RAM but our primaries are VMware 4-core CPU, 4GB RAM.
bheden wrote: How many modgearman workers do you have? What do those configurations look like?
We have ~20 nodes, each able to exec ~150 workers on average at any given time. Each is configured to execute jobs only for specific hostgroups.
We haven't had any issues with gearmand or our workers keeping up with checks.
bheden wrote: Can you enable ndo2db debugging and perhaps supply us with that output?
Sure, I will accumulate some information and PM it to you.
bheden wrote: My apologies if any of this has been repeated to you, I'm a bit late to the game
No worries, your help is appreciated!
Re: Hosts and services temporarily unavailable
ssax wrote: Is mod_gearman currently processing checks as it should?
Yes
ssax wrote: What is the gearman_top output now? Do you see things being processed if you watch it?
Yes, we don't have any issues with gearmand or the mod gearman nodes. All checks are being processed in a timely fashion as they should be (no waiting, and averaging ~0.5 second latency for service checks).
ssax wrote: If it's only 30 seconds after an apply configuration, that's normal according to the devs:
The devs said that after an Apply Configuration it can take up to 3 minutes for the NDOUtils rebuild/update process to complete. At that point it starts the CCM permission building process (unrelated here but good to know), as the CCM permissions are built off of the NDOUtils permissions and those need to be ready.
I think this speaks directly to the problem. Based on their comment, it sounds like the answer to my initial question is that this is normal and (short of an architecture/software design change) the only option for us would be to increase the resources on the XI instance and associated database in order to shorten the window during which the XI interface is unavailable?
Re: Hosts and services temporarily unavailable
"this is normal and (short of an architecture/software design change) the only option for us would be to increase the resources on the XI instance and associated database in order to shorten the window during which the XI interface is unavailable?"
That is correct. I only know this because I had this very discussion with the head of development when working a bug in the CCM's permissions building. Until NDOUtils is removed from the architecture, you can count on this being the case:
Expect up to a 3 minute delay after an Apply Configuration (at the minimum).
How long it takes depends on the number of objects/records in the DB, total checks, active vs. inactive checks, load, whether the DB is offloaded, etc. They all have an impact.
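If anything automated runs right after an Apply Configuration, one way to live with that window is to poll and retry instead of failing on the first attempt. The following is a generic sketch, not something provided by XI: `wait_until` is a hypothetical helper, and the check command you pass to it (shown only as a placeholder in the comment) is up to you.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds or roughly TIMEOUT
# seconds of sleeping have passed. Returns 0 on success, 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example: allow about 3 minutes, checking every 10 seconds.
# "my_status_check" is a placeholder for whatever tells you the interface
# has data again; it is not a real XI command.
#   wait_until 180 10 my_status_check || echo "still unavailable after 3 minutes"
```

The 180 second figure simply mirrors the "up to 3 minutes" mentioned above; on a large or heavily loaded install a longer timeout may be needed.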