Mod Gearman2 and Nagios XI 5.4.3

srussell23836
Posts: 4
Joined: Wed Jul 12, 2017 1:31 pm

Mod Gearman2 and Nagios XI 5.4.3

Post by srussell23836 »

Does anyone know if Mod Gearman2 is the suggested version to run with Nagios XI 5.4.3?

I have an issue that I can't seem to figure out. Here is my setup:

Nagios XI server
2 Mod Gearman2 worker nodes
All on same subnet

I would like the worker nodes to run all of the host and service checks, with the worker on the Nagios XI box disabled. When I stop mod-gearman2-worker on the Nagios XI server, all of my checks return "CHECK_NRPE: Error - Could not complete SSL handshake". When I restart mod-gearman2-worker on the XI server, the error clears up.

I have verified that all of my client machines have the IP addresses of the worker nodes in the allowed_hosts setting in nrpe.cfg. Looking at the worker nodes, they do not appear to be very active in connecting to port 5666 on the clients. If I watch the output of 'netstat -napc | grep 5666', I see very few ESTABLISHED connections but a ton of TIME_WAIT connections. Judging by the logs, the XI server is doing 99% of the checks.
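To put numbers on the netstat observation above, here is a minimal sketch that counts connections to the NRPE port by TCP state. The function name summarize_states is made up for illustration, and it assumes the usual six-column layout of `netstat -nat` output (foreign address in column 5, state in column 6).

```shell
# Summarize TCP states for connections to a given remote port.
# Reads `netstat -nat` style output on stdin; $1 is the remote port.
# summarize_states is a hypothetical helper name, not part of any package.
summarize_states() {
    awk -v port="$1" '
        # Column 5 is the foreign address, column 6 the connection state.
        $1 ~ /^tcp/ && $5 ~ (":" port "$") { count[$6]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage on a worker node:
#   netstat -nat | summarize_states 5666
```

Connections that have already closed commonly linger in TIME_WAIT for a while, so a high TIME_WAIT count may simply reflect completed short-lived checks. Comparing the totals on each worker against the XI server is likely more telling than the state mix on a single box.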
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by dwhitfield »

Yes, although there are multiple pieces to the puzzle, so it's not quite so simple. Did you use https://assets.nagios.com/downloads/nag ... ios_XI.pdf to set this up?
srussell23836
Posts: 4
Joined: Wed Jul 12, 2017 1:31 pm

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by srussell23836 »

Yes, that is the document I followed.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by tmcdonald »

srussell23836 wrote: I have verified that all of my client machines have the IP addresses of the worker nodes in the allowed_hosts section in nrpe.cfg.

Are you certain that NRPE is running as a standalone daemon and not under xinetd? Check whether your remote NRPE server has /etc/xinetd.d/nrpe and, if so, whether its only_from line lists the worker servers' IPs, separated by spaces.

https://serverfault.com/questions/64851 ... d-hosts-an
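The only_from check can be scripted roughly like this. It is a sketch under assumptions: ip_in_only_from is a made-up helper name, it expects the common `only_from = a b c` layout with spaces around the equals sign, and it only does exact string matches, so network or CIDR entries are not evaluated.

```shell
# Report whether an IP appears in the only_from line of an xinetd service file.
# $1 = path to the xinetd service file, $2 = IP address to look for.
# ip_in_only_from is a hypothetical helper name used for illustration.
ip_in_only_from() {
    awk -v ip="$2" '
        $1 == "only_from" {
            # Fields after "only_from" and "=" are the space-separated addresses.
            for (i = 3; i <= NF; i++) if ($i == ip) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Usage on the monitored host (the worker IP here is a placeholder):
#   ip_in_only_from /etc/xinetd.d/nrpe 192.0.2.10 && echo listed || echo missing
```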

Can you also run the following from one of your gearman workers?

/path/to/check_nrpe -H nrpeserver

Replace the path with your check_nrpe plugin location and nrpeserver with the IP address of the NRPE server.
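If there are several monitored hosts to test, the same command can be looped from a worker. A minimal sketch follows; probe_nrpe_hosts is a hypothetical helper name, and the plugin path is passed in rather than assumed.

```shell
# Run check_nrpe with only -H against each host and print one line per host.
# $1 = path to the check_nrpe plugin; remaining arguments = NRPE hosts.
# probe_nrpe_hosts is a hypothetical helper name used for illustration.
probe_nrpe_hosts() {
    plugin="$1"
    shift
    for host in "$@"; do
        # With no -c argument, check_nrpe asks the daemon for its version string.
        if output=$("$plugin" -H "$host" 2>&1); then
            echo "OK   $host: $output"
        else
            echo "FAIL $host: $output"
        fi
    done
}

# Usage on a worker (adjust the plugin path and host list to your setup):
#   probe_nrpe_hosts /usr/local/nagios/libexec/check_nrpe 172.18.81.157
```

Running it on each worker and on the XI server should show whether the SSL handshake error is tied to a particular source machine.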
Former Nagios employee
srussell23836
Posts: 4
Joined: Wed Jul 12, 2017 1:31 pm

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by srussell23836 »

Not sure what you mean by nrpeserver. Do you mean a monitored node running the nrpe client?

[root@d3-nagiosmg1 ~]# ll /etc/xinetd.d/
total 0

[root@d3-nagiosmg1 ~]# telnet 172.18.81.157 5666
Trying 172.18.81.157...
Connected to 172.18.81.157.
Escape character is '^]'.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by dwhitfield »

srussell23836 wrote: Not sure what you mean by nrpeserver. Do you mean a monitored node running the nrpe client?

Yes. The check_nrpe plugin is normally in /usr/local/nagios/libexec/, but that may not be the case on your gearman servers.
SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by SteveBeauchemin »

Can you make mod_gearman not run tests on your core Nagios server by making a change to the worker.conf file?

Change these to "no":

Code:

# defines if the worker should execute eventhandlers.
eventhandler=no

# defines if the worker should execute service checks.
services=no

# defines if the worker should execute host checks.
hosts=no

Would that do what you need? I only let my core server mod_gearman run host checks. All service checks go elsewhere.
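To confirm what each box is actually configured to run, something like the sketch below prints the three settings from a worker.conf. The function name show_worker_roles is invented for this example. It skips comment lines, and if a key appears more than once it reports the last occurrence, which may or may not match how the worker itself resolves duplicates.

```shell
# Print the hosts/services/eventhandler settings found in a worker.conf.
# $1 = path to the config file. Comment lines are ignored.
# show_worker_roles is a hypothetical helper name used for illustration.
show_worker_roles() {
    awk -F= '
        /^[[:space:]]*#/ { next }
        {
            key = $1; gsub(/[[:space:]]/, "", key)
            val = $2; gsub(/[[:space:]]/, "", val)
            if (key == "hosts" || key == "services" || key == "eventhandler") v[key] = val
        }
        END {
            n = split("hosts services eventhandler", keys, " ")
            for (i = 1; i <= n; i++)
                print keys[i] "=" ((keys[i] in v) ? v[keys[i]] : "(unset)")
        }
    ' "$1"
}

# Usage:
#   show_worker_roles /etc/mod_gearman2/worker.conf
```

Running it on the XI server and on both worker nodes gives a quick side-by-side of which machine is set to take which kind of check.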

Steve B
XI 5.7.3 / Core 4.4.6 / NagVis 1.9.8 / LiveStatus 1.5.0p11 / RRDCached 1.7.0 / Redis 3.2.8 /
SNMPTT / Gearman 0.33-7 / Mod_Gearman 3.0.7 / NLS 2.0.8 / NNA 2.3.1 /
NSClient 0.5.0 / NRPE Solaris 3.2.1 Linux 3.2.1 HPUX 3.2.1
srussell23836
Posts: 4
Joined: Wed Jul 12, 2017 1:31 pm

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by srussell23836 »

Thanks Steve.

As soon as I edit /etc/mod_gearman2/worker.conf and set those values to no, mod-gearman2-worker will not restart successfully.

Code:

[root@d3-nagios mod_gearman2]# service mod-gearman2-worker start
Redirecting to /bin/systemctl start  mod-gearman2-worker.service
Job for mod-gearman2-worker.service failed because the control process exited with error code. See "systemctl status mod-gearman2-worker.service" and "journalctl -xe" for details.

Code:

[root@d3-nagios mod_gearman2]# journalctl -xe
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG] services:                        no
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG] eventhandler:                    no
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG]
Jul 19 09:17:19 d3-nagios systemd[1]: mod-gearman2-worker.service: control process exited, code=exited status=1
Jul 19 09:17:19 d3-nagios systemd[1]: Failed to start Mod-Gearman Worker.
-- Subject: Unit mod-gearman2-worker.service has failed

Code:

[root@d3-nagios mod_gearman2]# systemctl status mod-gearman2-worker.service
● mod-gearman2-worker.service - Mod-Gearman Worker
   Loaded: loaded (/usr/lib/systemd/system/mod-gearman2-worker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2017-07-19 09:15:25 EDT; 9s ago
     Docs: http://mod-gearman.org/docs.html
  Process: 18664 ExecStart=/usr/bin/mod_gearman2_worker -d --config=/etc/mod_gearman2/worker.conf --pidfile=/var/mod_gearman2/mod_gearman_worker.pid (code=exited, status=1/FAILURE)
 Main PID: 1224 (code=exited, status=0/SUCCESS)

Jul 19 09:15:25 d3-nagios systemd[1]: Starting Mod-Gearman Worker...
Jul 19 09:15:25 d3-nagios systemd[1]: mod-gearman2-worker.service: control process exited, code=exited status=1
Jul 19 09:15:25 d3-nagios systemd[1]: Failed to start Mod-Gearman Worker.
Jul 19 09:15:25 d3-nagios systemd[1]: Unit mod-gearman2-worker.service entered failed state.
Jul 19 09:15:25 d3-nagios systemd[1]: mod-gearman2-worker.service failed.
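
The journal output above is mostly DEBUG lines echoing the parsed config, which makes any actual error message easy to miss. A small filter like this sketch (drop_debug_lines is a made-up name) keeps only the non-DEBUG lines for the unit:

```shell
# Drop DEBUG lines from worker log or journal output read on stdin,
# leaving whatever else was logged around the failed start.
# drop_debug_lines is a hypothetical helper name used for illustration.
drop_debug_lines() {
    grep -F -v '][DEBUG]'
}

# Usage:
#   journalctl -u mod-gearman2-worker.service --no-pager | drop_debug_lines
```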
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Mod Gearman2 and Nagios XI 5.4.3

Post by tgriep »

Could you post your gearman worker config file and the log file so we can see what the error is?

Code:

/var/log/mod_gearman2/mod_gearman_worker.log
/etc/mod_gearman2/worker.conf
Thanks
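
When gathering those two files for a post, a sketch like the following trims the config down to its active lines. The helper name config_for_posting is hypothetical. It also masks any key= line, on the assumption that the config may hold the shared encryption key, which should not end up on a public forum.

```shell
# Print a worker.conf without comments or blank lines, masking the key= value.
# $1 = path to the config file.
# config_for_posting is a hypothetical helper name used for illustration.
config_for_posting() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" |
        sed 's/^\([[:space:]]*key[[:space:]]*=\).*/\1<redacted>/'
}

# Usage:
#   config_for_posting /etc/mod_gearman2/worker.conf
#   tail -n 50 /var/log/mod_gearman2/mod_gearman_worker.log
```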