Mod Gearman2 and Nagios XI 5.4.3
Posted: Wed Jul 12, 2017 3:58 pm
by srussell23836
Does anyone know if Mod Gearman2 is the suggested version to run with Nagios XI 5.4.3?
I have an issue that I can't seem to figure out. Here is my setup:
Nagios XI server
2 Mod Gearman2 worker nodes
All on same subnet
I would like the worker nodes to run all of the host and service checks, and to disable the worker on the Nagios XI box. When I stop mod-gearman2-worker on the Nagios XI server, all of my checks return: CHECK_NRPE: Error - Could not complete SSL handshake. When I restart mod-gearman2-worker on the XI server, the errors clear up.
I have verified that all of my client machines have the IP addresses of the worker nodes in the allowed_hosts section in nrpe.cfg. Looking at the worker nodes, they do not appear to be very active in connecting to the clients' port 5666. If I watch the output of 'netstat -napc | grep 5666' I see very few ESTABLISHED connections, but a ton of TIME_WAIT connections. From the logs, it appears the XI server is doing 99% of the checks.
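The netstat observation above can be turned into numbers. What follows is a minimal sketch, not something from the original setup: it assumes Linux `netstat -nap` style lines in which the remote address is the fifth column and the state is the sixth. The function name `count_states` and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_states(netstat_text, port=5666):
    """Tally TCP states for connections whose remote end is the given port."""
    suffix = ":" + str(port)
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[4].endswith(suffix):
            states[fields[5]] += 1
    return states


# Made-up sample lines, only to show the expected input shape.
SAMPLE = """\
tcp        0      0 10.0.0.5:40112    10.0.0.9:5666    TIME_WAIT   -
tcp        0      0 10.0.0.5:40114    10.0.0.9:5666    ESTABLISHED 2211/check_nrpe
tcp        0      0 10.0.0.5:40116    10.0.0.8:5666    TIME_WAIT   -
tcp        0      0 10.0.0.5:22       10.0.0.1:51000   ESTABLISHED 900/sshd
"""

if __name__ == "__main__":
    print(count_states(SAMPLE))
```

Feeding it output captured on a worker node and on the XI server would give a rough comparison of which machine is opening connections to port 5666. TIME_WAIT entries are connections that have already been closed, so they still count as evidence that a connection was made.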
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Thu Jul 13, 2017 2:33 pm
by dwhitfield
Yes, although there are multiple pieces to the puzzle, so it's not quite so simple. Did you use https://assets.nagios.com/downloads/nag ... ios_XI.pdf to set this up?
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Fri Jul 14, 2017 9:09 am
by srussell23836
Yes, that is the document I followed.
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Fri Jul 14, 2017 3:27 pm
by tmcdonald
srussell23836 wrote:I have verified that all of my client machines have the IP addresses of the worker nodes in the allowed_hosts section in nrpe.cfg.
Are you certain that NRPE is running as a standalone daemon and not under xinetd? Check whether your remote NRPE server has /etc/xinetd.d/nrpe and, if so, whether its only_from line lists the worker servers' IPs, separated by spaces.
https://serverfault.com/questions/64851 ... d-hosts-an
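The only_from check can also be done mechanically. This is a minimal sketch under stated assumptions: it only does exact string matching on the entries of `only_from` lines (both `=` and `+=` forms), so entries written as networks or hostnames are not resolved. The helper names and the sample addresses are hypothetical.

```python
def only_from_entries(xinetd_text):
    """Collect the space-separated entries of every only_from line."""
    entries = []
    for line in xinetd_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("only_from"):
            continue
        key, _, value = line.partition("=")
        if key.strip().endswith("-"):
            # "only_from -=" removes entries; not handled in this sketch.
            continue
        entries.extend(value.split())
    return entries


def missing_workers(xinetd_text, worker_ips):
    """Return the worker IPs that do not appear literally in only_from."""
    allowed = set(only_from_entries(xinetd_text))
    return [ip for ip in worker_ips if ip not in allowed]


# Made-up fragment of an /etc/xinetd.d/nrpe file.
SAMPLE = """\
service nrpe
{
    port      = 5666
    only_from = 127.0.0.1 10.0.0.5
}
"""

if __name__ == "__main__":
    print(missing_workers(SAMPLE, ["10.0.0.5", "10.0.0.6"]))
```

An empty result means every worker IP is listed literally; a non-empty result is only a hint, since a network entry could still cover the address.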
Can you also run the following from one of your gearman workers?
/path/to/check_nrpe -H nrpeserver
Replace the path with your check_nrpe plugin location and nrpeserver with the IP address of the NRPE server.
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Tue Jul 18, 2017 8:32 am
by srussell23836
Not sure what you mean by nrpeserver. Do you mean a monitored node running the nrpe client?
[root@d3-nagiosmg1 ~]# ll /etc/xinetd.d/
total 0
[root@d3-nagiosmg1 ~]# telnet 172.18.81.157 5666
Trying 172.18.81.157...
Connected to 172.18.81.157.
Escape character is '^]'.
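The telnet test above can be repeated from a script, for example to loop over many monitored nodes. This is a minimal sketch using only the Python standard library; the function name `tcp_port_open` is made up. As with telnet, a successful connect only shows that the port accepts TCP connections. It does not exercise the SSL handshake or the allowed_hosts check.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the node tested above, the call would be `tcp_port_open("172.18.81.157", 5666)`.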
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Tue Jul 18, 2017 4:46 pm
by dwhitfield
srussell23836 wrote:Not sure what you mean by nrpeserver. Do you mean a monitored node running the nrpe client?
Yes. The check_nrpe plugin is normally in /usr/local/nagios/libexec/, but that may not be the case on your gearman servers.
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Tue Jul 18, 2017 7:19 pm
by SteveBeauchemin
Can you stop mod_gearman from running checks on your core Nagios server by changing the worker.conf file?
Change these to "no":
Code:
# defines if the worker should execute eventhandlers.
eventhandler=no
# defines if the worker should execute service checks.
services=no
# defines if the worker should execute host checks.
hosts=no
Would that do what you need? I only let my core server mod_gearman run host checks. All service checks go elsewhere.
Steve B
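To see at a glance which job types a given worker.conf leaves switched on, something like the following could be used. It is a minimal sketch, assuming plain key=value lines as in the snippet above, treating only the literal value "yes" as enabled, and treating a missing key as disabled (the worker's real defaults may differ). The function names are hypothetical.

```python
JOB_KEYS = ("eventhandler", "services", "hosts")


def parse_worker_conf(text):
    """Read key=value lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def enabled_job_types(text):
    """List the job types whose setting is 'yes' (assumption: only 'yes' counts)."""
    settings = parse_worker_conf(text)
    return [key for key in JOB_KEYS if settings.get(key, "").lower() == "yes"]


# Example values only: a core server that keeps host checks and nothing else.
SAMPLE = """\
# defines if the worker should execute eventhandlers.
eventhandler=no
# defines if the worker should execute service checks.
services=no
# defines if the worker should execute host checks.
hosts=yes
"""

if __name__ == "__main__":
    print(enabled_job_types(SAMPLE))
```

An empty list would mean that all three job types are switched off in that file.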
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Wed Jul 19, 2017 8:29 am
by srussell23836
Thanks Steve.
As soon as I edit /etc/mod_gearman2/worker.conf and set those values to "no", mod-gearman2-worker will not restart successfully.
Code:
[root@d3-nagios mod_gearman2]# service mod-gearman2-worker start
Redirecting to /bin/systemctl start mod-gearman2-worker.service
Job for mod-gearman2-worker.service failed because the control process exited with error code. See "systemctl status mod-gearman2-worker.service" and "journalctl -xe" for details.
Code:
[root@d3-nagios mod_gearman2]# journalctl -xe
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG] services: no
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG] eventhandler: no
Jul 19 09:17:19 d3-nagios mod_gearman2_worker[19284]: [2017-07-19 09:17:19][19284][DEBUG]
Jul 19 09:17:19 d3-nagios systemd[1]: mod-gearman2-worker.service: control process exited, code=exited status=1
Jul 19 09:17:19 d3-nagios systemd[1]: Failed to start Mod-Gearman Worker.
-- Subject: Unit mod-gearman2-worker.service has failed
Code:
[root@d3-nagios mod_gearman2]# systemctl status mod-gearman2-worker.service
● mod-gearman2-worker.service - Mod-Gearman Worker
   Loaded: loaded (/usr/lib/systemd/system/mod-gearman2-worker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2017-07-19 09:15:25 EDT; 9s ago
     Docs: http://mod-gearman.org/docs.html
  Process: 18664 ExecStart=/usr/bin/mod_gearman2_worker -d --config=/etc/mod_gearman2/worker.conf --pidfile=/var/mod_gearman2/mod_gearman_worker.pid (code=exited, status=1/FAILURE)
 Main PID: 1224 (code=exited, status=0/SUCCESS)
Jul 19 09:15:25 d3-nagios systemd[1]: Starting Mod-Gearman Worker...
Jul 19 09:15:25 d3-nagios systemd[1]: mod-gearman2-worker.service: control process exited, code=exited status=1
Jul 19 09:15:25 d3-nagios systemd[1]: Failed to start Mod-Gearman Worker.
Jul 19 09:15:25 d3-nagios systemd[1]: Unit mod-gearman2-worker.service entered failed state.
Jul 19 09:15:25 d3-nagios systemd[1]: mod-gearman2-worker.service failed.
Re: Mod Gearman2 and Nagios XI 5.4.3
Posted: Wed Jul 19, 2017 11:22 am
by tgriep
Could you post your gearman worker config file and the log file so we can see what the error is?
Code:
/var/log/mod_gearman2/mod_gearman_worker.log
/etc/mod_gearman2/worker.conf
Thanks