Port 5666: Connection Refused After 5.7.1 Update
Posted: Wed Jun 17, 2020 10:45 am
by vornado
Yesterday, I updated our Nagios dev server to 5.7.1.
On our production server, we have monitors that check the dev server. All of these monitors are getting an error:
Code:
(No output on stdout) stderr: connect to address 10.0.11.89 port 5666: Connection refused
I rebooted both servers and still get the error. I saw another issue on the forum that suggested adding -2 or -3 to the check, but that did not help either (I wasn't really expecting it to, since it was a different error).
Any assistance would be appreciated.
Steve
Attachment: nagios-update-errors.jpg
Re: Port 5666: Connection Refused After 5.7.1 Update
Posted: Wed Jun 17, 2020 3:20 pm
by benjaminsmith
Hi Steve,
That error message, port 5666: Connection refused, is usually the result of a firewall blocking the connection from the Nagios server. Make sure the NRPE service is running on the remote host, and then post the output of the following nmap command to the thread.
Also, double-check whether the IP address of the Nagios server is the same as it was before the upgrade. If it changed, you will need to update the allowed_hosts option in /usr/local/nagios/etc/nrpe.cfg on the remote host.
Let me know what you find out. Thanks, Benjamin
Re: Port 5666: Connection Refused After 5.7.1 Update
Posted: Wed Jun 17, 2020 3:42 pm
by jbrunkow
Can you tell whether the NRPE service is listening on that port on the host?
Is the address of your XI server added to the allowed_hosts in the /usr/local/nagios/etc/nrpe.cfg file on the host?
Are there any firewalls or proxies that could be interfering with the connection?
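As a rough way to tell these cases apart from the monitoring server, a plain TCP connect to port 5666 can be classified by how it fails. The sketch below is only an illustration (the function name `probe_tcp_port` is made up here, not part of Nagios or NRPE). A "refused" result generally means the host was reached but nothing accepted the connection on that port (or something actively rejected it), while a "timeout" more often points at packets being silently dropped along the way.

```python
import socket


def probe_tcp_port(host, port=5666, timeout=3.0):
    """Try a TCP connect and report how it went.

    Returns "open", "refused", "timeout", or "error: <details>".
    Hypothetical helper for illustration; it does not speak the NRPE protocol.
    """
    try:
        # create_connection performs the TCP handshake and nothing more.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: usually no listener on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: often a dropping firewall.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, name resolution failure, ...).
        return "error: {}".format(exc)
```

For the setup in this thread, calling `probe_tcp_port("10.0.11.89")` from the production server would be the equivalent check. An "open" result only shows that something is listening; it does not confirm that allowed_hosts lets the check through.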
Re: Port 5666: Connection Refused After 5.7.1 Update
Posted: Thu Jun 18, 2020 9:22 am
by vornado
Thank you for the replies.
I should note that these Nagios servers monitor each other. I have only updated the dev server so far. The only change I made, after reviewing the replies, was to add the IP address of the production server to allowed_hosts in /usr/local/nagios/etc/nrpe.cfg on the updated dev server. This was not required prior to the update. Our production server does not currently have an extra IP address.
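For anyone wanting to double-check that setting without reading the file by eye, here is a minimal sketch that pulls the allowed_hosts entries out of the text of nrpe.cfg and tests whether a given address is covered. The helper names are hypothetical, and the parsing is a simplification: it assumes comma-separated entries, merges every allowed_hosts line it finds, and does not resolve hostname entries.

```python
import ipaddress
from typing import List


def parse_allowed_hosts(cfg_text: str) -> List[str]:
    """Collect the comma-separated entries of allowed_hosts lines."""
    entries = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "allowed_hosts":
            entries.extend(item.strip() for item in value.split(",") if item.strip())
    return entries


def is_allowed(ip: str, entries: List[str]) -> bool:
    """Return True if ip matches an address or CIDR range in entries."""
    addr = ipaddress.ip_address(ip)
    for entry in entries:
        try:
            # strict=False lets "10.0.11.58" and "10.0.11.0/24" both parse.
            if addr in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            continue  # hostname entry: not resolved in this sketch
    return False
```

With the addresses from this thread, `is_allowed("10.0.11.58", parse_allowed_hosts(open("/usr/local/nagios/etc/nrpe.cfg").read()))` run on the dev server would show whether the production server is listed.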
After installing and running lsof, I learned that nrpe was not running on the updated server. I started the nrpe service and enabled it to load on startup, and everything seems to be fine. All the monitors get OK results, but when I run systemctl status nrpe -l on the remote (dev) server, I see some errors:
Code:
# systemctl status nrpe -l
● nrpe.service - Nagios Remote Plugin Executor
Loaded: loaded (/usr/lib/systemd/system/nrpe.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-06-18 07:49:56 EDT; 1h 58min ago
Docs: http://www.nagios.org/documentation
Main PID: 1005 (nrpe)
CGroup: /system.slice/nrpe.service
└─1005 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -f
Jun 18 09:43:58 c210enat01.vornadort.com sudo[51226]: nagios : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/local/nagios/libexec/check_init_service sshd
Jun 18 09:44:15 c210enat01.vornadort.com nrpe[51425]: Error: (use_ssl == true): Request packet version was invalid!
Jun 18 09:44:25 c210enat01.vornadort.com nrpe[51504]: Error: (use_ssl == true): Request packet version was invalid!
Jun 18 09:44:25 c210enat01.vornadort.com nrpe[51504]: Could not read request from client 10.0.11.58, bailing out...
Jun 18 09:44:39 c210enat01.vornadort.com nrpe[51555]: Error: (use_ssl == true): Request packet version was invalid!
Jun 18 09:44:39 c210enat01.vornadort.com sudo[51560]: nagios : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/local/nagios/libexec/check_init_service crond
Jun 18 09:44:47 c210enat01.vornadort.com nrpe[51627]: Error: (use_ssl == true): Request packet version was invalid!
Jun 18 09:46:11 c210enat01.vornadort.com nrpe[52272]: Error: (use_ssl == true): Request packet version was invalid!
Jun 18 09:46:11 c210enat01.vornadort.com nrpe[52272]: Could not read request from client 10.0.11.58, bailing out...
Jun 18 09:48:15 c210enat01.vornadort.com nrpe[53236]: Error: (use_ssl == true): Request packet version was invalid!
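To see how often these errors occur and which clients are involved, the log lines can be tallied with a few lines of code. The sketch below is illustrative only: the function name is made up, and the two message patterns are copied from the output above rather than from any NRPE documentation.

```python
import re
from collections import Counter

# Patterns taken verbatim from the log excerpt in this post.
VERSION_ERR = "Request packet version was invalid"
CLIENT_RE = re.compile(r"Could not read request from client (\S+?),")


def summarize_nrpe_log(lines):
    """Count packet-version errors and failed reads per client address."""
    version_errors = 0
    clients = Counter()
    for line in lines:
        if VERSION_ERR in line:
            version_errors += 1
        match = CLIENT_RE.search(line)
        if match:
            clients[match.group(1)] += 1
    return version_errors, clients
```

The input could be the lines of `journalctl -u nrpe` saved to a file, for example. In the excerpt above, the only client address that appears is 10.0.11.58.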
On the local (production) Nagios server, while everything seems to be working fine, the same command gives this:
Code:
# systemctl status nrpe -l
● nrpe.service - Nagios Remote Plugin Executor
Loaded: loaded (/usr/lib/systemd/system/nrpe.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: http://www.nagios.org/documentation
I'm confused as to how it's working if it's "dead".
Thanks.
Steve
Re: Port 5666: Connection Refused After 5.7.1 Update
Posted: Thu Jun 18, 2020 11:24 am
by jbrunkow
Yes, that does seem to be a contradiction.
After a bit of research, it appears it may be related to forking. If the service has forked to another process, systemctl may be looking for the PID of the hung process that was initially started instead of the fork that is currently handling the agent.
StackOverflow
StackExchange
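One way to test that idea is to compare the PID that systemd is tracking with the nrpe processes that actually exist. The sketch below makes a few assumptions: a Linux host with /proc mounted, a process whose name is exactly "nrpe", and input for `parse_main_pid` in the `MainPID=<n>` form printed by `systemctl show -p MainPID nrpe`. Both helper names are hypothetical.

```python
import os


def parse_main_pid(show_output):
    """Extract the PID from 'MainPID=<n>' text; None if absent or 0."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "MainPID" and value.strip().isdigit():
            pid = int(value.strip())
            return pid if pid > 0 else None  # 0 means no tracked process
    return None


def find_pids_by_name(name, proc_root="/proc"):
    """List PIDs whose /proc/<pid>/comm equals name (Linux only)."""
    pids = []
    if not os.path.isdir(proc_root):
        return pids
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return sorted(pids)
```

If `find_pids_by_name("nrpe")` returns PIDs while `parse_main_pid` returns None, something other than the systemd unit is keeping nrpe alive, which would be consistent with the status output showing "inactive (dead)" while checks still succeed.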