Possibility of monitoring(NRPE) host with two Nagios servers
Posted: Tue Sep 21, 2021 12:00 am
by crystal.then
Hi there,
We're trying to configure NRPE monitoring on a server that already has an NRPE configuration, because it is being monitored by another team. We're having connectivity issues from our Nagios server to this server after completing all the necessary steps (adding our Nagios server to allowed_hosts in the nrpe.cfg file, etc.). The errors we're seeing are SSL-related, as shown below.
Here's the command being run from our Nagios server:
[nagios@a1c-nxi01 etc]$ /usr/local/nagios/libexec/check_nrpe -H 172.20.0.3
CHECK_NRPE: Error - Could not connect to 172.20.0.3: Connection reset by peer
[nagios@a1c-nxi01 etc]$ /usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 --no-ssl
CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
Here are the logs from the server:
Sep 21 14:54:26 wmdweb02-new nrpe[28300]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
Sep 21 14:54:26 wmdweb02-new nrpe[28300]: Error: (!log_opts) Could not complete SSL handshake with : timeout 300 seconds
Sep 21 14:54:38 wmdweb02-new nrpe[28314]: Error: (!log_opts) Could not complete SSL handshake with 10.200.203.4: 1
Here's a snippet of the nrpe.cfg on the server (the last IP is our Nagios server):
Code:
log_facility=daemon
pid_file=/var/run/nrpe/nrpe.pid
server_port=5666
nrpe_user=nrpe
nrpe_group=nrpe
dont_blame_nrpe=1
allow_bash_command_substitution=0
debug=0
command_timeout=60
connection_timeout=300
# ALLOWED HOST ADDRESSES
allowed_hosts=45.77.49.1,128.199.180.243,10.99.0.3,10.130.75.244,10.0.0.33,172.16.3.207,139.180.174.83,139.180.173.192,104.238.131.171,127.0.0.1,172.17.3.7,10.40.96.3,139.180.153.84,172.17.3.6,172.18.67.64/27,52.189.224.177,10.200.203.4
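With an allowed_hosts line this long, it can help to confirm mechanically that the address NRPE actually sees is listed. The server log above reports the failed handshake as coming from 10.200.203.4, so that is the address worth checking. The sketch below is illustrative only: the function name `ip_in_allowed_hosts` is hypothetical, it does plain string matching on the comma-separated entries, and it does not evaluate CIDR ranges such as 172.18.67.64/27.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an exact IP string appears in the
# allowed_hosts line of an nrpe.cfg file. CIDR entries are NOT expanded.
ip_in_allowed_hosts() {
    local cfg="$1" ip="$2" line
    # Take the last uncommented allowed_hosts= line and strip the key.
    line=$(grep -E '^[[:space:]]*allowed_hosts=' "$cfg" | tail -n 1)
    line=${line#*=}
    # Remove any whitespace around the entries.
    line=${line//[[:space:]]/}
    # Wrap in commas so that 10.0.0.3 does not match 10.0.0.33.
    case ",$line," in
        *",$ip,"*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

Usage would be something like `ip_in_allowed_hosts /path/to/nrpe.cfg 10.200.203.4 && echo listed`, where the path is whatever config file the running daemon was started with.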
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Tue Sep 21, 2021 3:51 pm
by pbroste
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Tue Sep 21, 2021 6:00 pm
by crystal.then
Hi,
We're not using xinetd on this server:
[adventone@wmdweb02-new xinetd.d]$ sudo ls -la /etc/xinetd.d/nrpe
ls: cannot access /etc/xinetd.d/nrpe: No such file or directory
Thanks
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Wed Sep 22, 2021 12:45 pm
by pbroste
Hello @crystal.then
Thanks for following up. Taking a step back and reviewing the original post, it appears that the connection is not being made. Please verify that no security applications such as SELinux (check with sestatus) or firewall rules are blocking port 5666. We see that you are running as the 'nagios' user account; please also try running as root to see if you get different results (su -l root).
What version of NRPE are you running?
Code:
/usr/local/nagios/libexec/check_nrpe -V
Let me know what results you receive on these:
Code:
/usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 -c check_users -a '-w 5 -c 10' --no-ssl
And
Code:
/usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 -c check_users -a '-w 5 -c 10'
If you have tcpdump installed, please capture the traffic, and also run the following to see whether we can reach port 5666:
Code:
openssl s_client -connect 172.20.0.3:5666
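The checks requested above can be replayed in one pass. This is a rough sketch rather than an official tool: the names `run` and `nrpe_diag` and the `DRY_RUN` and `CHECK_NRPE` variables are hypothetical, and the plugin path is the one used earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: replays the checks requested above against one host.
# CHECK_NRPE defaults to the plugin path used earlier in this thread.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Print each command before running it; with DRY_RUN=1 only print it.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" = "1" ]; then return 0; fi
    "$@"
    printf '  exit status: %s\n' "$?"
}

nrpe_diag() {
    local host="$1" port="${2:-5666}"
    run "$CHECK_NRPE" -V
    run "$CHECK_NRPE" -H "$host" -p "$port"
    run "$CHECK_NRPE" -H "$host" -p "$port" --no-ssl
    # </dev/null makes s_client exit instead of waiting for input.
    run openssl s_client -connect "$host:$port" </dev/null
}
```

Running `nrpe_diag 172.20.0.3` as both the nagios user and root gives two outputs that can be compared side by side.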
Please let us know the results,
Perry
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Thu Sep 23, 2021 12:57 am
by crystal.then
Hi there,
Please find the answers to your questions below:
What version of the NRPE:
This is the NRPE version on the server we're trying to monitor:
[adventone@wmdweb02-new run]$ /usr/sbin/nrpe -V
NRPE - Nagios Remote Plugin Executor
Version: 4.0.3
[root@a1c-nxi01 cantonio]# /usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 -c check_users -a '-w 5 -c 10' --no-ssl
CHECK_NRPE: Error - Could not connect to 172.20.0.3: Connection reset by peer
logs showing in /var/log/messages:
Sep 23 15:50:10 wmdweb02-new nrpe[19566]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
Sep 23 15:50:10 wmdweb02-new nrpe[19566]: Error: (!log_opts) Could not complete SSL handshake with : timeout 300 seconds
And /usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 -c check_users -a '-w 5 -c 10'
[root@a1c-nxi01 cantonio]# /usr/local/nagios/libexec/check_nrpe -H 172.20.0.3 -c check_users -a '-w 5 -c 10'
CHECK_NRPE: Error - Could not connect to 172.20.0.3: Connection reset by peer
logs showing in /var/log/messages:
Sep 23 15:52:06 wmdweb02-new nrpe[20043]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
Sep 23 15:52:06 wmdweb02-new nrpe[20043]: Error: (!log_opts) Could not complete SSL handshake with : timeout 300 seconds
If you have tcpdump installed please view traffic and the following to see if we can get there on port 5666
tcpdump port 5666 -vv
[cantonio@a1c-nxi01 ~]$ cat tcpdump.23092021 |grep 172.20.0.3
a1c-nxi01.a1ms.cloud.57374 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [S], cksum 0xb40a (incorrect -> 0xdc5f), seq 760780691, win 26883, options [mss 8961,sackOK,TS val 2175202464 ecr 0,nop,wscale 7], length 0
ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe > a1c-nxi01.a1ms.cloud.57374: Flags [S.], cksum 0xaa66 (correct), seq 534359300, ack 760780692, win 28960, options [mss 1384,sackOK,TS val 2691291676 ecr 2175202464,nop,wscale 7], length 0
a1c-nxi01.a1ms.cloud.57374 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [.], cksum 0xb402 (incorrect -> 0x4926), seq 1, ack 1, win 211, options [nop,nop,TS val 2175202478 ecr 2691291676], length 0
a1c-nxi01.a1ms.cloud.57374 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [P.], cksum 0xb607 (incorrect -> 0x1a69), seq 1:518, ack 1, win 211, options [nop,nop,TS val 2175202482 ecr 2691291676], length 517
ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe > a1c-nxi01.a1ms.cloud.57374: Flags [R.], cksum 0x7110 (correct), seq 1, ack 518, win 211, length 0
a1c-nxi01.a1ms.cloud.57576 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [S], cksum 0xb40a (incorrect -> 0xa2ef), seq 3017029693, win 26883, options [mss 8961,sackOK,TS val 2175206944 ecr 0,nop,wscale 7], length 0
ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe > a1c-nxi01.a1ms.cloud.57576: Flags [S.], cksum 0xfa17 (correct), seq 2905244944, ack 3017029694, win 28960, options [mss 1384,sackOK,TS val 2691296157 ecr 2175206944,nop,wscale 7], length 0
a1c-nxi01.a1ms.cloud.57576 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [.], cksum 0xb402 (incorrect -> 0x98d6), seq 1, ack 1, win 211, options [nop,nop,TS val 2175206959 ecr 2691296157], length 0
a1c-nxi01.a1ms.cloud.57576 > ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe: Flags [P.], cksum 0xb607 (incorrect -> 0x6763), seq 1:518, ack 1, win 211, options [nop,nop,TS val 2175206959 ecr 2691296157], length 517
ip-172-20-0-3.ap-southeast-2.compute.internal.nrpe > a1c-nxi01.a1ms.cloud.57576: Flags [R.], cksum 0xe3c2 (correct), seq 1, ack 518, win 211, length 0
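Reading the capture above: in each attempt the server answers with [S.], the client acknowledges, the client pushes 517 bytes ([P.]), and the server replies with a reset ([R.]). That pattern suggests port 5666 is reachable and that the connection is being dropped after the client's first payload, which for an SSL-enabled check_nrpe is presumably the TLS ClientHello. The helpers below are a hypothetical sketch for summarizing tcpdump text output in this format; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count packets per TCP flag set in tcpdump text
# output (the format shown above), read from a file or from stdin.
summarize_flags() {
    grep -oE 'Flags \[[^]]*\]' "${1:--}" | sort | uniq -c | sort -rn
}

# Hypothetical helper: succeed if the capture contains a reset whose
# sender matches the given label (text appearing just before " >").
has_reset_from() {
    awk -v src="$1" '
        index($0, src " >") && /Flags \[R\.?\]/ { found = 1 }
        END { exit !found }
    ' "${2:--}"
}
```

For example, `has_reset_from compute.internal.nrpe tcpdump.23092021` would confirm that the resets come from the monitored host rather than from the Nagios server.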
traceroute -p 5666 172.20.0.3
traceroute to 172.20.0.3 (172.20.0.3), 30 hops max, 60 byte packets
1 ip-10-212-0-18.ap-southeast-2.compute.internal (10.212.0.18) 13.300 ms 14.882 ms 13.026 ms
2 ip-10-212-0-17.ap-southeast-2.compute.internal (10.212.0.17) 12.537 ms 11.726 ms 12.687 ms
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
9 * * *
openssl s_client -connect 172.20.0.3:5666
[root@a1c-nxi01 cantonio]# openssl s_client -connect 172.20.0.3:5666
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 289 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : 0000
Session-ID:
Session-ID-ctx:
Master-Key:
Key-Arg : None
Krb5 Principal: None
PSK identity: None
PSK identity hint: None
Start Time: 1632376604
Timeout : 300 (sec)
Verify return code: 0 (ok)
---
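The s_client output above reports "read 0 bytes and written 289 bytes" and "Cipher is (NONE)", so the server closed the connection before sending any handshake data. One way to narrow this down is to repeat the handshake once per protocol version and see whether any of them gets further. This is a sketch under assumptions: `probe_tls` and the `OPENSSL` override are hypothetical names, and which protocol flags are accepted depends on the local OpenSSL build.

```shell
#!/usr/bin/env bash
# Sketch: try one s_client handshake per protocol flag and report whether
# a cipher was negotiated. Protocol flags depend on the OpenSSL build.
probe_tls() {
    local target="$1" flag out
    shift
    # Default set of protocol flags if none were given.
    [ "$#" -gt 0 ] || set -- -tls1 -tls1_1 -tls1_2
    for flag in "$@"; do
        # </dev/null makes s_client exit instead of waiting for input.
        out=$("${OPENSSL:-openssl}" s_client -connect "$target" "$flag" </dev/null 2>&1)
        if printf '%s\n' "$out" | grep -q 'Cipher is' &&
           ! printf '%s\n' "$out" | grep -qF 'Cipher is (NONE)'; then
            printf '%s: cipher negotiated\n' "$flag"
        else
            printf '%s: no cipher negotiated\n' "$flag"
        fi
    done
}
```

Calling `probe_tls 172.20.0.3:5666` prints one line per protocol flag; if every line reports no cipher, the reset is probably not tied to a single protocol version.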
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Thu Sep 23, 2021 4:16 pm
by pbroste
Hello @crystal.then
Thanks for following up. It appears that a connection to 172.20.0.3 on port 5666 can be established. I took a look at the nrpe.cfg, and the 'allowed_hosts' option will apply in your case, since NRPE is not running under inetd (please verify) or xinetd.
# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon. Network addresses with a bit mask
# (i.e. 192.168.1.0/24) are also supported. Hostname wildcards are not currently
# supported.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
Edit the allowed_hosts line in /usr/local/nagios/etc/nrpe.cfg and add the IP address of the server that should be allowed to connect.
Let's check the NRPE status with systemctl.
If it is not currently running and/or enabled, enable and start it.
Then we should see lines that look like this:
nrpe.service - Nagios Remote Plugin Executor
Loaded: loaded (/usr/lib/systemd/system/nrpe.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2021-09-23 16:09:14 CDT; 12s ago
Docs: http://www.nagios.org/documentation
Main PID: 24305 (nrpe)
Tasks: 1 (limit: 11404)
Memory: 940.0K
CGroup: /system.slice/nrpe.service
└─24305 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -f
Sep 23 16:09:14 localhost.localdomain systemd[1]: Started Nagios Remote Plugin Executor.
Sep 23 16:09:14 localhost.localdomain nrpe[24305]: Starting up daemon
Sep 23 16:09:14 localhost.localdomain nrpe[24305]: Server listening on 0.0.0.0 port 5666.
Sep 23 16:09:14 localhost.localdomain nrpe[24305]: Server listening on :: port 5666.
Sep 23 16:09:14 localhost.localdomain nrpe[24305]: Listening for connections on port 5666
Sep 23 16:09:14 localhost.localdomain nrpe[24305]: Allowing connections from: xxx.xxx.xxx.xxx
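The exact commands for the steps above did not come through in this post. As an assumption, on a systemd host where the unit is named nrpe (as in the status output above), the sequence would look roughly like the sketch below. The function name `ensure_unit_running` and the `SYSTEMCTL` override are hypothetical, and the enable/start calls need root.

```shell
#!/usr/bin/env bash
# Sketch: enable and start a systemd unit only if needed, then show status.
# Assumes the unit is called "nrpe"; run as root (or via sudo).
ensure_unit_running() {
    local unit="${1:-nrpe}" sc="${SYSTEMCTL:-systemctl}"
    # Enable at boot if not already enabled.
    "$sc" is-enabled --quiet "$unit" || "$sc" enable "$unit"
    # Start now if not already active.
    "$sc" is-active --quiet "$unit" || "$sc" start "$unit"
    # Show the resulting state, including the most recent log lines.
    "$sc" status "$unit" --no-pager
}
```

After `ensure_unit_running nrpe`, the last log line in the status output should list the addresses NRPE is allowing connections from.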
Let us know how things look.
Thanks,
Perry
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Fri Sep 24, 2021 9:09 pm
by crystal.then
Hi there,
Yes, we've ensured that our Nagios server is listed in the allowed_hosts directive in the nrpe.cfg file. We've restarted the nrpe service many times, and we still see the same issue.
# ALLOWED HOST ADDRESSES
allowed_hosts=45.77.49.1,128.199.180.243,10.99.0.3,10.130.75.244,10.0.0.33,172.16.3.207,139.180.174.83,139.180.173.192,104.238.131.171,127.0.0.1,172.17.3.7,10.40.96.3,139.180.153.84,172.17.3.6,172.18.67.64/27,52.189.224.177,10.200.203.4
Thanks
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Mon Sep 27, 2021 1:57 pm
by pbroste
Hello @crystal.then
Thanks for following up. It appears that your version of NRPE was installed from a repository other than the standard Nagios source.
We would like you to uninstall the current version and then run through the installer found here:
https://support.nagios.com/kb/article/nrpe-how-to-install-nrpe-8.html
Please let us know the results,
Perry
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Thu Sep 30, 2021 12:22 am
by crystal.then
Hi,
It is not possible for us to re-install the existing NRPE on these servers as it is owned/used by another team. Is it instead possible for us to run our own instance of NRPE on the same server? Would we just need a different dedicated user/group/port for the NRPE?
Thanks
Re: Possibility of monitoring(NRPE) host with two Nagios ser
Posted: Thu Sep 30, 2021 2:57 pm
by pbroste
Hello @crystal.then
Thanks for following up. As to your inquiry, you will not be able to install a second instance of the NRPE client/agent on the Linux instance.
The best overall method of monitoring devices going forward is to use the NCPA client/agent. This alternative would work for you in this case.
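For reference, an active check against an NCPA agent is normally run from the Nagios server with the check_ncpa.py plugin. The wrapper below is a minimal sketch under assumptions: the function name `check_via_ncpa`, the plugin path, and the token value are placeholders, and 5693 is NCPA's usual default listener port.

```shell
#!/usr/bin/env bash
# Sketch: wrap a check_ncpa.py call. The plugin path and port are
# assumptions; the token must match the one set in the agent's config.
check_via_ncpa() {
    local host="$1" token="$2" metric="$3" warn="$4" crit="$5"
    "${CHECK_NCPA:-/usr/local/nagios/libexec/check_ncpa.py}" \
        -H "$host" -P "${NCPA_PORT:-5693}" -t "$token" \
        -M "$metric" -w "$warn" -c "$crit"
}
```

A call such as `check_via_ncpa 172.20.0.3 'mytoken' cpu/percent 80 90` would then query CPU usage with warning and critical thresholds of 80 and 90.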
Please let us know if you have further questions.
Perry