CHECK_NRPE: Error - Could not connect to . Check system logs

Posted: Fri Apr 05, 2019 1:00 pm
by anilgupta
Team,

NRPE is running on the remote server, and I can telnet to it:
telnet xx.xxx.x.xx. 5666
Trying xx.xxx.x.xx...
Connected to xx.xxx.x.xx.
Escape character is '^]'.
Connection closed by foreign host.
When I test from my server, I get the error below.
[guptaa@cacbigdcapmdw51 ~]$ /usr/local/nagios/libexec/check_nrpe -H xx.xxx.x.xx
CHECK_NRPE: Error - Could not connect to xx.xxx.x.xx. Check system logs on xx.xxx.x.xx
"/var/log/messages" shows the message below.
Apr 5 13:51:56 cacbigdcapmdw51 check_nrpe: Error: (nerrs = 0)(!log_opts) Could not complete SSL handshake with 10.92.34.66: rc=0 SSL-error=5
The command below doesn't help either:
/usr/local/nagios/libexec/check_nrpe -A /home/guptaa/nagios/SSL/pre-prod/ChainBundle2.crt -C /home/guptaa/nagios/SSL/pre-prod/ServerCertificate.crt -K /home/guptaa/nagios/SSL/pre-prod/nagios_pre-prod.key -H 1xx.xxx.x.xx
CHECK_NRPE: Error - Could not connect to xx.xxx.x.xx. Check system logs on xx.xxx.x.xx
Please help.

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 1:27 pm
by lmiltchev
We will need some more information in order to troubleshoot the issue.

What is the OS/architecture on the client machine? What is the version of NRPE that is installed on it? How did you install NRPE? Did you use our Linux agent installer script?

https://assets.nagios.com/downloads/nag ... _Agent.pdf

Run the following commands on the client (remote machine), and show the output:

netstat -anp | grep nrpe
cat /etc/xinetd.d/nrpe
grep 'allowed_hosts=' /usr/local/nagios/etc/nrpe.cfg
Also, run the following commands on the Nagios XI server, and show the output:

ip addr
grep full /usr/local/nagiosxi/var/xiversion

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 1:51 pm
by anilgupta
What is the OS/architecture on the client machine?
wassrv@csbtipdcapmdw07-->[Fri Apr 05]:14:31:01[28](/usr/local/nagios)-->uname -a
SunOS csbtipdcapmdw07 5.10 Generic_150400-63 sun4v sparc sun4v
What is the version of NRPE that is installed on it?
Version: 2.13
How did you install NRPE? Did you use our Linux agent installer script?
This agent was installed a long time ago, and I am not sure how it was installed. The same agent works well with Nagios Core, which is getting the status. However, it is not working with Nagios XI.
netstat -anp | grep nrpe
wassrv@csbtipdcapmdw07-->[Fri Apr 05]:14:37:08[37](/usr/local/nagios/etc)-->netstat -an | grep 5666
10.92.34.66.5666 *.* 0 0 49152 0 LISTEN
cat /etc/xinetd.d/nrpe
/etc/xinetd.d/nrpe doesn't exist
grep 'allowed_hosts=' /usr/local/nagios/etc/nrpe.cfg
wassrv@csbtipdcapmdw07-->[Fri Apr 05]:14:36:41[34](/usr/local/nagios/etc)-->grep 'allowed_hosts=' /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,10.92.34.51,10.92.34.49,10.92.34.53,10.92.34.52,intra.dev.eiamreporting.cac.gov.on.ca
ip addr
[guptaa@pstvmpkimon01 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:2e:9b:c6 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.161/24 brd 192.168.0.255 scope global noprefixroute ens192
valid_lft forever preferred_lft forever
inet6 fe80::1170:9cfc:5048:4000/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:65:92:06 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:65:92:06 brd ff:ff:ff:ff:ff:ff
grep full /usr/local/nagiosxi/var/xiversion
[guptaa@pstvmpkimon01 ~]$ grep full /usr/local/nagiosxi/var/xiversion
full=5.5.7

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 2:11 pm
by lmiltchev
I don't see your Nagios XI server's IP addresses (192.168.0.161 and 192.168.122.1) on the "allowed_hosts" line. I am not sure which one you are using, but you can add both IPs:

allowed_hosts=127.0.0.1,10.92.34.51,10.92.34.49,10.92.34.53,10.92.34.52,intra.dev.eiamreporting.cac.gov.on.ca,192.168.0.161,192.168.122.1
Save the nrpe.cfg file, exit, and restart the NRPE service so that the changes take effect.

After you do this, run the following commands from the command line on the Nagios XI server:

/usr/local/nagios/libexec/check_nrpe -H <client ip>
or

/usr/local/nagios/libexec/check_nrpe -2 -H <client ip>
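
The allowed_hosts comparison described above can also be scripted as a quick sanity check. This is only a sketch, assuming a POSIX shell. The function name ip_in_allowed_hosts is made up for illustration, and it compares entries literally, so it does not resolve hostnames or understand network ranges.

```shell
# Sketch: succeed if an address appears literally in the allowed_hosts line
# of an nrpe.cfg file. ip_in_allowed_hosts is a hypothetical helper name.
# Usage: ip_in_allowed_hosts <ip> <path to nrpe.cfg>
ip_in_allowed_hosts() {
    ip="$1"
    cfg="$2"
    # Take the last uncommented allowed_hosts line, drop the key, strip blanks.
    hosts=$(grep '^allowed_hosts=' "$cfg" | tail -n 1 | cut -d= -f2- | tr -d ' \t')
    # Wrap both sides in commas so only whole entries can match.
    case ",$hosts," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, running `ip_in_allowed_hosts 192.168.0.161 /usr/local/nagios/etc/nrpe.cfg && echo listed || echo missing` on the client would report whether that address is present.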

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 2:26 pm
by anilgupta
@lmiltchev,

Thanks for your response.

This instance of Nagios is running on 10.194.78.208.
It is configured with the load balancer URL intra.dev.eiamreporting.cac.gov.on.ca.

As you can see, this URL is included in nrpe.cfg.

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 2:56 pm
by npolovenko
@anilgupta, is the Nagios Core Server ---> NRPE connection going through the same load balancer, or is it connecting directly? Have you tried running the command with -2 as @lmiltchev suggested?
/usr/local/nagios/libexec/check_nrpe -2 -H <client ip>
Can you upload the whole nrpe.cfg file from the remote server to the thread?

Also, in the nrpe.cfg file you can change:
debug=0
to
debug=1
Then restart the NRPE agent:
service nrpe restart
After that, check the syslog to see whether it has any more information about the SSL error.
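
The debug change can be made by hand in an editor, or scripted along these lines. This is a rough sketch, not an official procedure: the function name enable_nrpe_debug and the .bak suffix are arbitrary choices, and it avoids `sed -i` because that option is not available in every sed (the client here reports SunOS).

```shell
# Sketch: switch debug=0 to debug=1 in an nrpe.cfg file, keeping a backup.
# enable_nrpe_debug and the .bak suffix are hypothetical choices.
# Usage: enable_nrpe_debug <path to nrpe.cfg>
enable_nrpe_debug() {
    cfg="$1"
    # Keep the original so the change is easy to revert.
    cp "$cfg" "$cfg.bak" || return 1
    # Read from the backup and rewrite the live file; no in-place flag needed.
    sed 's/^debug=0$/debug=1/' "$cfg.bak" > "$cfg"
}
```

The agent still has to be restarted afterwards for the new setting to be read.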

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 3:10 pm
by anilgupta
@npolovenko,
Is the Nagios Core Server ---> NRPE connection going through the same load balancer or it connecting directly?
Nagios Core is not using the load balancer; it connects directly.
Have you tried running the command with -2 as @lmiltchev suggested?
[root@cacbigdcapmdw51 nagios]# /usr/local/nagios/libexec/check_nrpe -2 -H xx.xx.xxx.x
CHECK_NRPE: Error - Could not connect to xx.xx.xxx.x. Check system logs on xx.xx.xxx.x

I need to work with another team to carry out the rest of your questions and suggestions. I will get back to you on that soon.

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Fri Apr 05, 2019 4:16 pm
by npolovenko
@anilgupta, Is it possible to connect XI directly to the nrpe server without the load balancer, for testing purposes?

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Mon Apr 08, 2019 8:17 am
by anilgupta
@npolovenko,

I replaced the LB URL with the hostname, but I am still getting the same error.
Please find attached the nrpe.cfg

Re: CHECK_NRPE: Error - Could not connect to . Check system

Posted: Mon Apr 08, 2019 2:02 pm
by npolovenko
@anilgupta, Can you run the following command from the Nagios XI server towards the remote nrpe server and show me the output?
/usr/local/nagios/libexec/check_nrpe -n -H xx.xx.xx.xx
Replace xx.xx.xx.xx with the remote server's direct IP address.

Run the command and then check the /var/log/messages file on the NRPE server.
tail -40 /var/log/messages
Also, in order to pass SSL certificates on the command line like this:
/usr/local/nagios/libexec/check_nrpe -A /home/guptaa/nagios/SSL/pre-prod/ChainBundle2.crt -C /home/guptaa/nagios/SSL/pre-prod/ServerCertificate.crt -K /home/guptaa/nagios/SSL/pre-prod/nagios_pre-prod.key -H 1xx.xxx.x.xx
you need to add the certificate file paths to the nrpe.cfg file on the remote server as well.
https://support.nagios.com/kb/article.php?id=519
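
To see whether any certificate settings are already active on the remote side, something like the following could be run against its nrpe.cfg. It is a sketch under the assumption that the agent's SSL options are directives whose names begin with `ssl_` and that the installed agent version supports them; the function name list_nrpe_ssl_settings is invented for this example.

```shell
# Sketch: print the active (uncommented) ssl_* settings in an nrpe.cfg file.
# list_nrpe_ssl_settings is a hypothetical helper name.
# Usage: list_nrpe_ssl_settings <path to nrpe.cfg>
list_nrpe_ssl_settings() {
    cfg="$1"
    # Active directives start at column one; commented ones start with '#'.
    # grep returns non-zero when no active ssl_* line exists.
    grep '^ssl_' "$cfg"
}
```

No output (and a non-zero exit status) would suggest that no such settings are enabled in that file.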

What tutorial did you use to install the NRPE agent on this remote server?