NRPE Errors
-
crrussell3
- Posts: 31
- Joined: Tue Oct 10, 2017 9:09 am
NRPE Errors
We seem to be having some issues with Nagios NRPE checks coming back with bad results:
Examples:
1. CHECK_NRPE: Error - Could not complete SSL handshake.
At times, checks come back with this result. If you force a recheck, it comes back without issue. What could be causing this? Everything I read online points to a config issue in the nsclient.ini file, such as SSL not being disabled or the allowed Nagios host being incorrect.
2. (No output on stdout) stderr: connect to address 10.192.1.190 port 5666: No route to host
We also receive this error randomly. It appears to be a routing/networking error. The problem is that the Nagios server and the monitored server sit on the same subnet/VLAN, and even on the same Hyper-V host and virtual switch, so there shouldn't be a networking problem. Nothing but Nagios reports this issue. We don't see drops in communication from apps to DB servers during this time, or any other signs of communication issues.
If anyone has any guidance on these issues, it would be appreciated.
Thanks!
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NRPE Errors
Hello, @crrussell3. I'd try increasing the timeout on these two checks that are giving you errors by adding -t 60 to their commands.
Can you also upload the nsclient.ini and nsclient.log files here? They should be located in the same folder on the Windows server.
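As a minimal sketch of where the -t 60 option goes, the snippet below builds a check_nrpe command line in Python. The plugin path is the usual Nagios XI location; the command name check_drivesize and the helper names build_check_nrpe_cmd and run_check are assumptions for illustration, so substitute whatever command your service definition actually calls.

```python
import subprocess

# Usual plugin location on a Nagios XI server.
CHECK_NRPE = "/usr/local/nagios/libexec/check_nrpe"


def build_check_nrpe_cmd(host, command, timeout=60):
    """Return the argv list for a check_nrpe call with an explicit -t timeout."""
    return [CHECK_NRPE, "-H", host, "-t", str(timeout), "-c", command]


def run_check(host, command, timeout=60):
    """Run the check and return (exit_code, first_line_of_output)."""
    cmd = build_check_nrpe_cmd(host, command, timeout)
    # Give the subprocess slightly longer than the plugin's own timeout.
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout + 5)
    output = (result.stdout or result.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return result.returncode, first_line


if __name__ == "__main__":
    # "check_drivesize" is a placeholder command name (assumption).
    print(" ".join(build_check_nrpe_cmd("10.192.1.190", "check_drivesize")))
```

Running the printed command by hand from the Nagios server a few times in a row is a quick way to see whether the longer timeout changes how often the error appears.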
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
crrussell3
- Posts: 31
- Joined: Tue Oct 10, 2017 9:09 am
Re: NRPE Errors
Here are the two files requested.
We have already set the default timeout on checks to 60 seconds.
Here is a copy of the alert we received:
Code: Select all
Nagios has detected a problem with this service.
Notification Type: PROBLEM
Service: CORPHVDB-Cluster1 Compellent1 ECOMSQL LOGS Free Space
Host: corphvdb1-h.hy-vee.net
Address: 10.215.20.91
State: CRITICAL
Info:
CHECK_NRPE: Error - Could not complete SSL handshake.
Date/Time: 2018-01-30 15:44:43
Respond: https://nagiosprod1-v/nagiosxi/rr.php?uid=23-754-b40d40b4ddd2b6949d3c15d4809fc480
Nagios URL: https://nagiosprod1-v/nagiosxi/
Notes: Escalate ticket to Systems team 24x7. If after hours, contact the Systems on-call phone number.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NRPE Errors
@crrussell3, do you know what version of NSClient you have installed? I recommend upgrading to the latest version on their website: 0.5.2.35.
Also, I've seen some memory-related errors in the log file. I wonder whether that is a memory allocation issue on the NSClient side or whether your Windows server is actually running out of RAM. I'd look into that.
-
crrussell3
- Posts: 31
- Joined: Tue Oct 10, 2017 9:09 am
Re: NRPE Errors
On this particular server, I am running NSClient++ 0.5.1.044.
This is a Hyper-V Cluster host, but I haven't seen alerts for it being low on available memory itself.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NRPE Errors
@crrussell3, can you add this section to your nsclient.ini file to see if that fixes the issue? You'll need to restart the NSClient service after you make the changes. Also, have you upgraded NSClient recently? If so, was everything working normally before?
Code: Select all
[/settings/external scripts]
allow arguments = 1
allow nasty characters = 1
timeout = 90
-
crrussell3
- Posts: 31
- Joined: Tue Oct 10, 2017 9:09 am
Re: NRPE Errors
Currently I have:
Code: Select all
[/settings/external scripts]
allow arguments = true
allow nasty characters = true
Is there a difference between using "true" or "1"?
I have not upgraded versions. The SSL error doesn't occur very often, but the "No route to host" error happens a little more often than the SSL one.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NRPE Errors
@crrussell3, true instead of 1 should be ok. Did you add the timeout value? After that please restart the NSClient++ service to see if the problem was fixed.
-
crrussell3
- Posts: 31
- Joined: Tue Oct 10, 2017 9:09 am
Re: NRPE Errors
I added the timeout and restarted the service.
I will monitor to see if I get as many SSL errors.
Looking through my state history, I can see that this particular server has had the SSL error 17 times across multiple checks.
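One way to watch for the intermittent "No route to host" failures between scheduled checks is to probe the NRPE port directly from the Nagios server and log only the failures with a timestamp. The following is a minimal sketch under stated assumptions: the function names probe and watch are hypothetical, and it only tests TCP reachability of port 5666, so it will not reproduce the SSL handshake error itself.

```python
import socket
import time
from datetime import datetime


def probe(host, port=5666, timeout=5.0):
    """Attempt one TCP connection; return (ok, error_text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # Covers timeouts, "No route to host", "Connection refused", etc.
        return False, str(exc)


def watch(host, port=5666, interval=10.0, count=None):
    """Probe repeatedly, print each failure, and return the failure count.

    With count=None the loop runs until interrupted.
    """
    failures = 0
    attempts = 0
    while count is None or attempts < count:
        ok, err = probe(host, port)
        attempts += 1
        if not ok:
            failures += 1
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} FAILED: {err}")
        if count is None or attempts < count:
            time.sleep(interval)
    return failures


if __name__ == "__main__":
    # Address taken from the alert earlier in this thread.
    watch("10.215.20.91")
```

If the logged timestamps line up with the Nagios alerts, the problem is likely below NRPE (network or host level); if the probe never fails while the alerts continue, the NRPE/SSL layer is the more likely culprit.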
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NRPE Errors
@crrussell3, sounds good. Let us know if you still experience problems with those two hosts.
In [/settings/NRPE/server] please also add this:
Code: Select all
use SSL = 1
Restart the NSClient service.
What is the version of the check_nrpe plugin on the Nagios XI side? To check, you can run this command:
Code: Select all
/usr/local/nagios/libexec/check_nrpe
It'll show a version number at the top of the output.