CHECK_NRPE STATE CRITICAL: Socket timeout after 30 seconds.

jaar · Post by **jaar** » Wed Sep 26, 2018 5:25 pm

We suddenly received critical alert notifications from Nagios at around 5:55 pm with these details:
CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
CHECK_NRPE STATE CRITICAL: Socket timeout after 30 seconds.

Before this there were no issues at all; everything ran smoothly until yesterday, and it has kept flapping ever since.

1: The port is open on both sides (client and server), and we are able to telnet.
2: No network issues, according to the network staff.
3: No errors found on the server side (dmesg, messages).
4: Load, memory, and network are still OK.
5: Logging in to the server takes some time (1-3 seconds longer than before) even though server load is low.
6: Stopped and started the xinetd service, but the issue remains.
7: Rebooted the server; it was temporarily OK, but the issue started again after 3-4 hours.

Linux Distribution and version? RHEL 6
32 or 64bit? 64 bit
VMware Image or Manual Install of XI? Manual install, physical server

P.S.: We bought the XI license but are not sure whether we bought support.

benjaminsmith · Post by **benjaminsmith** » Thu Sep 27, 2018 10:55 am

Hi Jaar,

There is a troubleshooting guide available on our knowledge base regarding this type of issue. You can find it here:

https://support.nagios.com/kb/article/n ... s-617.html

Have you made any changes to servers or configurations recently?

I would start by checking the firewall and port settings if you've verified NRPE service status. It's also possible to increase the timeout settings if needed.
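As a rough way to check the port from the Nagios server side, something like the following could time a plain TCP connection to the NRPE port. This is only a minimal sketch: the function name time_tcp_connect is made up for illustration, 5666 is the default NRPE port and is assumed here (adjust it if your nrpe.cfg uses another), and a successful TCP connect does not prove that the NRPE handshake or the plugin itself completes in time.

```python
import socket
import time

# Assumption: NRPE listens on its default port; change this if yours differs.
NRPE_PORT = 5666


def time_tcp_connect(host, port=NRPE_PORT, timeout=10.0):
    """Try one TCP connection and return (connected, elapsed_seconds).

    Hypothetical helper for illustration. It only opens and closes a
    socket; it does not speak the NRPE protocol.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            connected = True
    except OSError:
        connected = False
    return connected, time.monotonic() - start
```

Calling time_tcp_connect("remote-host-name") a number of times in a row and comparing the elapsed values may help show whether the slowness is already present at connection time.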

kv123590 · Post by **kv123590** » Wed Oct 31, 2018 5:31 am

Hi Nagios Support,

In my office, our Nagios server suddenly started sending "Socket timeout after 10 seconds" alerts for a few servers. I have also increased the timeout in check_nrpe for these services, but the issue persists. It happens intermittently, not every time: for example, out of 100 checks, 2 or 3 throw Socket timeout alerts.

Also, our end devices are in AWS and Nagios is in the same environment; we don't see any network issues.

Please help me to fix this issue.

Thanks,
Vignesh
+919600068763

benjaminsmith · Post by **benjaminsmith** » Wed Oct 31, 2018 2:50 pm

These types of errors are generally the result of network issues, but since you're experiencing an intermittent problem, it's going to be more difficult to find the source of the issue.

Let's try:

1. Increase timeout in check_nrpe for the services to 59 seconds.

2. Turn on NRPE debugging in the remote host's configuration file, /usr/local/nagios/etc/nrpe.cfg, by setting debug=1. Log messages will be written to /var/log/messages. Please post any error messages.

You'll need to restart NRPE after changing the configuration settings: service nrpe restart or systemctl restart nrpe depending on your system.

3. Check the processes with ps -ef on your Nagios XI server and the remote hosts. Do you have multiple instances running?
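For step 3, a small script could count the NRPE daemon entries in the ps -ef output. This is only a sketch under assumptions: the helper name count_nrpe_daemons and the sample output below are hypothetical, and it matches only processes whose executable is named exactly nrpe. NRPE can fork short-lived children to handle connections, so a count above one is a hint to look closer rather than proof of a problem.

```python
import os


def count_nrpe_daemons(ps_output):
    """Count lines of `ps -ef` output whose executable is named 'nrpe'.

    Hypothetical helper for illustration; check_nrpe and `grep nrpe`
    lines are deliberately not counted.
    """
    count = 0
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        command = fields[7].split()
        if command and os.path.basename(command[0]) == "nrpe":
            count += 1
    return count


# Made-up sample output, not taken from any real host in this thread.
SAMPLE_PS_OUTPUT = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:00 ?        00:00:01 /sbin/init
nagios    2101     1  0 09:01 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
nagios    2450     1  0 09:30 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root      3000  2900  0 10:00 pts/0    00:00:00 grep nrpe
"""

print(count_nrpe_daemons(SAMPLE_PS_OUTPUT))
```

On a real host, the input could come from subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout instead of the sample string.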
