intermittent CHECK_NRPE: Socket timeout after 30 seconds. er
Posted: Tue Aug 28, 2012 1:50 pm
I have two Linux servers, out of many, that regularly go critical on their NRPE checks with "CHECK_NRPE: Socket timeout after 30 seconds". It happens intermittently, and they tend to recover after about 15 minutes or so, at which point NRPE can connect again.
Any advice? Before you ask, the appropriate IPs are in the NRPE whitelist. I've checked multiple times.
Here's an example of a recent incident. Only the names have been changed to protect the innocent.
***** Nagios XI Alert *****
Nagios has detected a problem with this service.
Notification Type: PROBLEM
Service: Apache Web Server
Host: foo.local
Address: foo.local
State: CRITICAL
Info:
CHECK_NRPE: Socket timeout after 30 seconds.
Date/Time: 2012-08-27 08:37:56
Respond: http://bar.local/nagiosxi//rr.php?uid=5 ... ad1513455a
Nagios URL: http://bar.local/nagiosxi/
And here's the recovery just a few minutes later.
***** Nagios XI Alert *****
Nagios has detected this service has recovered.
Notification Type: RECOVERY
Service: Apache Web Server
Host: foo.local
Address: foo.local
State: OK
Info:
Checking for httpd2: ..running
Date/Time: 2012-08-27 08:43:00
Respond: http://bar.local/nagiosxi//rr.php?uid=5 ... ad1513455a
Nagios URL: http://bar.local/nagiosxi/
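One way to narrow down an intermittent timeout like this is to log, from the Nagios server, whether a plain TCP connection to the NRPE port succeeds at all during an incident. The following is only a rough sketch under some assumptions: NRPE is listening on its default port 5666, the 30-second timeout mirrors the one in the alert, and the function names (`probe`, `format_line`, `watch`) are made up for illustration. It tests TCP reachability only; it does not speak the NRPE protocol or run any plugin, so a successful connect does not rule out a slow check command on the remote side.

```python
import socket
import time

# Assumption: NRPE's default port. Change it if nrpe.cfg sets another server_port.
NRPE_PORT = 5666


def probe(host, port=NRPE_PORT, timeout=30.0):
    """Attempt one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and connects, honoring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers timeouts, refused connections, and name resolution failures.
        return False, time.monotonic() - start, str(exc)


def format_line(host, ok, elapsed, error):
    """Build one timestamped log line for a probe result."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} {host} {status} {elapsed:.2f}s {error}".rstrip()


def watch(host, interval=60.0, count=None):
    """Probe repeatedly, printing one line per attempt.

    count=None keeps going until interrupted (Ctrl-C).
    """
    done = 0
    while count is None or done < count:
        ok, elapsed, error = probe(host)
        print(format_line(host, ok, elapsed, error), flush=True)
        done += 1
        time.sleep(interval)
```

Calling `watch("foo.local")` and comparing the timestamps of any FAIL lines against the alert times would suggest whether the problem is at the network or connection level, or further up in NRPE or the check itself.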