CHECK_NRPE: Socket timeout after 90 seconds

Posted: Thu Jun 28, 2018 7:00 pm
by computerone
Hi all,
I'm a Windows admin with limited Linux knowledge, so go easy on me. :)

I'm monitoring DFS Replication on two Windows file servers with Nagios XI 5.4.13. It had been working perfectly until two days ago, when I started seeing the above error on the service. Both hosts and all other services on them are still up and working correctly, and I've confirmed that DFS is functioning correctly on the servers themselves. I have restarted the NSClient++ service on both servers, but the check still returns the socket timeout error.

I have looked through the Nagios wiki, but the only articles I could find about this error cover Linux servers, not Windows boxes.

I did find this suggestion in another forum post: https://support.nagios.com/forum/viewto ... hilit=nmap
From what I can see, everything is running correctly (nrpe is running on the Nagios server, as is xinetd). I have extended the timeout to 120 seconds, and I am still receiving the socket timeout error.

Any assistance you can provide would be much appreciated.

Re: CHECK_NRPE: Socket timeout after 90 seconds

Posted: Fri Jun 29, 2018 10:51 am
by scottwilkerson
Can you run the DFS plugin directly on the Windows system and time how long it takes to complete?
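
One way to time the check outside of Nagios is to wrap it in a small script that enforces its own deadline, so a hung plugin doesn't tie up the console indefinitely. The following is a minimal Python sketch, not part of Nagios or NSClient++. `time_command` is a hypothetical helper name, and the self-check at the bottom uses the Python interpreter itself as a stand-in for the real plugin command.

```python
import subprocess
import sys
import time


def time_command(cmd, timeout_s):
    """Run cmd and return (elapsed_seconds, exit_code).

    exit_code is None when the command was still running at the deadline.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        code = completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        code = None
    return time.monotonic() - start, code


# Stand-in command (assumption): replace the list with the real plugin call.
elapsed, code = time_command([sys.executable, "-c", "print('ok')"], timeout_s=30)
print(f"finished in {elapsed:.2f}s, exit code {code}")
```

To time the real plugin, pass its command line as a list (for a PowerShell script, something like `powershell.exe`, `-File`, and the path to your DFS script). A `None` exit code means the command ran past the deadline, which would match the socket timeout Nagios reports.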

Re: CHECK_NRPE: Socket timeout after 90 seconds

Posted: Sun Jul 01, 2018 8:29 pm
by computerone
Hi Scott,
Thanks for replying. I've just run the script manually on the server. So far it's been 15 minutes and it still hasn't completed.

Clearly something is not working correctly. I'll check through the logs and see if I can find what's causing the delay.

One of my colleagues suggested a server reboot may resolve the issue. This is a production file server, so I'd rather not do that if it can be avoided. Do you have any suggestions for common causes of delay I can look into?

EDIT: I turned on debugging mode, and it looks like the script is stalling when it tries to contact the second server in the DFSR group. Could this be a firewall issue?
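
A quick way to get a first hint on the firewall question is to test whether a TCP connection from one file server to the other succeeds at all. The sketch below is a minimal Python example under stated assumptions: `tcp_port_open` is a hypothetical helper, and port 135 (the RPC endpoint mapper that remote WMI/DCOM calls start from) is my guess at the relevant port, not something confirmed by the script's debug output.

```python
import socket


def tcp_port_open(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `tcp_port_open("fileserver2", 135)` run from the first server, where `fileserver2` is a placeholder for the second DFSR member. A `False` result points toward a firewall or network problem. A `True` result does not rule one out, because RPC goes on to use dynamically assigned ports after the initial contact on 135.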

Harison

Re: CHECK_NRPE: Socket timeout after 90 seconds

Posted: Sun Jul 01, 2018 10:09 pm
by computerone
OK, it looks like the issue revolves around the Get-WmiObject script.

A reboot of the server fixed the issue in one direction, but it's still timing out in the other direction.

I'll see if I can resolve it without rebooting the primary server.

Thanks for the assistance.

Re: CHECK_NRPE: Socket timeout after 90 seconds

Posted: Mon Jul 02, 2018 10:49 am
by scottwilkerson
Sounds good. Let us know if we can be of further assistance.