We suddenly received a critical alert notification from Nagios at around 5:55 pm with these details:
CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
CHECK_NRPE STATE CRITICAL: Socket timeout after 30 seconds.
Before this there were no issues at all; everything ran smoothly until yesterday, and the check has kept flapping ever since.
1: The port is open from both sides (client and server); we are able to telnet.
2: No network issue, according to the network staff.
3: No errors found on the server side (dmesg, messages).
4: Load, memory and network are still OK.
5: Login takes some time to access the server (1-3 seconds more than before), even though the server load is low.
6: Stopped and started the xinetd service, but the issue remains.
7: Rebooted the server; it was temporarily OK, but the issue started again after 3-4 hours.
Linux distribution and version? RHEL 6
32 or 64 bit? 64 bit
VMware image or manual install of XI? Manual install, physical server
P.S.: We bought the XI license, but we are not sure whether we bought support.
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: CHECK_NRPE STATE CRITICAL: Socket timeout after 30 secon
Hi Jaar,
There is a troubleshooting guide available on our knowledge base regarding this type of issue. You can find it here:
https://support.nagios.com/kb/article/n ... s-617.html
Have you made any changes to servers or configurations recently?
I would start by checking the firewall and port settings if you've verified NRPE service status. It's also possible to increase the timeout settings if needed.
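As a rough way to check the port from the Nagios server, here is a minimal Python sketch that opens a TCP connection to the NRPE port many times and records failures and the slowest connect time. It assumes NRPE listens on its default port 5666; the function names probe_port and probe_many are made up for this example. A successful connect only shows that the port is reachable. It does not prove that NRPE completes the handshake or that allowed_hosts permits the caller.

```python
import socket
import time


def probe_port(host, port=5666, timeout=10.0):
    """Attempt one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def probe_many(host, port=5666, attempts=100, timeout=10.0, pause=0.0):
    """Repeat the probe; collect failed attempts and the slowest connect time."""
    failures = []
    slowest = 0.0
    for attempt in range(attempts):
        ok, elapsed, err = probe_port(host, port, timeout)
        slowest = max(slowest, elapsed)
        if not ok:
            failures.append((attempt, err))
        if pause:
            time.sleep(pause)
    return {"attempts": attempts, "failures": failures, "slowest": slowest}


# Example call (the host name is a placeholder, not taken from this thread):
# print(probe_many("remote-host.example", attempts=100, pause=1.0))
```

If some attempts fail or take several seconds while the rest connect instantly, that points toward the network path or the listener rather than the plugin being run.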
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: CHECK_NRPE STATE CRITICAL: Socket timeout after 30 secon
Hi Nagios Support,
In my office, our Nagios server suddenly started sending "Socket timeout after 10 seconds" alerts for a few servers. I have also increased the timeout in check_nrpe for these services, but the issue still persists. It happens intermittently, not on every check: for example, out of 100 checks, 2 or 3 throw Socket timeout alerts.
Also, our end devices are in AWS and Nagios is in AWS as well; we don't see any network issues.
Please help me to fix this issue.
Thanks,
Vignesh
+919600068763
-
benjaminsmith
Re: CHECK_NRPE STATE CRITICAL: Socket timeout after 30 secon
These types of errors are generally the result of network issues, but since you're experiencing an intermittent problem, it's going to be more difficult to find the source of the issue.
Let's try:
1. Increase timeout in check_nrpe for the services to 59 seconds.
2. Turn on NRPE debugging in the remote host's configuration file /usr/local/nagios/etc/nrpe.cfg by setting debug=1. Log entries will be written to /var/log/messages. Please post any error messages.
You'll need to restart NRPE after changing the configuration settings: service nrpe restart or systemctl restart nrpe, depending on your system.
3. Check the processes with ps -ef on your Nagios XI server and the remote hosts. Do you have multiple instances running?
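For step 3, here is a small Python sketch that picks the NRPE processes out of ps -ef output, so a stray second daemon is easy to spot. It assumes the standard eight-column ps -ef layout (UID PID PPID C STIME TTY TIME CMD) and that the executable is named nrpe; the helper names are made up for this example.

```python
import subprocess


def nrpe_processes(ps_output):
    """Return the `ps -ef` lines whose executable is named nrpe."""
    matches = []
    for line in ps_output.splitlines():
        # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header line or something that is not a process row
        executable = fields[7].split()[0]
        if executable.rsplit("/", 1)[-1] == "nrpe":
            matches.append(line)
    return matches


def current_ps_output():
    """Run `ps -ef` on the local machine and return its text output."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Example call on the host being checked:
# for row in nrpe_processes(current_ps_output()):
#     print(row)
```

When NRPE runs under xinetd, short-lived nrpe processes can show up while a check is being served, so several rows are only suspicious if they persist between checks.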