Hi Team,
We migrated from Nagios XI 2014R2.7 to Nagios XI 5.5.5. Since then we get "Temporary failure in name resolution" errors when monitoring HTTP services by hostname (the checks recover on their own after some time). We have also added the old DNS entries to resolv.conf on the new collector. Please suggest how to fix this issue.
Below is the command as defined in the GUI:
$USER1$/check_http -c 15 -t 15 -f follow -H $ARG1$ -s $ARG2$ -u $ARG3$
Error
Critical
Temporary failure in name resolution
HTTP CRITICAL - Unable to open TCP socket
Temporary failure in name resolution in http services
Re: Temporary failure in name resolution in http services
Are you able to resolve the hostname from the XI server's command line if you run nslookup? It would look like:
nslookup <hostname>
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Temporary failure in name resolution in http services
Yes, we get a reply from nslookup.
As mentioned earlier, the HTTP service recovers after some time while still using the hostname. This error occurs only with HTTP services.
Note: We do not see this issue on our Nagios XI 2014R2.7 server.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Temporary failure in name resolution in http services
@rtsupport, Can you show me your service definition and all of its arguments? You can open the service check in the Core Config Manager and take a screenshot of the whole page. Usually, Nagios uses -H $HOSTADDRESS$, so I'd like to see whether the arguments are set properly.
Also, the next time you see the resolution error, I suggest rerunning the nslookup command and running the nmap command against the HTTP server's IP address.
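One way to tell a DNS problem from an HTTP problem is to run the plugin by hand, once letting it resolve the name and once with a fixed address. The sketch below assumes the plugin is installed at /usr/local/nagios/libexec/check_http (the usual target of $USER1$) and relies on check_http's -I option, which supplies the address to connect to so that -H is only sent as the Host header. The run_check helper, the hostname, the expected string and the IP are placeholders for illustration, not values from this thread.

```shell
# Path to the plugin; override PLUGIN if $USER1$ points elsewhere (assumption).
PLUGIN=${PLUGIN:-/usr/local/nagios/libexec/check_http}

# run_check VHOST EXPECT URI [IP]
# Runs check_http with the same options as the service command.
# When IP is given, -I is appended so the plugin does not need a name lookup.
run_check() {
    vhost=$1 expect=$2 uri=$3 ip=${4:-}
    set -- -c 15 -t 15 -f follow -H "$vhost" -s "$expect" -u "$uri"
    if [ -n "$ip" ]; then
        set -- "$@" -I "$ip"
    fi
    "$PLUGIN" "$@"
}

# Hypothetical usage:
#   run_check www.example.com "Welcome" /             # plugin resolves the name
#   run_check www.example.com "Welcome" / 192.0.2.10  # plugin connects to the IP
```

If the second form stays OK while the first one intermittently fails, that would point at name resolution on the XI server rather than at the web servers.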
Re: Temporary failure in name resolution in http services
Please refer to the attached screenshot. I hope this is what you requested; let me know if I missed something.
Also, can you tell me which nmap command we should run?
Re: Temporary failure in name resolution in http services
@rtsupport, Run the following command the next time you see this error: "Unable to open TCP socket"
nmap xxxxxcorp.xerox.com
And this command, also while the service check is critical:
nslookup xxxxxcorp.xerox.com
In your command (common_check_httpd_port), please increase the timeout value from -t 15 to -t 40 and let me know if this fixes the issue.
Re: Temporary failure in name resolution in http services
Please refer to the screenshot attached to your PM, which has all the requested details. It shows the service in a critical state that recovered within a few seconds; however, we do not get any error when running the commands on the command line.
Re: Temporary failure in name resolution in http services
@rtsupport, The command in the console most likely didn't catch the error because the DNS/DHCP issue resolved on its own after a few seconds, while the XI check was still critical because its next recheck wasn't due yet. To avoid false notifications, you could increase the number of check attempts before Nagios sends out a notification.
Setting the XI server's IP address to static and disabling DHCP would be the next step to fix this problem.
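Because the failure clears within seconds, a lookup typed by hand will usually miss it. A small logger along the following lines could be left running on the XI server to catch it. This is only a sketch: it assumes getent is available and that `getent ahosts` goes through the same system resolver path (getaddrinfo, /etc/nsswitch.conf, /etc/hosts) that normally produces the "Temporary failure in name resolution" message, whereas nslookup queries the DNS server directly. The function names and defaults are made up for illustration.

```shell
# Resolve a name once through the system resolver; succeeds only if it resolves.
resolve_once() {
    getent ahosts "$1" >/dev/null 2>&1
}

# watch_dns HOST [INTERVAL_SECONDS] [COUNT]
# Prints a timestamped line for every failed lookup.
# COUNT=0 (the default) keeps going until interrupted.
watch_dns() {
    host=$1
    interval=${2:-5}
    count=${3:-0}
    i=0
    failures=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        if ! resolve_once "$host"; then
            failures=$((failures + 1))
            echo "$(date '+%Y-%m-%d %H:%M:%S') FAIL $host"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "checked=$i failures=$failures"
}

# Hypothetical usage: one lookup every 2 seconds for an hour
#   watch_dns xxxxxcorp.xerox.com 2 1800 >> /tmp/dns_watch.log
```

Timestamps of the FAIL lines can then be compared with the times at which the HTTP checks went critical.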
Re: Temporary failure in name resolution in http services
Hi Team,
We have changed the interval time, but the services now go into a flapping state, and we would not be able to provide a real-time fix if a real issue lasts 10-15 minutes, which would impact the business. It is also hard to change the interval time for every http/https service and host, as we have 300+ services and hosts configured as http.
The IP address of the XI servers is static.
The question is why we see this issue (only with HTTP services) on Nagios XI 5.5.5 and not on Nagios XI 2014R2.7. Can you please check and suggest a fix?
Re: Temporary failure in name resolution in http services
@rtsupport, When you migrated from Nagios XI 2014R2.7 to Nagios XI 5.5.5, did you use the same physical server?
If not, are both servers on the same subnets? Check the contents of the /etc/hosts file on the original server and on the new server.
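To make that comparison easier, something like the following could be run on both servers and the two outputs diffed. It is only a sketch: snapshot_resolver is a made-up helper name, the optional ROOT prefix exists only so the function can be pointed at copied files, and /etc/resolv.conf and /etc/nsswitch.conf are included on the assumption that the name servers or the hosts lookup order may also differ between the two systems.

```shell
# snapshot_resolver [ROOT]
# Prints the effective (non-comment, non-blank) lines of the files that
# influence hostname lookups, each under a "== <file>" header.
snapshot_resolver() {
    root=${1:-}
    for f in /etc/hosts /etc/resolv.conf /etc/nsswitch.conf; do
        echo "== $f"
        if [ -r "$root$f" ]; then
            # grep exits non-zero when nothing is left; that is not an error here.
            grep -Ev '^[[:space:]]*(#|$)' "$root$f" || true
        else
            echo "(not readable)"
        fi
    done
}

# Hypothetical usage on each server:
#   snapshot_resolver > /tmp/resolver-$(hostname).txt
```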
We can try compiling an older version of the check_http plugin and using it instead of the existing one. I still think this is more likely a networking issue than a Nagios issue, but that would be a good troubleshooting step.
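# Build the older plugin release in a scratch directory
cd /tmp/
wget https://github.com/nagios-plugins/nagi ... 1.3.tar.gz
# Unpack the downloaded tarball before changing into the source directory
cd nagios-plugins-2.1.3
./configure
make
cd plugins
# Replace the installed check_http with the older build
mv check_http /usr/local/nagios/libexec/check_http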