DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 1:09 pm
by snapon_admin
So, I've run into this problem before, and I've never really been able to fully sort it out. A host that is sending SNMP traps to Nagios is not resolving in DNS properly. This host is at IP 10.245.64.47 and its name is lisgrid01p. We have 4 DNS servers in total: 2 in our main location in Lisle and 2 in Kenosha for servers there (one primary, one backup). There are 2 because one is used for Windows hosts and one for Unix. This particular host lives in Lisle and is Unix, so it should be using that DNS server for resolution. My /etc/resolv.conf looks like this:

Code: Select all

# Generated by NetworkManager
domain SSG5-Serial
search SSG5-Serial
nameserver 10.245.70.12
nameserver 10.245.128.4
nameserver 10.245.128.38
nameserver 10.245.70.13
nameserver 68.87.77.130
An nslookup on that host looks like this:

Code: Select all

C:\Users\pr6449>nslookup 10.245.64.47 10.245.70.12
Server:  lisadns01p.snapon.com
Address:  10.245.70.12

Name:    lisgrid01p.snapon.com
Address:  10.245.64.47
Notice that the DNS server IP I'm using is the one at the top of that list, and it still is not working properly. In the past, whichever IP was at the top of the list worked while the others did not, but now even the top one isn't working. Note that I changed the order a few minutes ago; the two 10.245.128.xx DNS servers used to be at the top. I changed the order hoping it would correct my issue, and it didn't. Regardless, I shouldn't have to change the order, since DNS lookups are supposed to start with the top address and work their way down the list until they get a result. That is not happening, nor has it ever happened that way in my experience. Any thoughts on how this can be corrected?

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 2:00 pm
by avandemore
One issue is:

Code: Select all

nameserver 10.245.70.12
nameserver 10.245.128.4
nameserver 10.245.128.38
nameserver 10.245.70.13
nameserver 68.87.77.130
On CentOS/RHEL 6/7 the resolver honors at most 3 nameserver entries by default and queries them in order, so the last two entries above are never used. For Windows DNS questions, please contact your Windows administrator.
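
As a rough illustration of that limit, the sketch below labels each nameserver line in a resolv.conf-style file as used or ignored. The function name list_nameservers is made up for this example, and the cutoff of 3 is hard-coded on the assumption that the stock glibc resolver (MAXNS = 3) is in use.

```shell
#!/bin/sh
# Hypothetical helper, not part of any package: show which nameserver
# entries a resolver limited to 3 entries (glibc MAXNS) would honor.
list_nameservers() {
    # $1: path to a resolv.conf-style file (defaults to /etc/resolv.conf)
    awk -v max=3 '
        $1 == "nameserver" {
            n++
            # Entries past the limit are silently dropped by the resolver
            printf "%s %s\n", (n <= max ? "used" : "ignored"), $2
        }
    ' "${1:-/etc/resolv.conf}"
}
```

Run as `list_nameservers /etc/resolv.conf`. With the five entries quoted above, 10.245.70.13 and 68.87.77.130 would be reported as ignored. Also keep in mind that the resolver only moves on to the next server when the current one times out or fails to respond; a negative answer from the first server is treated as the final result.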

Please use nslookup from the Nagios server, e.g.:

Code: Select all

# nslookup example.org 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   example.org
Address: 93.184.216.34
I personally find the output from dig or drill to be more useful:

Code: Select all

# dig @8.8.8.8 example.org

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.4 <<>> @8.8.8.8 example.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47288
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.org.                   IN      A

;; ANSWER SECTION:
example.org.            17994   IN      A       93.184.216.34

;; Query time: 34 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Dec 13 12:58:32 CST 2016
;; MSG SIZE  rcvd: 56

What DNS server software are you running: BIND, Windows, etc.? How is the host defined, e.g. are there multiple round-robin records listed for the hostname?

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 2:34 pm
by snapon_admin

Code: Select all

[root@lisl-ngos-01-pv etc]# nslookup lisgrid01p.snapon.com
Server:         10.245.70.12
Address:        10.245.70.12#53

Name:   lisgrid01p.snapon.com
Address: 10.245.64.47
But the trap still shows up in the Unconfigured Objects list under the host's IP.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 2:53 pm
by avandemore
Unconfigured Objects has nothing to do with DNS resolution. It means Nagios received a passive check result and found no configured host or service to assign it to.

For example:

Code: Select all

[root@avandemore-centos6 ~]# cat /root/test
bad host name   Passive Service 0       test message

[root@avandemore-centos6 ~]#
results in:
[Attachment: passive3.png]
because I don't have a host called "bad host name", so Nagios doesn't know what to do with the result.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 4:38 pm
by snapon_admin
I know that. But I HAVE a host in Nagios with the correct DNS name. When the trap is sent it shows up in /var/log/snmptt.log, but does not show up on the existing host in Nagios. Instead, it shows up as an unconfigured object because it's showing up as the IP instead of the host name.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 5:04 pm
by avandemore
You can check /etc/nsswitch.conf to ensure the correct search order is set up, but other than that the Nagios system is going to use whatever DNS sends back. If DNS is sending back flaky results, do either of these:
1. Add an entry to /etc/hosts and ensure nsswitch.conf reads it first.
2. Fix your DNS.
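
For option 1, the "search order" is the hosts: line in /etc/nsswitch.conf. A minimal sketch of a check follows; the function name files_before_dns is hypothetical, and it only inspects the order of the files and dns keywords, nothing else on the line.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if "files" appears before
# "dns" on the hosts: line, i.e. /etc/hosts is consulted before DNS.
files_before_dns() {
    # $1: path to an nsswitch.conf-style file (defaults to /etc/nsswitch.conf)
    awk '
        $1 == "hosts:" {
            for (i = 2; i <= NF; i++) {
                if ($i == "files") { f = i }   # position of "files"
                if ($i == "dns")   { d = i }   # position of "dns"
            }
        }
        # OK when files is present and either dns is absent or comes later
        END { exit !(f && (!d || f < d)) }
    ' "${1:-/etc/nsswitch.conf}"
}
```

Something like `files_before_dns && echo "hosts file is read first"` would confirm the order. The matching /etc/hosts entry would take the usual form `10.245.64.47 lisgrid01p.snapon.com lisgrid01p`, where the first name after the IP is the canonical one. To test the result, `getent hosts 10.245.64.47` goes through nsswitch.conf, whereas nslookup and dig query the DNS server directly and never look at /etc/hosts.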

Also, you don't have a host called lisgrid01p.snapon.com, which is how the resolver sees it. If you put its IP address in the address field instead of the FQDN, do you get the expected behavior?

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 5:11 pm
by snapon_admin
I do have a host set up that is lisgrid01p. That's my point. This was working before I had to change the resolv.conf file so that the Windows servers were listed first. When I did that, traps started coming in as the IP. That is how they're coming in now. If I add a host with the IP as the name, it works fine. That's my issue here: clearly the Nagios server sees that IP as the right host, as shown in the nslookup, but it's still dumping my traps into unconfigured objects because it wants to use the IP instead. I've removed all but 2 DNS servers from the resolv.conf file and it's still having the same issue.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 5:22 pm
by avandemore
Please reread my previous post carefully.
avandemore wrote: you don't have a host called lisgrid01p.snapon.com which is how the resolver is seeing it
I made this statement based upon the last profile you sent. It most certainly did not have a host called lisgrid01p.snapon.com. It did have one called lisgrid01p with its address set to the FQDN. Please follow the previous instructions and report the result.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 5:46 pm
by snapon_admin
The .snapon.com is irrelevant I think. All of the other hosts that send traps drop the suffix anyway. I'm not sure what you want me to do with that nsswitch.conf file. I don't know what order you're referring to.

Re: DNS lookup failing even though resolv.conf is correct

Posted: Tue Dec 13, 2016 5:50 pm
by avandemore
It is not irrelevant. I have also confirmed that the host_name must match exactly and that the address field is not used for matching, so the test I suggested will fail. You need to change your host_name attribute to lisgrid01p.snapon.com or follow one of the other solutions I listed earlier.
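
For reference, a host definition along those lines might look roughly like the sketch below. The generic-host template is an assumption borrowed from the sample configuration and may not exist in your setup; the name and IP are the ones from this thread.

```
define host {
    use         generic-host            ; assumed template name, substitute your own
    host_name   lisgrid01p.snapon.com   ; must match the name the trap arrives with
    alias       lisgrid01p
    address     10.245.64.47
}
```

Renaming an existing host changes how its services and history are referenced, so treat this as a starting point and verify the configuration before applying it.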