problem with check_dns

vinmansbrew
Posts: 26
Joined: Wed Jan 22, 2020 5:26 pm

problem with check_dns

Post by vinmansbrew »

So, a bit of background. My Nagios Core server runs Nagios Core 4.4.6 on RHEL 6.10, and I am working on upgrading it.
I have Nagios configured to check DNS on a DNS master server. It checks the named files. These 2 files are modified on the master, and the changes then propagate down to the DNS slaves.
The DNS master server was RHEL 6.10. I upgraded it to RHEL 8.9, which went fine for everything but 1 thing.
All the Nagios checks pass, except for the check of these named files.

Code: Select all

define service{
        use linux-generic-service
        host_name nagios
        servicegroups linux-dns-services
        service_description DNS cord.edu
        check_command check_dns_zone!brogurt.com
        check_interval 5
}
This is the service definition. It is stored on the Nagios Core server and has not changed. DNS goes through port 53, which is open for TCP and UDP, as it was before.

ALL other checks work fine, from the client. This check is performed by the nagios core server. They are on different subnets, but they have always been.
I imagine it's probably something simple, I just haven't found it.

Thanks

Edit: I had reverted back, since this is a virtual server.
I updated PHP from a really dead version, 5.3, to a slightly less dead version, 7.3.25. I didn't want to jump too far.
So, it still worked after that.
Then I upgraded from Core 4.4.5 to Core 4.5.0.
So, these changes have not mattered. Just wanted to keep this current. I didn't think a PHP or Core change would be the issue, however, since it started when the DNS server was upgraded.
The plugins and NRPE are updated, too. NRPE is 4.1, and the plugins are 2.4.7.

If it helps, this is on the Nagios Core server: eth0 has the IP, netmask, and gateway specified. NM controlled is yes. DNS servers are not specified. peerdns=yes, ipv6 off, bootproto=none.

The master DNS server has IP and netmask set, no gateway, no DNS servers set, NM controlled = no, bootproto static. Not much else is set.
I will say that I can curl connect back and forth between the 2 servers. All other NRPE checks work. Only the check_dns check against the named config files fails.
The master/slaves are working, and changes made to the master do propagate. Permissions are 444, owned by root. I use RCS to keep track of revisions.
Last edited by vinmansbrew on Thu Feb 15, 2024 11:53 am, edited 1 time in total.
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: problem with check_dns

Post by jsimon »

Hi @vinmansbrew,

My first thought reading through this is that you may need to check the permissions for the user that is running the NRPE process. Since you upgraded your OS, this user may not have file permissions for the file that you're checking.

If this isn't the issue, could you provide the check_dns_zone command (with sanitized values per your discretion), as well as the response you're getting back from the check command? This would be helpful for troubleshooting further.
vinmansbrew
Posts: 26
Joined: Wed Jan 22, 2020 5:26 pm

Re: problem with check_dns

Post by vinmansbrew »

Just so it's here: these are the contents of check_dns_zone. I had to change the server names to appease my security overlords.
#!/bin/bash
# Compare the SOA serial of a zone ($1) on both slaves and on the master.

# The third field of the +short SOA answer is the zone serial.
slave01=$(/usr/bin/dig @slave01.domain.com "$1" SOA +short +notcp +time=1 +tries=1 +retry=1 | /bin/cut -d' ' -f3)
slave02=$(/usr/bin/dig @slave02.domain.com "$1" SOA +short +notcp +time=1 +tries=1 +retry=1 | /bin/cut -d' ' -f3)
master=$(/usr/bin/dig @master.domain.com "$1" SOA +short +notcp +time=1 +tries=1 +retry=1 | /bin/cut -d' ' -f3)

# An empty or non-numeric serial means the server gave no usable answer.
case $slave01 in
    ''|*[!0-9]*)
        echo "DNS CRITICAL - slave01 is unreachable"
        exit 1
        ;;
esac

case $slave02 in
    ''|*[!0-9]*)
        echo "DNS CRITICAL - slave02 is unreachable"
        exit 1
        ;;
esac

case $master in
    ''|*[!0-9]*)
        echo "DNS CRITICAL - master is unreachable"
        exit 1
        ;;
esac

# All three serials must match.
if [ "$slave01" -ne "$slave02" ] 2>/dev/null; then
    echo "DNS CRITICAL - slave01 and slave02 are not in sync"
    exit 1
fi

if [ "$slave02" -ne "$master" ] 2>/dev/null; then
    echo "DNS CRITICAL - slave02 and master are not in sync"
    exit 1
fi

if [ "$master" -ne "$slave01" ] 2>/dev/null; then
    echo "DNS CRITICAL - master and slave01 are not in sync"
    exit 1
fi

echo "DNS OK"
exit 0

The error I get from the check, however, is "DNS critical - Master is unreachable".
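
A side note on the message versus the state Nagios records: plugins report their state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so a script that prints "DNS CRITICAL" and then exits with 1 shows up as WARNING. The sketch below is one possible way to keep the text and the exit code in step. The `report` helper and the `STATE_*` names are made up for illustration and are not part of the posted script.

```shell
#!/bin/bash
# Plugin return codes as Nagios interprets them.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Hypothetical helper: print a status line and return the matching code.
# Arguments: state name (OK, WARNING, CRITICAL, UNKNOWN), message text.
report() {
    local state=$1 message=$2
    echo "DNS $state - $message"
    case $state in
        OK)       return "$STATE_OK" ;;
        WARNING)  return "$STATE_WARNING" ;;
        CRITICAL) return "$STATE_CRITICAL" ;;
        *)        return "$STATE_UNKNOWN" ;;
    esac
}

# Inside a plugin this could be used as:
#   report CRITICAL "master is unreachable"; exit $?
```

This does not explain why the master is reported as unreachable; it only affects which state the failure is displayed as.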

So, it makes me wonder what got blocked. Iptables didn't change when the master was upgraded.
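
One way to narrow down what got blocked, sketched under assumptions: the script's case pattern treats an empty answer and a non-numeric answer (for instance, a diagnostic line from dig instead of an SOA record) the same way, so "unreachable" can mean either. The helper names below (`soa_serial`, `classify_serial`, `query_soa`) are hypothetical; the dig options are the ones from the script.

```shell
#!/bin/bash
# Debugging sketch, not the original plugin: report separately whether the
# SOA query returned nothing, something non-numeric, or a usable serial.

# Extract the third space-separated field (the serial) from one line of
# `dig ... SOA +short` output passed in as the first argument.
soa_serial() {
    printf '%s\n' "$1" | cut -d' ' -f3
}

# Classify a serial string with the same patterns the plugin uses,
# but keep the empty and non-numeric cases apart.
classify_serial() {
    case $1 in
        '')        echo "empty" ;;
        *[!0-9]*)  echo "non-numeric" ;;
        *)         echo "serial $1" ;;
    esac
}

# Query one server with the plugin's options (UDP only, 1-second timeout,
# one attempt). Arguments: server name, zone name.
query_soa() {
    dig "@$1" "$2" SOA +short +notcp +time=1 +tries=1 +retry=1
}

# Example, using the sanitized placeholder names from this thread:
#   classify_serial "$(soa_serial "$(query_soa master.domain.com brogurt.com)")"
```

Running the example as the nagios user on the Nagios Core server, once per DNS server, should show whether the master returns nothing at all within the 1-second window or returns something the script cannot parse.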

I did just try a very simple dig command:
"dig master.domain.com namedfile.com"

It appears to have worked.

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6379
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
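
Note that dig only treats an argument as the server to query when it is prefixed with @. In the command above, master.domain.com is therefore looked up as a name through the default resolver, so a NOERROR answer does not by itself show that the master answered. Below is a sketch of queries that would compare more directly with what the script sends. They are printed rather than executed; the `print_dig_tests` name is hypothetical and the hostnames are the sanitized placeholders from this thread.

```shell
#!/bin/bash
# Dry-run helper: print the dig invocations worth comparing by hand.
# Nothing is executed here; copy the lines you want to run.
# Arguments: server name, zone name.
print_dig_tests() {
    local server=$1 zone=$2
    # Same query the script sends: UDP only, 1-second timeout, one attempt.
    echo "dig @$server $zone SOA +short +notcp +time=1 +tries=1 +retry=1"
    # Same query over TCP, to see whether only UDP is affected.
    echo "dig @$server $zone SOA +short +tcp"
    # Same query with dig's default timeout and retry behaviour.
    echo "dig @$server $zone SOA +short"
}

print_dig_tests master.domain.com namedfile.com
```

If the first line fails while the second or third succeeds, that would point toward UDP handling or response time rather than toward the zone itself.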
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: problem with check_dns

Post by jsimon »

When you say the script appeared to have worked, what user were you running it as? If you haven't already I would recommend doing the following:

Code: Select all

su nagios
dig master.domain.com namedfile.com
We need to be sure the nagios user can run this. If you ran it successfully as root, that could be a discrepancy.
vinmansbrew
Posts: 26
Joined: Wed Jan 22, 2020 5:26 pm

Re: problem with check_dns

Post by vinmansbrew »

Yes, I can run the dig command as the nagios user.
I get a successful answer.
talkexisting
Posts: 2
Joined: Tue Mar 19, 2024 3:23 am

Re: problem with check_dns

Post by talkexisting »

Hi @jsimon, I'm having the same problem. Luckily, everything was normal when running this command.