
check_dig: No ANSWER SECTION found

Posted: Mon Aug 13, 2018 5:57 pm
by yomiko
My distributed monitoring setup:
master: nagios core, nsca
slave : nagios core, send_nsca

On the slave, I was able to run the check_dig command manually:
# /usr/lib64/nagios/plugins/check_dig -H <host_ip> -l www.google.com
DNS OK - 0.017 seconds response time (www.google.com. 243 IN A <ip>)|time=0.017362s;;;0.000000
However, the log shows:
[1534197576] SERVICE ALERT: <host>;DNS resolution;CRITICAL;HARD;5;DNS CRITICAL - 0.017 seconds response time (No ANSWER SECTION found)
As a result, send_nsca forwards the same "No ANSWER SECTION found" result to the master.

My host/service cfg:

Code:

    define host {
            use                             linux-server
            host_name                       <host>
            address                         <host_ip>
            max_check_attempts              5
            check_period                    24x7
            notification_interval           30
            notification_period             24x7
            }
    
    define service {
            use                             generic-service         
            host_name                       <host>
            service_description             DNS resolution
            check_command                   check_dig!<host>!www.apple.com
            }
My commands.cfg:

Code:

define command{
        command_name    check_dig
        command_line    $USER1$/check_dig -H $HOSTADDRESS$ -l $ARG1$
        }
# $USER1$ is /usr/lib64/nagios/plugins


I believe it used to work. Is there anything I might be missing?

Thanks!

Re: check_dig: No ANSWER SECTION found

Posted: Tue Aug 14, 2018 11:27 am
by cdienger
"No ANSWER SECTION found" usually means the query did not receive a valid response. Try running the plugin with the -v option; the more verbose output may help identify the problem. A tcpdump capture may also help:

yum -y install tcpdump
tcpdump -s 0 -i any port 53 -w output.pcap

Let it run just long enough to capture the behavior (until the message is logged), then press CTRL+C to stop it and review output.pcap in Wireshark.
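
One more thing that may be worth checking, going only by the definitions posted above: Nagios splits check_command on "!" and hands the pieces to the command as $ARG1$, $ARG2$, and so on. With check_dig!<host>!www.apple.com, $ARG1$ would be <host> and www.apple.com would end up in $ARG2$, which the posted command_line never uses. If that is what happens, the scheduled check would be looking up <host> rather than www.apple.com, which could explain why the manual run succeeds while the service check does not. A minimal shell sketch of that expansion follows. It is purely illustrative: host.example.com and 192.0.2.10 are made-up stand-ins for <host> and <host_ip>, and the splitting is imitated in shell rather than taken from Nagios itself.

```shell
#!/bin/sh
# Imitation of Nagios macro expansion for the posted definitions.
# host.example.com and 192.0.2.10 are hypothetical placeholders, not values from the post.
check_command='check_dig!host.example.com!www.apple.com'
host_address='192.0.2.10'            # stands in for $HOSTADDRESS$
user1='/usr/lib64/nagios/plugins'    # stands in for $USER1$

# Split on "!": field 1 is the command name, the remaining fields become $ARG1$, $ARG2$, ...
old_ifs=$IFS
IFS='!'
set -- $check_command
IFS=$old_ifs
arg1=$2
arg2=$3

# Posted command_line: $USER1$/check_dig -H $HOSTADDRESS$ -l $ARG1$
expanded="$user1/check_dig -H $host_address -l $arg1"

echo "would run:    $expanded"
echo "unused ARG2:  $arg2"
```

If the -v output shows the plugin querying <host> instead of www.apple.com, the two things to try would be dropping the extra argument (check_dig!www.apple.com) or changing the command definition to use $ARG2$ for -l.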