check_bind.sh plugin

linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

check_bind.sh plugin

Post by linuser »

I am trying to get the check_bind.sh script to run on a DNS server running RHEL Server 6.7. The script can be found here:

https://exchange.nagios.org/directory/P ... sh/details

I am a bit confused about how this script is supposed to work. I created a nagios user on the DNS server and added it to the "named" group. The script runs fine, and I can see data being collected and written to the named.stats file. However, new data is only appended to the file when I launch the script manually or through a cron job. My goal is to have this data added automatically. To accomplish that, I need to know:

1) Do I need to have the actual nagios server calling up the check_bind.sh script on the DNS server? If so, how do I set that call up?
2) Does the check_bind.sh script need to be installed on the nagios server and not the DNS server?

We have Nagios 3.5.1 also running on RHEL server 6.7.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: check_bind.sh plugin

Post by jdalrymple »

Based upon the plugin output example:

user@host ~ $ ./check_bind.sh
Bind9 is running. 640 successfull requests, 0 referrals, 3 nxdomains since last check. | 'success'=640 'referral'=0 'nxrrset'=236 'nxdomain'=3 'recursion'=1 'failure'=0 'duplicate'=0 'dropped'=0 
I'd say you simply need to run the script remotely using nrpe.

https://exchange.nagios.org/directory/A ... or/details

Does that not seem to be the case for you?
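
For readers following along, here is a minimal sketch of what running the script through NRPE might look like. The NRPE command name check_bind1 and the check_nrpe options match the ones used later in this thread. The script path on the DNS server and the Nagios command name check_nrpe_nossl are assumptions for illustration and do not come from the plugin's documentation.

```
# On the DNS server, in nrpe.cfg (the script path is an assumption):
command[check_bind1]=/usr/lib64/nagios/plugins/check_bind.sh

# On the Nagios server, a command object that wraps check_nrpe.
# "check_nrpe_nossl" is a hypothetical name; -n disables SSL.
define command {
    command_name    check_nrpe_nossl
    command_line    $USER1$/check_nrpe -n -H $HOSTADDRESS$ -p 5666 -c $ARG1$
}
```

With a layout like this, check_bind.sh stays on the DNS server and only check_nrpe is needed on the Nagios server. The NRPE daemon has to be restarted or reloaded after nrpe.cfg is edited.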
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: check_bind.sh plugin

Post by linuser »

Yes, thanks for the tip. After working through a few hurdles with that, I can get it to work from the nagios server with nrpe.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: check_bind.sh plugin

Post by jdalrymple »

That's awesome. Did you get your perfdata to show?

If so, share a screenshot and, of course, rate the plugin on the Exchange!

Is this thread OK to mark solved and lock?
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: check_bind.sh plugin

Post by linuser »

I may have spoken too soon :( Here is the problem now:

Nagios can poll the server, either through an object cfg file I have added or running the command from the command line. Either way I get results.

From the UI (which polls every 90 seconds):

Bind9 is running. 0 successfull requests, 0 referrals, 0 nxdomains since last check.

From cmd line:

check_nrpe -n -H 192.168.x.x -p 5666 -c check_bind1
Bind9 is running. 0 successfull requests, 0 referrals, 0 nxdomains since last check. | 'success'=0 'referral'=0 'nxrrset'=0 'nxdomain'=0 'recursion'=0 'failure'=0 'duplicate'=0 'dropped'=0

So they are talking to each other to a certain extent. Problem is, this still does not append data to /tmp/named.stats.tmp and /var/named/data/named.stats.
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: check_bind.sh plugin

Post by linuser »

One more point of clarification:

When I manually run the check_bind.sh file on the remote client through CLI I get this:

Bind9 is running. 0 successfull requests, 0 referrals, 0 nxdomains since last check. | 'success'=0 'referral'=0 'nxrrset'=0 'nxdomain'=0 'recursion'=0 'failure'=0 'duplicate'=0 'dropped'=0

This is the same message as I get when I run the below command from the Nagios server itself.

/usr/lib64/nagios/plugins/check_nrpe -n -H 192.168.x.x -p 5666 -c check_bind1

Even though the output is the same whether I run the script locally on the DNS server or through check_nrpe from the Nagios server, the local run at least dumps stats into the named.stats file. Running the same check from the Nagios server does not dump any stats.

These are the stats I am trying to get back from the remote client and send to the Nagios server.
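
When a plugin behaves differently from an interactive shell than it does through NRPE, one common difference is the account the command runs under. The sketch below is a small helper for checking group membership. The account name nagios and the group named are taken from the first post; whether the NRPE daemon on this host actually runs as nagios is an assumption to verify.

```shell
#!/bin/sh
# user_in_group USER GROUP
# Succeeds (exit status 0) if USER is a member of GROUP according to `id`.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fqx -- "$2"
}

# Example usage on the DNS server (account, group, and path are assumptions):
#   ps -o user= -C nrpe                  # which account runs the NRPE daemon
#   user_in_group nagios named && echo "nagios is in the named group"
#   sudo -u nagios /usr/lib64/nagios/plugins/check_bind.sh
```

Running the script with sudo -u as the daemon's account approximates what NRPE does, without the network hop in between.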
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: check_bind.sh plugin

Post by linuser »

UPDATE:

After putting the nagios user in a particular group on the remote side, things started to work. The stats are dumping per the following events:

1) Every five minutes at the check interval
2) If I run the script locally
3) If I run the script manually using check_nrpe from the Nagios server.

Now, that being said, I have 2 questions:

1) What controls this check interval? Is it RNDC on the remote side, or the Nagios server?

2) Even though the data is now being dumped automatically, the Nagios server is still reporting:

Bind9 is running. 0 successfull requests, 0 referrals, 0 nxdomains since last check.
Performance Data: 'success'=0 'referral'=0 'nxrrset'=0 'nxdomain'=0 'recursion'=0 'failure'=0 'duplicate'=0 'dropped'=0

I would think if the new data is being dumped, it should reflect in the above result, but it is not. Is there another piece to this I am missing?
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: check_bind.sh plugin

Post by jdalrymple »

linuser wrote:1) What controls this check interval? Is it RNDC on the remote side, or the Nagios server?
The Nagios server does, through the check_interval and retry_interval directives in the service definition. These could be inherited from an upstream template. That is, unless I'm somehow misunderstanding how this whole setup is working.
linuser wrote:2) Even though the data is being dumped now automatically, the Nagios server is still reporting "Bind9 is running. 0 successfull requests, 0 referrals, 0 nxdomains since last check.
Performance Data: 'success'=0 'referral'=0 'nxrrset'=0 'nxdomain'=0 'recursion'=0 'failure'=0 'duplicate'=0 'dropped'=0
Does `rndc stats` indicate that there is anything going on?
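
As an illustration of where the interval is set, here is a sketch of a Nagios service definition. The template name, host name, and service description are hypothetical, and the check_command assumes a check_nrpe command object that takes the NRPE command name as $ARG1$.

```
define service {
    use                   generic-service         ; template name is an assumption
    host_name             dns01                   ; hypothetical host name
    service_description   BIND statistics
    check_command         check_nrpe!check_bind1  ; assumes check_nrpe passes $ARG1$ to -c
    check_interval        5                       ; time units between regular checks
    retry_interval        1                       ; time units between rechecks after a non-OK result
}
```

Both intervals are counted in units of interval_length, which defaults to 60 seconds, so the values above mean five minutes and one minute. If the directives are missing from the service itself, they are inherited from the template named in use.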
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: check_bind.sh plugin

Post by linuser »

rndc stats dumps to /var/named/data/named.stats, so I can confirm that's working as expected. rndc status says:

version: 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.4
CPUs found: 2
worker threads: 2
number of zones: 21
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running

In /var/log/messages I can also see the scheduled dumps happening:

Sep 22 11:02:51 <host> named[6946]: received control channel command 'stats'
Sep 22 11:02:51 <host> named[6946]: dumpstats complete

I just can't seem to figure out why all the reported data shows no activity when I run the script, even though it dumps into the file. Keep in mind that I can't get activity to report even when I run the script directly on the DNS server rather than from the Nagios server, so I know it's not a problem with the Nagios server.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: check_bind.sh plugin

Post by jdalrymple »

What does the ++ Name Server Statistics ++ section of your named.stats file look like?

++ Name Server Statistics ++
                 368 IPv4 requests received
                   2 requests with EDNS(0) received
                   1 TCP requests received
                 354 responses sent
                   2 responses with EDNS(0) sent
                 327 queries resulted in successful answer
                  15 queries resulted in authoritative answer
                 332 queries resulted in non authoritative answer
                   2 queries resulted in referral answer
                   3 queries resulted in nxrrset
                   7 queries resulted in SERVFAIL
                  15 queries resulted in NXDOMAIN
                 309 queries caused recursion
                  14 duplicate queries received
                 367 UDP queries received
                   1 TCP queries received
Can the user running the nrpe daemon read that file?
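
To inspect that section by hand, a small helper along these lines may be useful. It is only a sketch and not necessarily how check_bind.sh parses the file. It assumes the layout shown above, with a count followed by a label on each line, and it reads the last dump in the file because rndc stats appends a new dump on every run. The function name stat_value is made up for this example.

```shell
#!/bin/sh
# stat_value FILE LABEL
# Print the counter for LABEL from the last "++ Name Server Statistics ++"
# section in FILE. Exits non-zero if the label is not found there.
stat_value() {
    awk -v label="$2" '
        /^\+\+ Name Server Statistics \+\+/ { in_section = 1; value = ""; next }
        /^\+\+ /                            { in_section = 0 }
        in_section {
            count = $1
            line = $0
            sub(/^[ \t]*[0-9]+[ \t]+/, "", line)   # strip the leading counter
            if (line == label) value = count
        }
        END { if (value != "") print value; else exit 1 }
    ' "$1"
}

# Example usage (paths from this thread; the nagios account is an assumption):
#   sudo -u nagios test -r /var/named/data/named.stats && echo readable
#   stat_value /var/named/data/named.stats 'queries resulted in successful answer'
```

If the file is readable by the NRPE account and these counters are non-zero and growing between dumps, the zeros in the plugin output are more likely to come from how the plugin compares consecutive dumps than from BIND itself.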