Service check results different in GUI vs. from command line

rickwilson7425
Posts: 125
Joined: Tue Mar 18, 2014 3:20 pm

Post by rickwilson7425 »

I am working on consolidating a couple of Nagios 3.2 servers onto one 3.5 server. I have some DNS checks that use the check_cluster plug-in.

I get a good return of information, formatted properly, when I run the check from the command line as either root or nagios user.

When the check is run from within Nagios I get a critical error saying pretty much the opposite of what the command line says. This happens for a number of checks based off the check_cluster plug-in.

The checks work fine in the existing 3.2 servers.

Rick
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Service check results different in GUI vs. from command

Post by ssax »

Please post the exact command (sanitized of course) that you're running from the command line and the exact command and service definition from the non-working one so that we can try to spot any differences.
rickwilson7425
Posts: 125
Joined: Tue Mar 18, 2014 3:20 pm

Re: Service check results different in GUI vs. from command

Post by rickwilson7425 »

Here is the working command line:

perl check_agregate -v mque -t "unsent=(#d+).#d+" -d $SERVICEPERFDATA:relay1.dcpn:mailq$,$SERVICEPERFDATA:relay2.dcpn:mailq$,$SERVICEPERFDATA:relay3.dcpn:mailq$,$SERVICEPERFDATA:relay4.dcpn:mailq$,$SERVICEPERFDATA:relay1.wcpn:mailq$,$SERVICEPERFDATA:relay2.wcpn:mailq$

mque OK - =:relay1.dcpn:mailq$= =:relay2.dcpn:mailq$= =:relay3.dcpn:mailq$= =:relay4.dcpn:mailq$= =:relay1.wcpn:mailq$= =:relay2.wcpn:mailq$=

The command and service definitions are in the attachment, along with a clip of the statuses:
Attachments
DNS-Cluster.docx
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Service check results different in GUI vs. from command

Post by tmcdonald »

Can you please post the command and service definition as requested by ssax? We don't know what a third-party web front end (is that just NagiosQL?) does when it builds the configs, so it is best to see the final result directly. Chances are some of the characters in the arguments are causing problems.
Former Nagios employee
rickwilson7425
Posts: 125
Joined: Tue Mar 18, 2014 3:20 pm

Re: Service check results different in GUI vs. from command

Post by rickwilson7425 »

That is the Thruk interface for OMD.

Here is the service definition:

define service {
    service_description  DNS
    host_name            nsal1.dc,nsal2.dc,nsal1.wc,nsal2.wc
    use                  gen-service
    check_command        check_dns!www.genesyslab.com!198.49.180.8
    servicegroups        DNS_ALU
}

Here is the command:

define command {
    command_name  check_service_cluster
    command_line  $USER1$/check_cluster -s -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
}
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Service check results different in GUI vs. from command

Post by tmcdonald »

That doesn't look like the correct service file. That's using check_dns and in your example you were using check_aggregate. Can you post that definition?
rickwilson7425
Posts: 125
Joined: Tue Mar 18, 2014 3:20 pm

Re: Service check results different in GUI vs. from command

Post by rickwilson7425 »

I'm sorry, I'm going blind looking at all this stuff today. The check_agregate problem is one I already fixed; I got the two confused.

**********************************************************************
The problem is with a DNS check, here are the service and command defs:

define service {
    service_description  DNS_ALU
    host_name            CLUSTER.HOLDER
    use                  gen-service
    check_command        check_service_cluster!"DNS Cluster"!3!3!$SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$,$SERVICESTATEID:nsal1.wc:DNS$,$SERVICESTATEID:nsal2.wc:DNS$
}

define command {
    command_name  check_service_cluster
    command_line  $USER1$/check_cluster -s -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
}

*************************************************************

This is the result of running from command line:

./check_cluster -s -l "DNS Cluster" -w 3 -c 3 -d $SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$,$SERVICESTATEID:nsal1.wc:DNS$,$SERVICESTATEID:nsal2.wc:DNS$

CLUSTER OK: DNS Cluster: 4 ok, 0 warning, 0 unknown, 0 critical

*************************************************************

This is what is showing in the Nagios GUI:

CLUSTER.HOLDER DNS_ALU CRITICAL 14:43:48 3d 4h 27m 20s 2/2 CLUSTER CRITICAL: DNS Cluster: 0 ok, 0 warning, 0 unknown, 4 critical

*************************************************************

The command line shows 4 OK, while the GUI shows 4 CRITICAL.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Service check results different in GUI vs. from command

Post by Box293 »

rickwilson7425 wrote:This is the result of running from command line:

./check_cluster -s -l "DNS Cluster" -w 3 -c 3 -d $SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$,$SERVICESTATEID:nsal1.wc:DNS$,$SERVICESTATEID:nsal2.wc:DNS$

CLUSTER OK: DNS Cluster: 4 ok, 0 warning, 0 unknown, 0 critical

I haven't worked with on-demand macros, but as I understand it, you can't usefully run a plugin from the command line with macro references in its arguments, because outside of Nagios they are not expanded to their real values. I suspect each of these evaluates to 0, which is why the check returns OK from the command line:

./check_cluster -s -l "DNS Cluster" -w 3 -c 3 -d 0,0,0,0

CLUSTER OK: DNS Cluster: 4 ok, 0 warning, 0 unknown, 0 critical

Can you confirm that each of the hosts nsal1.dc, nsal2.dc, nsal1.wc and nsal2.wc has a service called DNS? Do all of these services currently have an OK (0) state?
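
A quick way to see what the plugin actually receives is to have the shell print the argument back. This is only a minimal sketch, assuming a bash-like shell with no SERVICESTATEID variable set; show_arg is a hypothetical helper, and only two of the four macros are used to keep it short. Unquoted, $SERVICESTATEID is treated as an empty shell variable and the rest of each macro is left behind as plain text, the same pattern visible in the check_agregate output earlier in the thread (=:relay1.dcpn:mailq$=).

```shell
# Default shell behaviour: unset variables expand to an empty string.
set +u
unset SERVICESTATEID

# Hypothetical helper: print the first argument exactly as a plugin would see it.
show_arg() { printf '%s\n' "$1"; }

# Unquoted: the shell consumes "$SERVICESTATEID" before the plugin ever runs.
expanded=$(show_arg $SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$)

# Single-quoted: the macro text is passed through untouched, but still unexpanded.
literal=$(show_arg '$SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$')

echo "unquoted: $expanded"   # unquoted: :nsal1.dc:DNS$,:nsal2.dc:DNS$
echo "quoted:   $literal"    # quoted:   $SERVICESTATEID:nsal1.dc:DNS$,$SERVICESTATEID:nsal2.dc:DNS$
```

Either way the plugin never sees a real state ID on the command line, since only Nagios itself substitutes on-demand macros when it runs the check.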
rickwilson7425
Posts: 125
Joined: Tue Mar 18, 2014 3:20 pm

Re: Service check results different in GUI vs. from command

Post by rickwilson7425 »

Yes, the services are running fine. Here are the results from the old server (the same as from the command line on the new server):

CLUSTER.HOLDER DNS_ALU OK 2015-08-11 09:39:18 1736d 16h 19m 40s 1/2 CLUSTER OK: DNS Cluster: 4 ok, 0 warning, 0 unknown, 0 critical
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Service check results different in GUI vs. from command

Post by tmcdonald »

Can you run the check from the CLI as the nagios user on the 3.2 server and post the results? There is no way the on-demand macros can be working when run from the CLI manually.
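
For anyone following along, the "4 ok" result can be reproduced without Nagios. The sketch below is a hypothetical stand-in for the plugin's counting step, not check_cluster's actual code: count_states is an invented name, and it assumes each comma-separated value is read as an integer with non-numeric text counting as 0, which is Box293's suspicion above rather than something confirmed in this thread. It also ignores the -w and -c thresholds.

```shell
# Hypothetical stand-in for the counting step (assumption, not the real plugin).
# The body runs in a subshell so the IFS and globbing changes do not leak out.
count_states() (
    ok=0; warning=0; critical=0; unknown=0
    set -f      # no filename globbing while splitting
    IFS=,       # split the -d argument on commas
    for value in $1; do
        # Keep only the leading digits; non-numeric text becomes 0 (assumed).
        digits=${value%%[!0-9]*}
        case "${digits:-0}" in
            0) ok=$((ok + 1)) ;;
            1) warning=$((warning + 1)) ;;
            2) critical=$((critical + 1)) ;;
            *) unknown=$((unknown + 1)) ;;
        esac
    done
    echo "$ok ok, $warning warning, $unknown unknown, $critical critical"
)

# What the shell leaves behind when the macros are typed unquoted at a prompt.
count_states ':nsal1.dc:DNS$,:nsal2.dc:DNS$,:nsal1.wc:DNS$,:nsal2.wc:DNS$'
# prints: 4 ok, 0 warning, 0 unknown, 0 critical

# Real state IDs, as they would look if all four services were critical.
count_states '2,2,2,2'
# prints: 0 ok, 0 warning, 0 unknown, 4 critical
```

If that assumption holds, the command-line result says nothing about the real service states, and the GUI result is the one built from the state IDs Nagios actually substituted.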