How to use check_cluster?

Posted: Thu Sep 22, 2011 8:22 am
by TSCAdmin
Hello,

I configured the xinetd service on a two-node test cluster and stopped the service on NODE1. When I run the check_cluster command, I always seem to get OK. Here are the test results:

$ ./check_cluster -l "xinet Service" -d "$SERVICESTATEID:NODE1:xinet","$SERVICESTATEID:NODE2:xinet" -c0
CLUSTER OK: xinet Service: 2 ok, 0 warning, 0 unknown, 0 critical

$ ./check_cluster -l "xinet Service" -d "$SERVICESTATEID:NODE1:xinet","$SERVICESTATEID:NODE2:xinet" -c1
CLUSTER OK: xinet Service: 2 ok, 0 warning, 0 unknown, 0 critical

$ ./check_cluster -l "xinet Service" -d "$SERVICESTATEID:NODE1:xinet","$SERVICESTATEID:NODE2:xinet" -c2
CLUSTER OK: xinet Service: 2 ok, 0 warning, 0 unknown, 0 critical

Could you please suggest how to use check_cluster in the following scenarios?

1. OK - if service X is running on node A and node B
2. OK - if service X is running on node A but not on node B or vice versa
3. WARNING - if service X is running on node A but not on node B or vice versa
4. CRITICAL - if service X is not running on either of the hosts
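
As a rough illustration of how warning and critical thresholds could map onto the scenarios above, here is a minimal Python sketch. It is not the plugin's code: `cluster_status` and its `warn`/`crit` parameters are hypothetical names, and it assumes that a plain integer threshold N means "alert when more than N cluster members are non-OK". Scenarios 2 and 3 are alternatives for the same situation, so each needs its own threshold pair.

```python
# Hypothetical helper, not part of check_cluster: maps per-node service
# states to a cluster state using "more than N non-OK" thresholds (assumed).
STATE_OK = 0  # Nagios service state IDs: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN


def cluster_status(states, warn, crit):
    """Return the cluster state name for a list of per-node state IDs."""
    non_ok = sum(1 for s in states if s != STATE_OK)
    if non_ok > crit:  # critical is checked first, so it wins over warning
        return "CRITICAL"
    if non_ok > warn:
        return "WARNING"
    return "OK"


# Scenarios 1, 3 and 4 (one node down is a WARNING): warn=0, crit=1
print(cluster_status([0, 0], warn=0, crit=1))
print(cluster_status([0, 2], warn=0, crit=1))
print(cluster_status([2, 2], warn=0, crit=1))

# Scenarios 1, 2 and 4 (one node down is still OK): warn=1, crit=1
print(cluster_status([2, 0], warn=1, crit=1))
print(cluster_status([2, 2], warn=1, crit=1))
```

Under the same assumption, the first mapping would correspond to passing -w 0 -c 1 to check_cluster and the second to -w 1 -c 1; it is worth confirming this against the plugin version actually installed.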

Thanks in advance

Re: How to use check_cluster?

Posted: Thu Sep 22, 2011 12:59 pm
by mguthrie
I'm not terribly familiar with that plugin, but it looks like you can set warning and critical thresholds on it. Could that be the issue?

[root@localhost libexec]# ./check_cluster --help
check_cluster v1991 (nagios-plugins 1.4.13)
Copyright (c) 2000-2004 Ethan Galstad ([email protected])
Copyright (c) 2000-2007 Nagios Plugin Development Team
        <[email protected]>

Host/Service Cluster Plugin for Nagios 2

Usage: check_cluster (-s | -h) -d val1[,val2,...,valn] [-l label]
[-w threshold] [-c threshold] [-v] [--help]

Options:
 -s, --service
    Check service cluster status
 -h, --host
    Check host cluster status
 -l, --label=STRING
    Optional prepended text output (i.e. "Host cluster")
 -w, --warning=THRESHOLD
    Specifies the range of hosts or services in cluster that must be in a
    non-OK state in order to return a WARNING status level
 -c, --critical=THRESHOLD
    Specifies the range of hosts or services in cluster that must be in a
    non-OK state in order to return a CRITICAL status level
 -d, --data=LIST
    The status codes of the hosts or services in the cluster, separated by
    commas
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)
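
One possible explanation for the constant "2 ok" result seen earlier in the thread: when the command is run by hand, the shell expands `$SERVICESTATEID` as an (empty) shell variable, so the plugin receives text such as `:NODE1:xinet` rather than a numeric status code. The Python sketch below tallies a `-d` style list under the assumption, not confirmed against the plugin source, that each value is converted the way C's `atoi` would convert it, so non-numeric text counts as 0 (OK). `leading_int` and `tally` are hypothetical helpers written only for this illustration.

```python
import re


def leading_int(text):
    """Mimic C's atoi: parse a leading integer, or return 0 if there is none."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def tally(data):
    """Count status codes in a comma-separated, -d style list (assumed parsing)."""
    names = {0: "ok", 1: "warning", 2: "critical", 3: "unknown"}
    counts = {"ok": 0, "warning": 0, "unknown": 0, "critical": 0}
    for item in data.split(","):
        code = leading_int(item)
        if code in names:  # codes outside 0-3 are ignored in this sketch
            counts[names[code]] += 1
    return "%d ok, %d warning, %d unknown, %d critical" % (
        counts["ok"], counts["warning"], counts["unknown"], counts["critical"])


# What the shell would pass once the empty $SERVICESTATEID is expanded
print(tally(":NODE1:xinet,:NODE2:xinet"))

# Literal state IDs: NODE1 OK (0), NODE2 CRITICAL (2)
print(tally("0,2"))
```

If that assumption holds, testing by hand with literal codes such as `-d 0,2` would exercise the thresholds. Inside a Nagios command definition, on-demand macros are written with a closing `$`, as in `$SERVICESTATEID:NODE1:xinet$`, and Nagios substitutes the numeric state before the plugin runs.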

Re: How to use check_cluster?

Posted: Fri Sep 23, 2011 1:45 am
by TSCAdmin
I tried setting different threshold values for Critical, but the plugin doesn't seem to honour them. Unfortunately, there aren't many good examples for check_cluster on the Internet or on the Nagios users mailing list. Is there any way we can get support for this?

While we are at it, could someone please share how you go about monitoring high-availability cluster services?

Thanks

Re: How to use check_cluster?

Posted: Fri Sep 23, 2011 12:26 pm
by mguthrie
I would actually try the Nagios users mailing list on SourceForge for suggestions on high-availability monitoring options. You'll probably get better suggestions from people in the field than we could offer.

Here's what I turned up on the Nagios Exchange:
http://exchange.nagios.org/index.php?op ... rd=cluster
http://exchange.nagios.org/directory/Pl ... ailability
http://exchange.nagios.org/directory/Ad ... ailability