Anyone using Check_Cluster?

jtata
Posts: 47
Joined: Thu Sep 02, 2010 12:27 pm

Anyone using Check_Cluster?

Post by jtata »

I've been using the check_cluster plugin to monitor aggregate site availability for a while now. My monitoring is set up like this:

Host - mywebsite.com
> ServiceA - check_http > mywebsite
> ServiceB - check_http > mywebsite (via NRPE on a machine in a different location)
> ServiceC - check_cluster on ServiceA and ServiceB

This has been working flawlessly for a couple of months. If the website is down from either location I get a WARNING from check_cluster; if it is down from both, I get a CRITICAL. Over the weekend, however, I got some really odd results. The availability reports for that 24-hour period show ~99% uptime for A and B, but only around 50% availability for the C check, across all hosts. It looks like we experienced some downtime, but the cluster check never registered that the two component checks had recovered.
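For reference, the WARNING/CRITICAL behaviour described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the plugin's actual source; the function name `cluster_state` and the parameters `warn_above` and `crit_above` are made up for the example.

```python
# Illustrative sketch only: mirrors the behaviour described in the post,
# not the check_cluster plugin's real implementation.
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_state(component_states, warn_above=0, crit_above=1):
    """Aggregate Nagios state IDs (0 = OK) into a single cluster state.

    warn_above / crit_above are hypothetical names: the cluster alerts
    when the number of non-OK components exceeds them.
    """
    non_ok = sum(1 for state in component_states if state != OK)
    if non_ok > crit_above:
        return CRITICAL
    if non_ok > warn_above:
        return WARNING
    return OK


print(cluster_state([0, 0]))  # both locations report OK
print(cluster_state([0, 2]))  # one location reports a failure
print(cluster_state([2, 2]))  # both locations report a failure
```

The point of the sketch is that the aggregate is only as fresh as the component states it is given: if it is evaluated against stale states, it will keep reporting a problem after the components have recovered.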

Any advice on what could be going on here?


Required Info:
NagiosXI 2009R1.3G on VM app.
rdedon
Posts: 578
Joined: Sat Nov 20, 2010 4:51 pm

Re: Anyone using Check_Cluster?

Post by rdedon »

Hello,
Had anything changed prior to the odd results over the weekend?

Thank you.
Rene deDon
Technical Team
___
Nagios Enterprises, LLC
Web: http://www.nagios.com
jtata

Re: Anyone using Check_Cluster?

Post by jtata »

Nothing I'm aware of. Just to be sure, I went through the availability reports for our last downtime and didn't see any discrepancy between the site checks and the cluster check.
rdedon

Re: Anyone using Check_Cluster?

Post by rdedon »

Hmm, could you try restarting and see if it is monitoring normally after?
jtata

Re: Anyone using Check_Cluster?

Post by jtata »

My cluster checks were already reporting the correct state; they just took an entire day to recover after the individual services did. I did restart, but I'm not sure how to test whether it's working without causing something to go down.
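One way to exercise the cluster logic without taking anything down might be to run the plugin by hand with literal state IDs in place of the on-demand macros. The sketch below only builds the command lines to try. The plugin path is an assumption, the helper name `build_test_command` is hypothetical, and the flags are taken from the plugin's usage text as I understand it, so check them against `--help` on your install first.

```python
import shlex

# Assumed install location; adjust to wherever check_cluster actually lives.
PLUGIN = "/usr/local/nagios/libexec/check_cluster"


def build_test_command(state_ids, label="test cluster", warn="0", crit="1"):
    """Build a check_cluster invocation using fixed state IDs (0 = OK, 2 = CRITICAL).

    Hypothetical helper: the literal IDs stand in for the $SERVICESTATEID$
    on-demand macros that Nagios would normally substitute.
    """
    data = ",".join(str(state) for state in state_ids)
    return [PLUGIN, "--service", "-l", label, "-w", warn, "-c", crit, "-d", data]


# Print one command per scenario: all OK, one down, both down.
for states in ([0, 0], [0, 2], [2, 2]):
    print(shlex.join(build_test_command(states)))
```

Running the printed commands from a shell on the Nagios server and comparing the plugin's output for each scenario would show whether the thresholds behave as expected, independent of the scheduler.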
rdedon

Re: Anyone using Check_Cluster?

Post by rdedon »

Could you give this a read and see if it is applicable:
http://community.nagios.org/2009/06/18/ ... -ok-state/?