How to monitor SAN file systems after Head Node failure

Posted: Tue Mar 13, 2012 9:55 am
by petronagios
Hi, please could you help?

We monitor about 80 file systems that are shared out from a NAS. The NAS is an EMC Celerra with two Head Nodes. Both Head Nodes are up on the network at the same time, but only one is the Primary, and its hostname is the one used in our Nagios service checks. Recently a software problem caused the Primary to fail over to the second Head Node. Because the file systems are not defined as services on the second Head Node, Nagios reported them all as critical, even though they were still available.

Can a service be defined to belong to two hosts? Or can the services stay in an OK state as long as at least one of the hosts is available?

The Head Nodes have different IP addresses, and there isn't an IP that fails over between them.

All help much appreciated!

Many thanks
Steve

Re: How to monitor SAN file systems after Head Node failure

Posted: Tue Mar 13, 2012 11:01 am
by mguthrie
Nagios BPI might be a handy tool for a situation like this. You can create a business process group, set rules to determine its state, and then run checks against that group as a whole.
http://exchange.nagios.org/directory/Ad ... 29/details
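
If BPI is more than you need, the "OK if at least one host is OK" idea can also be roughed out as a small wrapper plugin. This is only a sketch and not part of BPI: the script name check_any_host.py, the @HOST@ placeholder, and the 30 second timeout are all made up for illustration. It runs the same plugin command once per Head Node and returns the most favourable of the standard plugin states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Sketch: report the best state a check reaches on any of several hosts."""
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def best_state(states):
    """Return the most favourable state; UNKNOWN if there are no valid states."""
    valid = [s for s in states if s in STATE_NAMES]
    return min(valid) if valid else UNKNOWN


def run_check(command):
    """Run one plugin command (argument list); return (state, first output line)."""
    try:
        # The 30 second timeout is an arbitrary choice for this sketch.
        result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, str(exc)
    # Any exit code outside 0-3 (including death by signal) counts as UNKNOWN.
    code = result.returncode if result.returncode in STATE_NAMES else UNKNOWN
    lines = result.stdout.strip().splitlines()
    return code, lines[0] if lines else ""


def main(argv):
    # Hypothetical usage: check_any_host.py HOST1,HOST2 PLUGIN [ARGS...]
    # The token @HOST@ inside PLUGIN/ARGS is replaced by each host in turn.
    if len(argv) < 3:
        print("UNKNOWN - usage: check_any_host.py HOST1,HOST2 PLUGIN [ARGS...]")
        return UNKNOWN
    results = []
    for host in argv[1].split(","):
        command = [arg.replace("@HOST@", host) for arg in argv[2:]]
        results.append((host, *run_check(command)))
    state = best_state([code for _, code, _ in results])
    detail = "; ".join(
        f"{host}: {STATE_NAMES[code]} {text}".strip() for host, code, text in results
    )
    print(f"{STATE_NAMES[state]} - {detail}")
    return state

# To use this as a plugin, finish the file with: sys.exit(main(sys.argv))
```

You would then attach the file system services to a single host object and point their check command at the wrapper, passing both Head Node names. Taking the best state means a node that is down only shows up in the output text, so you would still want a separate host check on each Head Node to be told about the failover itself.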

Re: How to monitor SAN file systems after Head Node failure

Posted: Wed Mar 14, 2012 10:07 am
by petronagios
Thanks for the quick reply, I'll have a look at BPI.

Cheers
Steve.

Re: How to monitor SAN file systems after Head Node failure

Posted: Tue Feb 05, 2013 4:07 pm
by slansing
Closing and marking as resolved.