check_cluster using Service Groups or another solution?
Posted: Tue Oct 16, 2018 3:10 pm
I already have many service groups defined for our environment.
Some of them include the two redundant trunks between redundant devices (GW1 & GW2 for example). With that redundancy, our environment is still up if only one trunk is down and thus should not affect our SLAs.
I would like a way to check the services defined in a service group and mark it WARNING if one is down and mark it CRITICAL if both are down.
I have spent the better part of today trying to see if I can pass check_cluster a service group or other parameters in order to have this OR relationship described for notifications.
Can you create a cluster from a service group? Is there another command that would be better? Should I create my own command to iterate through the members of a service group, pull their status, and alert appropriately?
It seems like with check_cluster, I would have to re-create all these groups, which I obviously would like to avoid. I think the functionality is there since the "Service Status Summary" for a service group already shows what the member service status is. I just need to be able to set another service status based on the status of those service members in relation to each other or notify based on the same.
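To make the logic concrete, here is a minimal sketch of the aggregation I have in mind if I end up writing my own command. It only covers the decision step. How the member states are pulled out of Nagios is left open, and the function name `aggregate_group` and its input dict are hypothetical. It also assumes that any state other than OK counts as "down". The numeric codes are the standard plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def aggregate_group(member_states):
    """Collapse member service states into one (code, message) result.

    member_states is a hypothetical dict of service description -> state
    code; fetching it from Nagios is deliberately not shown here.
    """
    if not member_states:
        return UNKNOWN, "UNKNOWN: no member services found"

    # Assumption: anything other than OK counts as "down".
    down = sorted(name for name, state in member_states.items() if state != OK)
    total = len(member_states)

    if not down:
        return OK, "OK: all %d members are OK" % total

    summary = "%d of %d members not OK (%s)" % (len(down), total, ", ".join(down))
    if len(down) == total:
        # Every redundant member is down, so the redundancy is gone.
        return CRITICAL, "CRITICAL: " + summary
    # At least one member is still up, so the service is degraded only.
    return WARNING, "WARNING: " + summary


if __name__ == "__main__":
    code, message = aggregate_group({"GW1 trunk": OK, "GW2 trunk": CRITICAL})
    print(message)
```

A real plugin would print the message and then exit with the returned code, so Nagios could treat the result like any other service check.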
This solution would be deployed in several Nagios instances, so I would like something that is easy to replicate.
Thanks in advance!