I already have many service groups defined for our environment.
Some of them include the two redundant trunks between redundant devices (GW1 & GW2 for example). With that redundancy, our environment is still up if only one trunk is down and thus should not affect our SLAs.
I would like a way to check the services defined in a service group and mark it WARNING if one is down and mark it CRITICAL if both are down.
I have spent the better part of today trying to see if I can pass check_cluster a service group or other parameters in order to have this OR relationship described for notifications.
Can you create a cluster from a service group? Is there another command that would be better? Should I create my own command to iterate through the members of a service group, pull their status, and alert appropriately?
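For reference, the aggregation logic such a custom command would need is small. Below is a minimal sketch in Python, assuming the member states have already been collected as standard Nagios return codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). How the members of a service group get pulled is not shown, and `redundant_group_state` is a hypothetical name used only for illustration.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def redundant_group_state(member_states):
    """Collapse the states of redundant members into a single state.

    Assumption: any member that is not OK counts as "down".
    - no members supplied  -> UNKNOWN
    - every member OK      -> OK
    - some members down    -> WARNING (redundancy is still holding)
    - all members down     -> CRITICAL
    """
    if not member_states:
        return UNKNOWN
    down = sum(1 for state in member_states if state != OK)
    if down == 0:
        return OK
    if down < len(member_states):
        return WARNING
    return CRITICAL


# Two redundant trunks, one of them down: redundancy still holds.
print(redundant_group_state([OK, CRITICAL]))
```

With two trunks this gives WARNING when one is down and CRITICAL when both are down, which is the relationship described above.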
It seems like with check_cluster, I would have to re-create all these groups, which I obviously would like to avoid. I think the functionality is there since the "Service Status Summary" for a service group already shows what the member service status is. I just need to be able to set another service status based on the status of those service members in relation to each other or notify based on the same.
This solution would be deployed in several Nagios instances, so I would like something that is easy to replicate.
Thanks In Advance!
check_cluster using Service Groups or another solution?
Re: check_cluster using Service Groups or another solution?
An update in case anyone is ever looking for something similar: it turns out BPI (Business Process Intelligence) was what I was looking for. The name never really leapt out at me as something that could alert or notify. It sounded more like a static reporting tool for upper management.
With your Service Groups created, you can go to the BPI dashboard, select Service Groups and then sync the service groups.
Once that is done, you can configure your service groups' warning and critical thresholds based on the percentage of the members of the service group that are up or down.
Then you can create a service that monitors the BPI status and notifies accordingly.
This doc set me on the right path:
https://assets.nagios.com/downloads/nag ... BPI_v2.pdf
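To illustrate how percentage thresholds map onto the two-trunk case, here is a rough sketch in Python. It is not BPI's actual implementation, and the function and parameter names are made up for illustration. It assumes the thresholds are expressed as the percentage of members in a problem state.

```python
# Standard Nagios return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def group_state_by_percent(member_states, warn_pct, crit_pct):
    """Return a group state from the percentage of members in a problem state.

    Assumption: a member is a "problem" if its state is anything but OK,
    and thresholds are inclusive (>=). BPI's real rules may differ.
    """
    if not member_states:
        return UNKNOWN
    problems = sum(1 for state in member_states if state != OK)
    problem_pct = 100.0 * problems / len(member_states)
    if problem_pct >= crit_pct:
        return CRITICAL
    if problem_pct >= warn_pct:
        return WARNING
    return OK


# Two redundant trunks: warn at 50% down, go critical at 100% down.
print(group_state_by_percent([OK, CRITICAL], warn_pct=50, crit_pct=100))
```

For a group with two members, a warning threshold of 50 and a critical threshold of 100 reproduces "WARNING if one is down, CRITICAL if both are down".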
Re: check_cluster using Service Groups or another solution?
It seems like you found an answer to your question. You are correct - the BPI will do the job for you. Let us know if you have any further questions.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: check_cluster using Service Groups or another solution?
Also, are there any updates on the internal feature request (TASK ID 5131) from the related forum post?
This would also be very helpful to me.
Re: check_cluster using Service Groups or another solution?
Unfortunately, task 5131 has been cancelled by our developers as there was not enough interest in this feature.
Re: check_cluster using Service Groups or another solution?
I was afraid of that.
Thanks for the response.
Re: check_cluster using Service Groups or another solution?
I am closing this topic. If you have any further questions, please start a new thread. Thank you!