Hi,
I have 3 servers in a cluster. The same service is monitored on all 3 servers, but it runs on only 1 server at a time.
I'm looking for a plugin that can monitor those services and alert only if the service is not running at all.
The minimum and maximum is 1 service on 1 server (it doesn't matter which server the service is running on).
I looked at the check_cluster plugin, but it does not eliminate the alerts from the services that are down.
I would appreciate your help.
Thanks,
Amit
Monitor service cluster
Re: Monitor service cluster
Are you actually looking to disable alerts for these services in a cluster, or notifications? Alerts are what you see in the web UI under Reports > Alerts and in nagios.log. Notifications are the actual email or SMS messages that get sent to people. I don't think there's any way to stop alerts, but notifications should be configurable.
Re: Monitor service cluster
Hi,
I'm using "Dependencies" for clusters with 2 services.
Clusters with 3 or more services can't be handled with "Dependencies".
The check_cluster plugin actually looks at services that are already being checked in Nagios.
I'm looking for a way to check my 3-service cluster in 1 single check.
Is there a way to achieve this?
Thanks
Re: Monitor service cluster
You have two options here:
1. Write your own plugin that runs all 3 checks in a single check and outputs the correct exit code / output that you want.
If you decide to go this route you can read here for more information:
https://assets.nagios.com/downloads/nag ... inapi.html
And here:
https://nagios-plugins.org/doc/guidelines.html
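As a rough illustration of option 1, here is a minimal sketch of such a plugin in Python. It is not an official plugin. It assumes each node can be checked through NRPE; the check_nrpe path and the NRPE command name check_myservice are placeholders you would replace. Returning WARNING when the service runs on more than one node is also my assumption, based on the "minimum and maximum is 1" requirement.

```python
#!/usr/bin/env python3
# Hypothetical single-check cluster plugin; a sketch, not an official Nagios plugin.
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumptions: location of check_nrpe and the NRPE command defined on each node.
CHECK_NRPE = "/usr/local/nagios/libexec/check_nrpe"
NRPE_COMMAND = "check_myservice"


def service_is_running(host):
    """Return True if the per-node check exits 0 (OK).

    Any other exit code, a timeout, or a missing binary counts as "not running".
    """
    try:
        result = subprocess.run(
            [CHECK_NRPE, "-H", host, "-c", NRPE_COMMAND],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def evaluate(running_by_host):
    """Map {host: is_running} to a (exit code, status line) pair."""
    if not running_by_host:
        return UNKNOWN, "UNKNOWN - no hosts given"
    up = sorted(host for host, running in running_by_host.items() if running)
    if len(up) == 0:
        return CRITICAL, "CRITICAL - service is not running on any node"
    if len(up) == 1:
        return OK, "OK - service is running on " + up[0]
    # Assumption: more than one active node is worth a warning.
    return WARNING, "WARNING - service is running on " + ", ".join(up)


def main(hosts):
    code, message = evaluate({host: service_is_running(host) for host in hosts})
    print(message)
    return code


# Run as a plugin only when hosts are passed, e.g.: check_one_of.py node1 node2 node3
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

With something like this you would define one service that calls the script with the three hostnames, and you would not need the individual per-node service checks in Nagios at all.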
OR
You can use the check_cluster plugin. Let me show you how:
1. Make sure that you are monitoring the service (PING in this example) on all servers. These service checks are what the check_cluster plugin uses, so they need to exist. You can disable notifications for them; this is important so you don't get notifications when they are down.
2. Create a new command:
- Command Name: check_service_cluster
- Command Line: $USER1$/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d '$ARG4$'
- Command Type: check command
3. Create the service cluster check:
- Description: PING_Cluster
- Check command: check_service_cluster
- $ARG1$: PING_Cluster
- $ARG2$: 4 <- Set this to one MORE than your total number of services (3 services + 1 = 4), so a WARNING is never raised. We don't care about warnings in this example.
- $ARG3$: 2 <- Set this to one LESS than your total number of services (3 services - 1 = 2), so the check goes CRITICAL only when all 3 services are non-OK.
- $ARG4$: $SERVICESTATEID:yourhost1:PING$,$SERVICESTATEID:yourhost2:PING$,$SERVICESTATEID:yourhost3:PING$
NOTE: The hostname and the service description in $ARG4$ need to be exact (case sensitive).
The way this works is that whenever the service is not running on any of the nodes (that is, it is down on all of them), the cluster check generates a CRITICAL. check_cluster uses the states of the individual service checks to determine whether there is an issue. Since you disabled notifications on the individual services, you won't get notifications from them; the cluster service is the one that does the notifying.
Please read here for more information:
https://assets.nagios.com/downloads/nag ... sters.html
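To make the threshold arithmetic concrete, here is a small Python model of how I understand check_cluster to evaluate plain numeric thresholds: it counts the non-OK states and alerts when that count is greater than the threshold. The function name cluster_state is made up for this illustration; it is not part of the plugin.

```python
# Simplified model of check_cluster with plain numeric thresholds (illustration only).
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_state(state_ids, warn, crit):
    """state_ids are $SERVICESTATEID$ values: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN."""
    non_ok = sum(1 for state in state_ids if state != 0)
    # Assumption: a plain threshold N alerts when the non-OK count exceeds N.
    if non_ok > crit:
        return CRITICAL
    if non_ok > warn:
        return WARNING
    return OK


# Three nodes, thresholds from the example above (-w 4 -c 2).
for states in ([0, 2, 2], [2, 0, 2], [2, 2, 2]):
    print(states, "->", cluster_state(states, warn=4, crit=2))
```

With the service running on exactly one node, two of the three checks are non-OK, which does not exceed 2, so the cluster check stays OK. Only when all three are non-OK does it go CRITICAL.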