I would like to monitor when a Windows cluster fails over. Would I be able to use the check_cluster plugin?
If so, I have one cluster with two nodes behind it. How would I go about doing it? The instructions are not clear to me.
Windows Cluster
Re: Windows Cluster
1. Make sure that you are monitoring the service (PING in this example) on every server in the cluster. You can disable notifications for these checks, which matters so that you don't get notified each time an individual node is down. These service checks are what the check_cluster plugin uses, so they need to exist.
2. Create a new command:
- Command Name: check_service_cluster
- Command Line: $USER1$/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d '$ARG4$'
- Command Type: check command
3. Create the service cluster check:
- Description: PING_Cluster
- Check command: check_service_cluster
- $ARG1$: PING_Cluster
- $ARG2$: 3 <- Set this to one MORE than your total number of services (2 services + 1 = 3) - We don't care about warnings in this example
- $ARG3$: 1 <- Set this to one LESS than your total number of services (2 services - 1 = 1)
- $ARG4$: $SERVICESTATEID:yourhost1:PING$,$SERVICESTATEID:yourhost2:PING$
NOTE: The hostname and the service description in $ARG4$ need to be exact (case sensitive).
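Since the host names and service description in $ARG4$ have to match exactly, it can help to generate the string rather than type it. The following is only a sketch: build_arg4 is a hypothetical helper name, and yourhost1, yourhost2 and PING are the placeholders from the steps above, not real objects.

```shell
#!/bin/sh
# Hypothetical helper: build the comma-separated macro list for $ARG4$.
# Usage: build_arg4 SERVICE_DESCRIPTION HOST [HOST...]
build_arg4() {
    service="$1"
    shift
    list=""
    for host in "$@"; do
        # Single quotes keep the shell from expanding the Nagios macro itself.
        macro='$SERVICESTATEID:'"${host}:${service}"'$'
        if [ -z "$list" ]; then
            list="$macro"
        else
            list="$list,$macro"
        fi
    done
    printf '%s\n' "$list"
}

# Placeholder names from the example above; replace with your exact
# (case sensitive) host names and service description.
build_arg4 PING yourhost1 yourhost2
```

The printed string can be pasted into the $ARG4$ field as-is.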
The way this works is that whenever the service is not OK on ANY of the nodes, the cluster check generates a CRITICAL. check_cluster uses the statuses of the individual service checks to determine whether there is an issue. Since you disabled notifications on the individual services, you won't get notifications from them; the cluster service is the one that does the notifying.
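To make the mechanism concrete: at check time each $SERVICESTATEID:...$ macro is replaced by a number (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), so the plugin receives a list such as 0,2. The sketch below only counts the non-OK entries in such a list; count_non_ok is a hypothetical helper and does not reproduce the plugin's threshold handling. The plugin path is an assumption, and how -w and -c are interpreted can differ between plugin versions, so check_cluster --help on your own system is the reference.

```shell
#!/bin/sh
# Hypothetical helper: count the non-OK entries in a comma-separated
# list of service state IDs (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
count_non_ok() {
    count=0
    old_ifs=$IFS
    IFS=','
    for state in $1; do
        if [ "$state" != "0" ]; then
            count=$((count + 1))
        fi
    done
    IFS=$old_ifs
    echo "$count"
}

count_non_ok "0,2"   # one node OK, one node CRITICAL: prints 1

# Optional manual run with the same literal states, using the flags from
# the command definition above. The install path is an assumption.
PLUGIN=/usr/local/nagios/libexec/check_cluster
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" --service -l PING_Cluster -w 3 -c 1 -d 0,2
fi
```

Running the plugin by hand with literal state IDs like this is a quick way to see which status your thresholds produce before relying on the service check.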
Please read here for more information:
https://assets.nagios.com/downloads/nag ... sters.html
Here are some other Microsoft Cluster plugins that I found as well:
https://exchange.nagios.org/index.php?o ... %20cluster
Re: Windows Cluster
I get an error:
Error: Service has no hosts and/or service_description (config file '/usr/local/nagios/etc/services/Sites Scope Cluster Check.cfg', starting on line 16)
Here is what I put:
$ARG1$ CLUSTER_CHECK
$ARG2$ 2
$ARG3$ 1
$ARG4$ 418:uhsismoncprb1.umhs.med.umich.edu:SiteScope$,419:uhsismoncprb1.umhs.med.umich.edu:SiteScope$
I also included a picture.
Re: Windows Cluster
That looks proper.
Please go to Configure > Core Config Manager > Tools > Config File Management:
- Click the Delete Files button (don't worry, it's safe, they will be rewritten)
- Then click the Write Configs button
- Then click the Verify Files button. If it verifies properly, try Apply Configuration
That alone may fix it.
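If you have shell access, a similar verification can also be run by hand, which prints the full error text. This is a sketch under assumptions: both default paths below are the usual source-install locations and may differ on your system, and verify_nagios_config is a hypothetical wrapper name. The -v option is the standard Nagios Core configuration check.

```shell
#!/bin/sh
# Hypothetical wrapper around the Nagios Core configuration check.
# Both default paths are assumptions; pass your own as arguments if needed.
verify_nagios_config() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    nagios_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found at $nagios_bin" >&2
        return 127
    fi
    # -v verifies the configuration without starting the daemon
    "$nagios_bin" -v "$nagios_cfg"
}

verify_nagios_config || echo "verification did not pass (or nagios is not installed here)"
```

An error such as "Service has no hosts and/or service_description" would show up in this output together with the file name and line number.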
If that doesn't fix it, follow these exact steps to capture the files in the bad state:
Go to Configure > Core Config Manager > Tools > Config File Management:
- Click the Delete Files button (don't worry, it's safe, they will be rewritten)
- Then click the Write Configs button
- Once they have been written (don't do anything in between, and don't apply the config), run the zip command given at the end of this post and PM (private message) me the resulting /tmp/NAGIOSBADFILES.zip file.
Please PM me a copy of your profile as well. You can download it from Admin > System Profile by clicking the Download Profile button.
Code:
zip -r /tmp/NAGIOSBADFILES.zip /usr/local/nagios/etc
Re: Windows Cluster
The error means the cluster service is not assigned to any host. You still need to attach the service to a host, whether you create a new host that points at the VIP or use an existing host.
Go to Configure > Core Config Manager > Services:
- Edit the cluster service you created
- Click Manage Hosts and assign it to a host
- Save and apply config
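After applying, one way to confirm the assignment took effect is to look at the written config file. The sketch below is a rough file-level check, not a parser: service_file_has_host is a hypothetical helper that only tests whether the file contains a host_name or hostgroup_name directive and a service_description directive somewhere. The path is the one from the error message earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: rough check that a config file contains a host
# assignment (host_name or hostgroup_name) and a service_description.
# It looks at the whole file, not at each "define service" block separately.
service_file_has_host() {
    cfg="$1"
    if grep -Eq '^[[:space:]]*(host_name|hostgroup_name)[[:space:]]+[^[:space:]]' "$cfg" &&
       grep -Eq '^[[:space:]]*service_description[[:space:]]+[^[:space:]]' "$cfg"; then
        echo "ok"
    else
        echo "missing host or service_description"
        return 1
    fi
}

# Path taken from the error message above; note the spaces in the file name.
CFG='/usr/local/nagios/etc/services/Sites Scope Cluster Check.cfg'
if [ -f "$CFG" ]; then
    service_file_has_host "$CFG" || echo "assign the service to a host in Core Config Manager"
fi
```

The check is read-only; the file itself should still be changed through Core Config Manager, since the written files get regenerated.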