Set Nagios Service to Pending State or Change color code

vlakshman

Post by vlakshman »

Team,

I have a requirement that, based on some condition, my service plugin must set the return state of the service check to "Pending".
I don't see any color code or option available to do this.

System Architecture:
I have a service running on two servers acting as master and slave, and switching between these hosts/services happens in real time.
Currently I have defined Host and Service objects for both servers.

My custom plugin:
1) Sets the Active Host to UP; the Inactive Host is also set to UP, but with the message "Inactive Host"
2) Sets the Active Service to OK and the Inactive Service to WARN

The downside of this implementation is that WARN normally means a service needs attention, which can confuse the client, since it is the expected state for a slave service.

Is there a way to display only the active Host and Service, or to change the color code of the inactive Host/Service on the fly?
ssax

Re: Set Nagios Service to Pending State or Change color code

Post by ssax »

You cannot set the state to PENDING; that state is only shown when no check results have been received yet. The colors are also based on the plugin exit code, and there is no way to alter that other than changing your plugin's exit code.
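Since the color follows the exit code, one option is for the plugin to exit 0 (OK) for a healthy standby node and flag it as inactive only in the status text. Below is a minimal Python sketch of that idea. It assumes a hypothetical plugin that already knows the node's role and health; the names evaluate, main, role and healthy are illustrative and not taken from the original plugin.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(role, healthy):
    """Map a node's role and health to (exit_code, status_line).

    `role` and `healthy` are hypothetical inputs; a real plugin would
    work them out by querying the monitored service.
    """
    if role not in ("active", "standby"):
        return UNKNOWN, "UNKNOWN - unrecognised role: %s" % role
    if not healthy:
        return CRITICAL, "CRITICAL - %s node is not responding" % role
    if role == "standby":
        # Expected state for the slave: stay OK, mention it in the text only.
        return OK, "OK - standby node (inactive, as expected)"
    return OK, "OK - active node is serving"


def main(argv):
    """Print the status line and return the exit code for Nagios."""
    role = argv[1] if len(argv) > 1 else ""
    # Assumption: health is hard-coded here; replace with a real probe.
    code, line = evaluate(role, healthy=True)
    print(line)
    return code

# In a real plugin, finish with: sys.exit(main(sys.argv))
```

With this approach the standby service shows green like the active one, and the role is only visible in the status information column.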

You could use the check_cluster plugin, but the individual services are still going to show warning/critical for their status/color.

If you'd like to try that out, here are the instructions:

1. Make sure that you are monitoring the services (PING in this example) on all servers. These service checks are what the check_cluster plugin will use, so they need to exist. You can disable notifications for them; this is important so that you don't get notifications from the individual checks when they are down.

2. Create a new command:
- Command Name: check_service_cluster
- Command Line: $USER1$/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d '$ARG4$'
- Command Type: check command

3. Create the service cluster check:
- Description: PING_Cluster
- Check command: check_service_cluster
- $ARG1$: PING_Cluster
- $ARG2$: 4 <- Set this to one MORE than your total number of services (3 services + 1 = 4) - We don't care about warnings in this example
- $ARG3$: 2 <- Set this to one LESS than your total number of services (3 services - 1 = 2)
- $ARG4$: $SERVICESTATEID:yourhost1:PING$,$SERVICESTATEID:yourhost2:PING$,$SERVICESTATEID:yourhost3:PING$

NOTE: The hostname and the service description in $ARG4$ need to be exact (case sensitive).

The way this works is that whenever the service is not running on any of the nodes (that is, it is down on all of them), the cluster check generates a CRITICAL. check_cluster uses the statuses of the individual service checks to determine whether there is an issue, and since you disabled notifications on the individual services, the cluster check is the service that does the notifying.
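To see why those thresholds behave this way, here is a simplified Python model of the decision. It is a sketch, not the plugin's actual code: it assumes that a bare number N given as a threshold means "alert when the count of non-OK members is greater than N", which matches the usual Nagios plugin range convention and the behaviour described above. The function name cluster_status is made up for illustration.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_status(state_ids, warn, crit):
    """Simplified model of a service cluster check.

    state_ids: values of the $SERVICESTATEID:...$ macros (0 means OK).
    warn, crit: plain integer thresholds; assumed to alert only when
    the number of non-OK members is greater than the threshold.
    """
    non_ok = sum(1 for state in state_ids if state != 0)
    if non_ok > crit:
        return CRITICAL
    if non_ok > warn:
        return WARNING
    return OK


# Three PING services with the thresholds from the steps above (-w 4, -c 2).
print(cluster_status([0, 0, 0], warn=4, crit=2))  # 0: everything is OK
print(cluster_status([2, 2, 0], warn=4, crit=2))  # 0: one node still OK
print(cluster_status([2, 2, 2], warn=4, crit=2))  # 2: down on every node
```

Because the warning threshold (4) is higher than the number of services (3), the model never returns WARNING, which is the "we don't care about warnings" setup from step 3.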

Please read here for more information:

https://assets.nagios.com/downloads/nag ... sters.html

That way you could leave those contacts off the individual services entirely and add them only to the check_cluster service, so the customers would not see the individual services. That gets around the limitation, but it only works if those users are not admins.