Return code for failing check

KurtAlden
Posts: 5
Joined: Thu Oct 31, 2013 9:59 am

Return code for failing check

Post by KurtAlden »

Hello

I am developing a Nagios plugin but I am unsure about something. I've had a look at the documentation but can't find a satisfactory answer. If my plugin goes to check something and the check fails, not because the result is over a certain threshold but because a timeout occurs (i.e., there is a transient network issue), what result should my plugin return? Returning a critical error is not necessarily valid, as the result of that check is not critical and I would not want to fire an event handler based on it. If the host is down, that'll be picked up by a different check. Should I return unknown?

Thanks in advance
K
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Return code for failing check

Post by ssax »

I would use return code 3 (UNKNOWN). Your event handler can then check the state and either ignore it or execute based on it.
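
A rough sketch of what that could look like, in case it helps. This is only an illustration, not a reference implementation: the function names, the upper-bound thresholds, and the status text are all made up for the example. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard plugin return codes, and the handler side assumes you pass $SERVICESTATE$ and $SERVICESTATETYPE$ to your event handler as arguments.

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(value, warn, crit):
    """Map a measured value to a state (hypothetical upper-bound thresholds)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def run_check(measure, warn, crit):
    """Call measure() and return (exit_code, status_line).

    A timeout means no measurement was obtained at all, so it is
    reported as UNKNOWN instead of CRITICAL.
    """
    try:
        value = measure()
    except (socket.timeout, TimeoutError) as exc:
        return UNKNOWN, f"UNKNOWN - check timed out: {exc}"
    state = classify(value, warn, crit)
    return state, f"{LABELS[state]} - value is {value}"


def should_run_handler(service_state, state_type):
    """Event handler side: act only on a confirmed (HARD) CRITICAL.

    service_state and state_type are assumed to be the values of
    $SERVICESTATE$ and $SERVICESTATETYPE$ passed in by Nagios.
    """
    return service_state == "CRITICAL" and state_type == "HARD"
```

A real plugin would print the status line and then call sys.exit() with the returned code. With this split, an UNKNOWN caused by a timeout still shows up in Nagios, but the handler skips it.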
KurtAlden
Posts: 5
Joined: Thu Oct 31, 2013 9:59 am

Re: Return code for failing check

Post by KurtAlden »

Hello

Thanks for your response. That definitely sounds like the right option to me; however, the plugin developer guidelines say the following:

"Higher-level errors (such as name resolution errors, socket timeouts, etc) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states."

It does not actually say what they should be reported as. Surely not 1 or 2, as that would not be correct, so what then? 0?
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: Return code for failing check

Post by jdalrymple »

Obviously you found the guidelines:

https://nagios-plugins.org/doc/guidelines.html

The only way I can interpret that as being reasonably "accurate" would be that you're assumed to be checking all of your higher-level services and to have service dependencies in place to prevent such an occurrence. That said, use your best judgement. These are guidelines, not rules.