Hello
I am developing a Nagios plugin, but I am unsure about something. I've had a look at the documentation but can't find a satisfactory answer. If my plugin runs a check and the check fails not because the result is over a certain threshold but because a timeout occurs (i.e., there is a transient network issue), what result should my plugin return? Returning a critical error is not necessarily valid, as the result of that check is not critical and I would not want to fire an event handler based on it. If the host is down, that will be picked up by a different check. Should I return unknown?
Thanks in advance
K
Return code for failing check
Re: Return code for failing check
I would use return code 3 (UNKNOWN). Your event handler can then check the state and either ignore it or act on it.
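As a rough illustration of that suggestion, here is a minimal Python sketch. The exit codes 0 to 3 are the standard Nagios plugin return codes; everything else is an assumption made for the example: `run_check`, `handler_should_act`, the `probe` callable and the threshold parameters are hypothetical names, and the handler policy (act only on a HARD CRITICAL) is just one possible choice. The state strings are the values Nagios supplies through the `$SERVICESTATE$` and `$SERVICESTATETYPE$` macros.

```python
import socket

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(probe, timeout, warn=1.0, crit=5.0):
    """Run a probe and map its outcome to (exit_code, status_line).

    `probe` is a hypothetical callable that takes a timeout in seconds
    and returns a measured value; it stands in for the real check.
    """
    try:
        value = probe(timeout)
    except (socket.timeout, TimeoutError):
        # The check could not be completed, so no threshold was evaluated.
        return UNKNOWN, f"UNKNOWN - check timed out after {timeout}s"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value {value} >= {crit}"
    if value >= warn:
        return WARNING, f"WARNING - value {value} >= {warn}"
    return OK, f"OK - value {value}"


def handler_should_act(service_state, state_type):
    """Decide whether an event handler should do anything.

    Assumed policy for this sketch: act only on a HARD CRITICAL state,
    so UNKNOWN results from timeouts are ignored.
    """
    return service_state == "CRITICAL" and state_type == "HARD"
```

In a real plugin you would print the status line and pass the code to `sys.exit()`. The event handler would receive the state values as command-line arguments from its command definition.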
Re: Return code for failing check
Hello
Thanks for your response. That definitely sounds like the right option to me; however, the plugin developer guidelines say the following:
"Higher-level errors (such as name resolution errors, socket timeouts, etc) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states."
It does not actually say what they should be reported as. Surely not 1 or 2, as that would not be correct. So what then, 0?
Re: Return code for failing check
Obviously you found the guidelines:
https://nagios-plugins.org/doc/guidelines.html
The only way I can read that as reasonably "accurate" is if it assumes you are already checking all of your higher-level services and have service dependencies in place to prevent such an occurrence. That said, use your best judgement. These are guidelines, not rules.