
check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 11:37 am
by onegative
Before I dig into a deep dive trying to find a solution, I was hoping someone might already have one available.
My environment periodically has network issues, which result in event storms when check_by_ssh timeouts occur. Even though I have the timeout set to 30 seconds, I still get "CRITICAL - Plugin timed out after 30 seconds" notifications after multiple failures. I thought about using a wrapper script to test the results and set a specific status and exit code, but thought perhaps there was a way to get the same behavior from the C source, check_by_ssh.c. Perhaps a way to make a timeout result in a Warning instead.

Has anyone done any work with the C source to change how timeouts are handled? Maybe using the extra options?

Any help would be greatly appreciated...
Danny

Re: check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 12:49 pm
by mcapra
If you were looking for pointers on how to do this yourself, a simple set_timeout_state(STATE_UNKNOWN) in check_by_ssh and a rebuild would probably do the trick.

I think CRITICAL is a good general purpose state for that particular plugin's timeout, but your use case makes sense.

Re: check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 3:30 pm
by scottwilkerson
Not sure when it was added, but in version 2.2.1 of the plugin you can append a timeout state to the timeout value:

Code:

 -t, --timeout=INTEGER:<timeout state>
    Seconds before connection times out (default: 10)
    Optional ":<timeout state>" can be a state integer (0,1,2,3) or a state STRING
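
Going by that help text, something like `-t 30:WARNING` (or `-t 30:1`) should make a timeout come back as a Warning. To illustrate the `INTEGER:<timeout state>` format, here is a minimal sketch of how such a value could be split into seconds and a state. The helpers `state_from_text` and `parse_timeout_spec` are hypothetical names made up for this example; this is not the plugin's actual parsing code.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Standard Nagios plugin exit codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical helper: map "WARNING" or "1" to a state code, -1 if invalid. */
static int state_from_text(const char *s)
{
    static const char *const names[] = { "OK", "WARNING", "CRITICAL", "UNKNOWN" };
    for (int i = 0; i < 4; i++) {
        if (strcasecmp(s, names[i]) == 0)
            return i;
    }
    if (strlen(s) == 1 && s[0] >= '0' && s[0] <= '3')
        return s[0] - '0';
    return -1;
}

/*
 * Hypothetical helper: split "SECONDS[:STATE]" into its two parts.
 * *state is left untouched when there is no ":STATE" suffix, so the
 * caller's default (for example STATE_CRITICAL) stays in effect.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_timeout_spec(const char *spec, int *seconds, int *state)
{
    char *end = NULL;
    long secs = strtol(spec, &end, 10);

    if (end == spec || secs <= 0 || secs > INT_MAX)
        return -1;                      /* no digits, or out of range */
    if (*end == '\0') {
        *seconds = (int)secs;           /* plain "30": keep default state */
        return 0;
    }
    if (*end != ':')
        return -1;                      /* trailing junk after the number */

    int parsed = state_from_text(end + 1);
    if (parsed < 0)
        return -1;                      /* unknown state name or number */

    *seconds = (int)secs;
    *state = parsed;
    return 0;
}
```

With this sketch, `parse_timeout_spec("30:WARNING", &secs, &state)` would set `secs` to 30 and `state` to `STATE_WARNING`, while a plain `"10"` would only set `secs`.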

Re: check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 3:34 pm
by onegative
This helped... I was able to modify utils.c as follows, hard-coding the exit code as exit (1):

Code:

void
timeout_alarm_handler (int signo)
{
        const char msg[] = " - Plugin timed out\n";
        if (signo == SIGALRM) {
/*              printf (_("%s - Plugin timed out after %d seconds\n"),
                                                state_text(timeout_state), timeout_interval); */
                switch(timeout_state) {
                        case STATE_OK:
                                write(STDOUT_FILENO, "OK", 2);
                                break;
                        case STATE_WARNING:
                                write(STDOUT_FILENO, "WARNING", 7);
                                break;
                        case STATE_CRITICAL:
                                write(STDOUT_FILENO, "CRITICAL", 8);
                                break;
                        case STATE_DEPENDENT:
                                write(STDOUT_FILENO, "DEPENDENT", 9);
                                break;
                        default:
                                write(STDOUT_FILENO, "UNKNOWN", 7);
                                break;
                }
                write(STDOUT_FILENO, msg, sizeof(msg) - 1);
                exit (1);       /* hard-coded: 1 is the WARNING exit code, whatever timeout_state says */
        }
}
This resulted in the timeout output still displaying CRITICAL, but the plugin exits with 1, which triggers a Warning response within the Nagios framework... I think this will serve what I need...
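
For anyone who would rather keep the printed label and the exit code in sync, here is a rough, standalone sketch of the same idea: the handler exits with whatever state was configured instead of a hard-coded 1. All the `demo_` names and `run_timeout_demo` are made up for illustration and are not part of utils.c. The sketch also uses `_exit()` rather than `exit()`, on the assumption that only async-signal-safe calls should be made from a signal handler.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Standard Nagios plugin exit codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical stand-in for the plugin's timeout state; set before alarm(). */
static volatile sig_atomic_t demo_timeout_state = STATE_CRITICAL;

/* Print "<STATE> - Plugin timed out" and exit with that same state. */
static void demo_timeout_handler(int signo)
{
    static const struct { const char *text; size_t len; } names[] = {
        { "OK", 2 }, { "WARNING", 7 }, { "CRITICAL", 8 }, { "UNKNOWN", 7 }
    };
    static const char msg[] = " - Plugin timed out\n";
    ssize_t rc;
    int state;

    if (signo != SIGALRM)
        return;

    state = demo_timeout_state;
    if (state < STATE_OK || state > STATE_UNKNOWN)
        state = STATE_UNKNOWN;

    /* Only write() and _exit() are used here: both are async-signal-safe. */
    rc = write(STDOUT_FILENO, names[state].text, names[state].len);
    rc = write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    (void)rc;
    _exit(state);
}

/*
 * Fork a child that hangs until the alarm fires, then return the
 * child's exit code (or -1 on error) so the result can be inspected.
 */
static int run_timeout_demo(unsigned seconds, int state)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        demo_timeout_state = state;
        signal(SIGALRM, demo_timeout_handler);
        alarm(seconds);
        for (;;)
            pause();            /* stand-in for a hung ssh connection */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_timeout_demo(1, STATE_WARNING)` should have the child print a WARNING timeout line after about one second and exit with the WARNING code, so the text and the exit status agree.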

Thanks for your input...
Danny

Re: check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 3:35 pm
by onegative
Wow I did not see the timeout state so I will give that a go...THANKS!!!!

Re: check_by_ssh and timeouts

Posted: Tue Jul 24, 2018 3:43 pm
by onegative
Yeppers works like a CHARM!!!!!!!! Thanks for your help...and that makes the timeout usable for my environment.

Danny

Re: check_by_ssh and timeouts

Posted: Wed Jul 25, 2018 8:27 am
by scottwilkerson
onegative wrote: Yeppers works like a CHARM!!!!!!!! Thanks for your help...and that makes the timeout usable for my environment.

Danny
Excellent!

Locking