check_ntp_time Modification Question

Post by PhoneSuite »

Hello All!

I need to modify the check_ntp_time check (plugin version 2.2.0) so that it does not alert on a socket timeout and returns status OK instead.

The servers we run this check against often time out, and we get a lot of alerts that we don't need or want.

I am aware of the following:
* This is an odd request
* You can mute the alert emails for any check you wish in Nagios
* Maybe I should figure out why the NTP check is timing out and fix the root cause
* There is a "-q" flag you can add to the check
* This check is written in C and you have to compile it before using it

All I need is for someone with more C knowledge than I have to see if they can modify the source code and then just tell/show me where the modifications were made so that I can do the same on my end.

I already tried modifying this (line 566):

Code:

        offset = offset_request(server_address, &offset_result);
        if (offset_result == STATE_UNKNOWN) {
                result = (quiet == 1 ? STATE_UNKNOWN : STATE_CRITICAL);

To this:

Code:

        offset = offset_request(server_address, &offset_result);
        if (offset_result == STATE_OK) {
                result = (quiet == 1 ? STATE_OK : STATE_OK);

But this did not work.

Post by mcapra »

Does quiet mode not produce the desired effect?

It's hard for me to navigate the plugin without knowing specifically what outputs you need changed. Could you share your full check_command definition you're currently leveraging (sanitize as needed) as well as some of the CRITICAL check outputs you would like to instead be recognized as "OK" or maybe "UNKNOWN"?
Former Nagios employee
https://www.mcapra.com/

Post by tmcdonald »

What is the exact status that the service gives when you hit the timeout? Timeout handling is not as straightforward as you might think and, if my assumptions about the message you are seeing are correct, this one is handled by Core. When you see "Service check timed out after X seconds" as the status, the plugin self-terminated after 10 seconds (the default) and Core detected this, producing that message and the CRITICAL status. The way around this would be to patch Core to return OK instead, but that would affect all timeouts, not just one service.

Note that this is different from the service_check_timeout_state option in nagios.cfg - that is used for cases where a plugin does not have its own internal timeout, so things don't run forever.
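To illustrate what "self-terminated" means here, the following is a minimal sketch, assuming the plugin arms alarm() and exits from a SIGALRM handler. The helper run_stalled_check and the variable timeout_exit_state are hypothetical names used only for this example; they are not taken from the plugin source. The child process stands in for a check that never receives a reply, and the parent plays the role of whatever launched the plugin and reads its exit status.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exit code the child reports when its alarm fires (hypothetical name).
 * File scope so the signal handler can read it. */
static volatile sig_atomic_t timeout_exit_state = 2;

static void on_alarm(int sig)
{
    (void)sig;
    /* _exit() is async-signal-safe; the process ends right here, so no
     * code after the blocked call in the main flow ever runs. */
    _exit(timeout_exit_state);
}

/* Hypothetical helper: run a stand-in "plugin" that never gets an answer
 * and return the exit status the parent observes, or -1 on error. */
int run_stalled_check(unsigned int timeout_seconds, int exit_state)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        timeout_exit_state = exit_state;
        signal(SIGALRM, on_alarm);
        alarm(timeout_seconds);
        for (;;)
            pause();    /* stands in for a network read that never returns */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_stalled_check(1, 2) in this sketch yields 2 (CRITICAL) after about one second. Under the same assumption, this would also explain why changing the branch that follows the request call has no effect on timeouts: the process is already gone before that branch is reached.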
Former Nagios employee

Post by jfrickson »

Or, you can specify the timeout and the status to return like this: -t 10:OK. In place of OK you can use any of OK, 0, WARNING, 1, CRITICAL, 2, UNKNOWN, or 3.
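As a rough illustration of how a SECONDS:STATE argument such as 10:OK could be split apart, here is a minimal sketch. The functions state_from_text and parse_timeout_spec are hypothetical and are not the plugin's actual parser; accepting state names case-insensitively is also an assumption of this example. The numeric values follow the standard Nagios plugin exit codes listed above.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <strings.h>

/* Standard Nagios plugin exit codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical helper: map a state name or a single digit 0-3 to its
 * exit code. Returns -1 when the text is not recognized. */
int state_from_text(const char *text)
{
    static const char *names[] = { "OK", "WARNING", "CRITICAL", "UNKNOWN" };

    assert(text != NULL);
    for (int i = 0; i < 4; i++) {
        if (strcasecmp(text, names[i]) == 0)
            return i;
    }
    if (text[0] >= '0' && text[0] <= '3' && text[1] == '\0')
        return text[0] - '0';
    return -1;
}

/* Hypothetical helper: parse "SECONDS" or "SECONDS:STATE".
 * Returns 0 on success and -1 on malformed input. When no state is
 * given, *state is left untouched so the caller's default applies. */
int parse_timeout_spec(const char *spec, int *seconds, int *state)
{
    char *end = NULL;
    long value;

    assert(spec != NULL && seconds != NULL && state != NULL);
    value = strtol(spec, &end, 10);
    if (end == spec || value <= 0 || value > INT_MAX)
        return -1;

    if (*end == ':') {
        int parsed = state_from_text(end + 1);
        if (parsed < 0)
            return -1;
        *state = parsed;
    } else if (*end != '\0') {
        return -1;
    }

    *seconds = (int)value;
    return 0;
}
```

With this sketch, "10:OK" and "10:0" both produce a 10-second timeout with state 0, while a plain "10" keeps whatever default state the caller set beforehand.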

Post by tmcdonald »

Or just do what @jfrickson said :) Totally forgot about that flag status option.

Post by PhoneSuite »

jfrickson, this suggestion seems to be what I need; however, it does not appear to be working. Here are some sample inputs and outputs. If you could let me know what I am doing wrong, it would be very helpful.

Code:

# ./check_ntp_time -t 10:OK -H us.pool.ntp.org -w 1 -c 3
CRITICAL - Socket timeout after 10 seconds

Code:

# ./check_ntp_time -t 10:0 -H us.pool.ntp.org -w 1 -c 3
CRITICAL - Socket timeout after 10 seconds

Code:

# ./check_ntp_time -V
check_ntp_time v2051 (nagios-plugins 1.4.13)

So it seems that no matter what I do with the -t option, I still get a returned state of CRITICAL, not OK as I want. Trying the -q option does nothing either: I get the exact same output, "CRITICAL - Socket timeout after 10 seconds", as I get without it.

Post by jfrickson »

Version 1.4.13? That's over eight years old! That format for the -t timeout parameter did not exist way back then.

Go to plugins/netutils.c and change this:

Code:

/* handles socket timeouts */
void
socket_timeout_alarm_handler (int sig)
{
	if (sig == SIGALRM)
		printf (_("CRITICAL - Socket timeout after %d seconds\n"), socket_timeout);
	else
		printf (_("CRITICAL - Abnormal timeout after %d seconds\n"), socket_timeout);

	exit (STATE_CRITICAL);
}

to this:

Code:

/* handles socket timeouts */
void
socket_timeout_alarm_handler (int sig)
{
	if (sig == SIGALRM)
		printf (_("OK - Socket timeout after %d seconds\n"), socket_timeout);
	else
		printf (_("OK - Abnormal timeout after %d seconds\n"), socket_timeout);

	exit (STATE_OK);
}
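Hard-coding OK in the handler changes the result for every timeout that goes through it. As an alternative, here is a minimal sketch of a handler whose exit state comes from a variable, so that a command-line option could set it per check. The names socket_timeout_state, state_label, and format_timeout_message are assumptions made for this example and are not claimed to match any released nagios-plugins source; the gettext wrapper _() is left out to keep the sketch self-contained.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Standard Nagios plugin exit codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Timeout in seconds, plus the state to report when it expires.
 * socket_timeout_state is a hypothetical addition for this sketch. */
unsigned int socket_timeout = 10;
int socket_timeout_state = STATE_CRITICAL;

/* Hypothetical helper: text label for a state code. */
const char *state_label(int state)
{
    switch (state) {
    case STATE_OK:       return "OK";
    case STATE_WARNING:  return "WARNING";
    case STATE_CRITICAL: return "CRITICAL";
    default:             return "UNKNOWN";
    }
}

/* Build the status line in buf. Kept separate from the handler so the
 * text can be checked without terminating the process. */
int format_timeout_message(char *buf, size_t size, int sig)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s - %s timeout after %u seconds",
                    state_label(socket_timeout_state),
                    sig == SIGALRM ? "Socket" : "Abnormal",
                    socket_timeout);
}

/* Like the handler shown above, this prints and exits from inside the
 * signal handler, which is not strictly async-signal-safe. */
void socket_timeout_alarm_handler(int sig)
{
    char line[128];

    format_timeout_message(line, sizeof line, sig);
    puts(line);
    exit(socket_timeout_state);
}
```

Setting socket_timeout_state to STATE_OK before installing the handler with signal(SIGALRM, socket_timeout_alarm_handler) gives the same effect as the patch above for that one check, while other checks keep the CRITICAL default.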

Post by PhoneSuite »

jfrickson, I am so sorry. I downloaded the latest check from Nagios but apparently did not copy the correct file to the correct location. My version is now "check_ntp_time v2.2.0 (nagios-plugins 2.2.0)", and when I try the -t flag I now get:

Code:

# ./check_ntp_time -t 15:0 -H us.pool.ntp.org -w 1 -c 3
OK - Socket timeout

So your suggestion to use the -t flag was spot on and VERY helpful. Thank you very much. :D

Post by dwhitfield »

It sounds like this issue has been resolved. Is it okay if we lock this thread? Thanks for choosing the Nagios forums!