Built a custom plugin; issues with returned status codes

pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Post by pamplifier »

I built a custom plugin in C with the standard return codes in place (0 for OK, 1 for WARNING, etc.) that prints one sentence to stdout. Executing it normally with the required flags prints the sentence correctly.

Code: Select all

# ./check_thing -w 10 -e 10
# "Status code: WARNING etc etc <insert relevant string here>"
However, when I threw in the required lines in the configuration files to watch the plugin status on the Nagios web-interface, it keeps coming back with "Return code of 139 is out of bounds."

Here is what I have in commands.cfg:

Code: Select all

# custom plugin
define command {
        command_name check_thing
        command_line $USER1$/check_thing -w $ARG1$ -e $ARG2$
        }
and in localhost.cfg:

Code: Select all

define service {
        use                             generic-service
        host_name                       localhost
        service_description             check_thing
        check_command                   check_thing!10!5
        notifications_enabled           1
        }
and here are its properties under ls -la:

Code: Select all

-rwxr-xr-x  1 nagios nagcmd  17968 Jul 24 10:32 check_thing
 
Initially the file permissions for the executable went to root, but I changed it using chown nagios: check_thing

Can anyone help me figure this out?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Built a custom plugin; issues with returned status codes

Post by tmcdonald »

When you are running it from the CLI are you running it as root or as nagios? Environment variables could be different. Code 139 often means a segfault, so if you wanna post the code I might be able to take a look and point out some problem lines.
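
As a rough illustration of why 139 points at a segfault: when a command is launched through a shell and killed by a signal, the shell reports 128 plus the signal number, and SIGSEGV is signal 11 on Linux. The helper below is a minimal sketch of that arithmetic; `signal_from_status` is a made-up name for illustration, not part of Nagios or the plugin.

```c
#include <assert.h>
#include <signal.h>

/* Shells report a child killed by signal N as exit status 128 + N.
 * Returns the signal number, or 0 if the status looks like a normal exit.
 * (Hypothetical helper, for illustration only.) */
int signal_from_status(int status)
{
    assert(status >= 0 && status <= 255);
    return (status > 128) ? status - 128 : 0;
}
```

Under that convention, `signal_from_status(139)` gives 11, which matches `SIGSEGV` on Linux, while an ordinary plugin return code such as 2 gives 0.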
Former Nagios employee
pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Re: Built a custom plugin; issues with returned status codes

Post by pamplifier »

This is essentially its first draft; there's still a lot to do, but I wanted to get a bare-minimum version going to see how it works in the web interface.

Code: Select all

const char *progname = "check_thing";
const char *copyright = "";
const char *email ="";

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

int process_arguments(int, char**);
struct msg parse_msg (char *);
void print_help(void);
void print_usage(void);

static int wcount = 0;
static int ecount = 0;
static long loffset;
static long moffset;

int verbose = 0;
char *logpath = NULL;
static char *foundlog = "foundlog.txt";

int warthresh;  /*If number of warnings surpass this, return CRITICAL*/ 
int errthresh;  /*If number of errors surpass this, return CRITICAL*/

const int ERROR = -1;
const int STATUS_OK, OK = 0;
const int STATUS_WARNING = 1;
const int STATUS_CRITICAL = 2;
const int STATUS_UNKNOWN = 3;

struct msg{
	char time[16];
	char fac[20];
	char prefix[20];
	char body[100];
	int seconds;
};

int main(int argc, char **argv){
	int result = STATUS_UNKNOWN;
	struct msg mess;
	char line[156];
	char *fline;

	FILE* thingfile = fopen(foundlog, "a+");
	
	if (process_arguments(argc, argv) == ERROR){
		printf("Could not parse arguments\n");
		print_usage();
		exit(STATUS_UNKNOWN);
	}
	FILE* logfile = fopen(logpath, "r");
	fseek(logfile, loffset, SEEK_SET);
	fseek(thingfile, moffset, SEEK_SET);

	while (fgets(line, 156, logfile)){
		if (strstr((const char *)line, "sshd[")){
			/*record line in file*/
			fline = strncat(line, "\n", 1);
			fputs(fline, thingfile);
			mess = parse_msg(fline);
			if (!strncmp(mess.body, "Warning:", 8)){
				wcount++;
			} else { 
				ecount++;
			}
		}
	}

	loffset=ftell(logfile);
	moffset=ftell(thingfile);

	fclose(logfile);

	if (wcount > 0 || ecount > 0){
		if (wcount >= warthresh || ecount >= errthresh){
			result = STATUS_CRITICAL;
		}else {
			result = STATUS_WARNING;
		}
	}else {
		result = STATUS_OK;
	}

	/*First line of output; stored in $SERVICEOUTPUT*/
	printf ("status : %s", (result == STATUS_OK) ? ("OK") : ("WARNING"));
	printf (", Report : %d uncleared warnings and %d uncleared errors logged in %s\n", 
		wcount, ecount, logpath);

	/*Verbose output; stored in $LONGSERVICEOUTPUT*/
	if (verbose){
		printf ("Last reports from %s: \n", logpath); /*TODO*/
	}
	fclose(thingfile);
	return result;
}

struct msg parse_msg(char *line){
	struct msg mess;

	sscanf(line, "%15[a-zA-Z0-9: ]s", mess.time);
	mess.seconds = atoi(mess.time+13);
	sscanf((line+15), "%s %s %[^\n]s", mess.fac, mess.prefix, mess.body);

	return mess;
}

int process_arguments(int argc, char **argv){
	int c;	
	int option = 0;
	static struct option longopts[] = {
		{"filename", required_argument, 0, 'F'},
		{"warning count", required_argument, 0, 'w'},
		{"error count", required_argument, 0, 'e'},
		{"log", optional_argument, 0, 'l'},
		{"help", no_argument, 0, 'h'},
		{"verbose", no_argument, 0, 'v'},
		{NULL, 0, 0, 0}
	};

	/*if no arguments were passed*/
	if (argc < 2)
		return ERROR;

	while(1){
		c = getopt_long(argc, argv, "+hvF:w:e:", longopts, &option);

		if (c==-1 || c == EOF || c ==1)
			break;

		switch(c) {
			case 'h':
				print_help();
				exit (STATUS_OK);
			case 'F':
				logpath = optarg;
				break;
			case 'w':
				warthresh = atoi(optarg);
				break;
			case 'e':
				errthresh = atoi(optarg); 
				break;
			case 'v':
				verbose++;
				break;
			default:
				print_usage();
				exit(STATUS_UNKNOWN);
		}	
	}
	/*set mainlogfile to default syslog file if not user-defined*/
	if(logpath==NULL){
		logpath = "/var/log/messages";
	}
	return 0;
}

void print_help(void){
	printf("print relevant help line here \n");
	printf("\n\n");
	print_usage();
}

void print_usage(void){
	printf("\nUsage:\n");
	printf("%s -F <logfile> -w <warn threshold> -e <error threshold> \n", progname);

}
Also, yes, when I execute it to test it, I run it as root.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Built a custom plugin; issues with returned status codes

Post by tmcdonald »

While I look through the code, try running it as the nagios user and see what you get. That's a more accurate test of what happens when it is run for real.
Former Nagios employee
pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Re: Built a custom plugin; issues with returned status codes

Post by pamplifier »

Ah yes, there is a segmentation fault when running as the nagios user. Whoops, I didn't think to test it like this!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Built a custom plugin; issues with returned status codes

Post by tmcdonald »

Just as some general pointers, I would modify the output to read like this:

WARNING: 0 uncleared warnings and 3 uncleared errors logged in /tmp/somelog

It is more in line with what other plugins would output and it's a bit neater. And if you want graphable output, you can do:

WARNING: 0 uncleared warnings and 3 uncleared errors logged in /tmp/somelog|warn=0 err=3

Everything after the | will be performance data:

https://nagios-plugins.org/doc/guidelines.html#AEN200
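
The output format above could be produced with something like the following. This is only a sketch: `status_label` and `format_status_line` are hypothetical helper names, and the `warn`/`err` perfdata labels are the ones suggested in this post. Note that the posted code prints "WARNING" for every non-OK result, so a CRITICAL result would be mislabeled; mapping the return code to its label avoids that.

```c
#include <assert.h>
#include <stdio.h>

/* Map a plugin return code (0..3) to the label other plugins print. */
const char *status_label(int code)
{
    switch (code) {
    case 0:  return "OK";
    case 1:  return "WARNING";
    case 2:  return "CRITICAL";
    default: return "UNKNOWN";
    }
}

/* Build "<LABEL>: <text>|warn=<n> err=<n>" into buf.
 * Returns snprintf's result (the length the full line needs). */
int format_status_line(char *buf, size_t size, int code,
                       int warnings, int errors, const char *logpath)
{
    assert(buf != NULL && logpath != NULL);
    return snprintf(buf, size,
                    "%s: %d uncleared warnings and %d uncleared errors logged in %s"
                    "|warn=%d err=%d",
                    status_label(code), warnings, errors, logpath,
                    warnings, errors);
}
```

The plugin would then print the buffer followed by a newline as its first line of output and return `code`.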

Might also want to massage the -w and -e options a bit. Standard usage is to reserve -w for warning thresholds and -c for critical thresholds. It was a little confusing reading the code because at first I thought you were using the word "error" to refer to a Nagios "critical" state.
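
A minimal sketch of what the conventional -w/-c handling might look like with getopt_long. It assumes a single count compared against both thresholds, which is a simplification of the posted plugin (it tracks warnings and errors separately); `struct thresholds`, `parse_thresholds`, and `status_for_count` are hypothetical names.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

struct thresholds {
    int warning;   /* -w: count at which the check turns WARNING */
    int critical;  /* -c: count at which the check turns CRITICAL */
};

/* Parse -w and -c; returns 0 on success, -1 on an unknown option. */
int parse_thresholds(int argc, char **argv, struct thresholds *t)
{
    static const struct option longopts[] = {
        {"warning",  required_argument, 0, 'w'},
        {"critical", required_argument, 0, 'c'},
        {NULL, 0, 0, 0}
    };
    int c;

    assert(t != NULL);
    optind = 1;  /* restart scanning so the function can be called again */
    while ((c = getopt_long(argc, argv, "w:c:", longopts, NULL)) != -1) {
        switch (c) {
        case 'w': t->warning  = atoi(optarg); break;
        case 'c': t->critical = atoi(optarg); break;
        default:  return -1;
        }
    }
    return 0;
}

/* Pick the plugin return code for a given count. */
int status_for_count(int count, const struct thresholds *t)
{
    if (count >= t->critical) return 2;  /* CRITICAL */
    if (count >= t->warning)  return 1;  /* WARNING */
    return 0;                            /* OK */
}
```

With `-w 5 -c 10`, a count of 7 would come back as WARNING and a count of 10 as CRITICAL.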
Former Nagios employee
pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Re: Built a custom plugin; issues with returned status codes

Post by pamplifier »

Thank you, I'll make those changes. Like I said, this is still a work in progress, so there is a lot of formatting to do, and a lot of error-checking etc. I just wanted to see if I could get a bare-bones plugin running to see how it could appear on the web-interface.
pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Re: Built a custom plugin; issues with returned status codes

Post by pamplifier »

Bear in mind, pattern matching for 'sshd[' is a placeholder for now; I just needed a substring that already existed in multiple messages in the syslog of my test machine.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Built a custom plugin; issues with returned status codes

Post by tmcdonald »

I assumed as much, and was able to generate some test logfiles based off of that.

For the time being, if it's alright with you, I'll let you work on the plugin and we can keep this thread open. Since we have a "todo" list of things to reply to, please post back only if you have a question so we don't have to give a bogus "Cool thanks" reply :)
Former Nagios employee
pamplifier
Posts: 21
Joined: Wed May 13, 2015 7:31 am

Re: Built a custom plugin; issues with returned status codes

Post by pamplifier »

One question: I just realized that I need root privileges to access the logs in /var/log/. I didn't notice because I'm almost always running as root on my test machine. Could the segfault be due to the fact that the nagios user is trying to access the logfiles when it doesn't have permission to?

Right now I'm looking up ways to ameliorate this, including how Nagios does it with its pre-existing plugins in /plugins-root. I'm thinking of adding a new rule to /etc/sudoers; is that the best solution for this?
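
For reference, the posted code passes the result of fopen(logpath, "r") straight to fseek without checking it. If the nagios user cannot read the file, fopen returns NULL, and calling fseek on a NULL stream is undefined behavior that typically crashes, so a permissions problem is a plausible cause. A minimal sketch of a guard, assuming a hypothetical helper named `open_log`:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open path for reading. On failure, print a one-line reason to stdout
 * (where Nagios reads plugin output) and return NULL so the caller can
 * exit with the UNKNOWN code instead of calling fseek(NULL, ...).
 * (Hypothetical helper, for illustration only.) */
FILE *open_log(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        printf("UNKNOWN: cannot open %s: %s\n", path, strerror(errno));
    return fp;
}
```

In main, the caller would check the result and return STATUS_UNKNOWN when it is NULL. The same check would apply to the fopen(foundlog, "a+") call, which uses a relative path and so depends on the working directory the check runs in.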