check_disk not timing out!

Post by **BanditBBS** » Wed May 20, 2015 2:59 pm

abrist wrote: Then stat() is definitely failing. I thought it was odd that the alarm was not triggering on timeout... well, there was no alarm.
So I added one. Can you test this new bin to see if it times out?

AHH, even worse! It isn't timing out and I can't abort the application anymore (I started a new session and killed it).

Post by **BanditBBS** » Wed May 20, 2015 3:02 pm

Weird... I can abort before the timeout value is reached, but if I try after the timeout value has passed I cannot Ctrl-C it.

Post by **abrist** » Wed May 20, 2015 3:04 pm

That is seriously odd.
Can we *pretty please* have a remote session?

Post by **abrist** » Wed May 20, 2015 3:06 pm

Ah, ok, I will do some more digging.

Post by **BanditBBS** » Wed May 20, 2015 3:07 pm

Yes Andy, we can do remote, but no compiling on that box....guess we could compile on the nagios server first and xfer the file.

Post by **abrist** » Wed May 20, 2015 5:00 pm

Let's wait on that remote for now. Try the following program I whipped up; it should check for a stale mount:

Code:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <iso646.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>


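int
main (int argc, char **argv) {
    struct stat st;
    char *mount_point = NULL;   /* set by -m; stays NULL if the option is missing */
    int ret;
    int option = 0;

    while ((option = getopt(argc, argv, "m:")) != -1) {
        switch (option) {
            case 'm':
                mount_point = optarg;
                break;
        }
    }

    /* Without -m there is nothing to stat(), so print usage and bail out. */
    if (mount_point == NULL) {
        fprintf(stderr, "Usage: %s -m <mount point>\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* stat() fails with errno set to ESTALE when the file handle is stale. */
    ret = stat(mount_point, &st);
    if (ret == -1 and errno == ESTALE) {
        printf("Mount: %s is stale\n", mount_point);
        return EXIT_SUCCESS;
    } else {
        /* Any other result, including other stat() errors, is reported as not stale. */
        printf("Mount: %s is not stale\n", mount_point);
        return EXIT_FAILURE;
    }
}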

Save it to stale.c.

Code:

gcc stale.c
./a.out -m <mount point>

Or just try running the bin attached. I want to see if we can detect the stale mount point.

Post by **BanditBBS** » Wed May 20, 2015 8:19 pm

It hangs and I have to abort the process, Andy.

Post by **abrist** » Thu May 21, 2015 9:16 am

Well, dang. I guess we could just fork a child process (to stat the mount) and kill the pid after a timeout (if it is stale). I first thought none of the other nagios-plugins used forks, which would have made this a questionable idea, but a number of them do. I will look into forking stat() and then running a timeout on the child process - allowing us to kill it after the timeout and exit critical. Sound fair?

Anyone out there have any other good ideas for detecting stale mounts using C?
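
For anyone following along, the fork-and-timeout idea could look roughly like the sketch below. It is only an illustration, not code from check_disk: the function name `stat_with_timeout`, the `MOUNT_*` result codes, and the 100 ms polling interval are all made up for this example. One caveat, as far as I understand it: a child blocked in uninterruptible I/O on a hung mount may not die from SIGKILL until the I/O returns, so the parent here sends the signal but does not block waiting for the child to exit.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch (not taken from check_disk). */
enum { MOUNT_OK = 0, MOUNT_ERROR = 1, MOUNT_STALE = 2, MOUNT_TIMEOUT = 3 };

/* stat() `path` in a child process and give up after `timeout_sec` seconds. */
int stat_with_timeout(const char *path, int timeout_sec) {
    assert(path != NULL && timeout_sec > 0);

    pid_t pid = fork();
    if (pid == -1)
        return MOUNT_ERROR;

    if (pid == 0) {
        /* Child: this is the call that may block forever on a hung mount. */
        struct stat st;
        if (stat(path, &st) == 0)
            _exit(MOUNT_OK);
        _exit(errno == ESTALE ? MOUNT_STALE : MOUNT_ERROR);
    }

    /* Parent: poll for the child's exit every 100 ms until the deadline. */
    const struct timespec tick = { 0, 100 * 1000 * 1000 };
    for (int waited_ms = 0; waited_ms < timeout_sec * 1000; waited_ms += 100) {
        int status;
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : MOUNT_ERROR;
        if (done == -1)
            return MOUNT_ERROR;
        nanosleep(&tick, NULL);
    }

    /* Deadline passed: send SIGKILL, but do not block waiting for the child,
     * since it may stay stuck in the kernel until the I/O completes. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, WNOHANG);
    return MOUNT_TIMEOUT;
}
```

A caller could then treat `MOUNT_STALE` and `MOUNT_TIMEOUT` as the cases to exit critical on, e.g. `if (stat_with_timeout("/mnt/nfs", 10) >= MOUNT_STALE) ...` (the path and the 10 second value are placeholders).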

Post by **BanditBBS** » Thu May 21, 2015 9:46 am

abrist wrote: Well, dang. I guess we could just fork a child process (to stat the mount) and kill the pid after a timeout (if it is stale). I first thought none of the other nagios-plugins used forks, which would have made this a questionable idea, but a number of them do. I will look into forking stat() and then running a timeout on the child process - allowing us to kill it after the timeout and exit critical. Sound fair?

Anyone out there have any other good ideas for detecting stale mounts using C?

Definitely sounds fair to me, Andy. Really, that seems about the best method, and the other benefit would be that I can remove my NFS check and just use check_disk to alert on stale mounts! That'll remove 1000+ checks for me.

Post by **abrist** » Thu May 21, 2015 9:59 am

I am worried, though, that we would have to fork() for each mount checked, track the pids, and then report on those that time out or something. This could actually get a bit complicated, and add to the overhead of check_disk. I may need to think about this a bit.
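
To give a feel for the bookkeeping involved, here is a rough sketch of the per-mount version: one child per path, a pid table, and a shared deadline. Everything in it is an assumption for illustration only (the name `find_hung_mounts`, the `MAX_MOUNTS` cap, the 100 ms poll), and it only reports which children failed to finish in time rather than inspecting the stat() result.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_MOUNTS 64   /* arbitrary cap chosen for this sketch */

/* Fork one stat() child per path and wait up to timeout_sec for all of them.
 * On return, hung[i] is 1 for every path whose child did not finish in time.
 * Returns the number of hung paths, or -1 on bad input or fork failure. */
int find_hung_mounts(const char *const paths[], size_t n,
                     int timeout_sec, int hung[]) {
    pid_t pids[MAX_MOUNTS];
    assert(paths != NULL && hung != NULL);
    if (n > MAX_MOUNTS || timeout_sec <= 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == -1) {
            /* Clean up the children that were already started. */
            for (size_t j = 0; j < i; j++) {
                kill(pids[j], SIGKILL);
                waitpid(pids[j], NULL, WNOHANG);
            }
            return -1;
        }
        if (pids[i] == 0) {
            struct stat st;
            stat(paths[i], &st);   /* result ignored: only completion matters */
            _exit(0);
        }
        hung[i] = 1;               /* assume hung until the child is reaped */
    }

    /* Poll all outstanding children every 100 ms until the shared deadline. */
    size_t pending = n;
    const struct timespec tick = { 0, 100 * 1000 * 1000 };
    for (int waited_ms = 0; pending > 0 && waited_ms < timeout_sec * 1000;
         waited_ms += 100) {
        for (size_t i = 0; i < n; i++) {
            if (hung[i] && waitpid(pids[i], NULL, WNOHANG) == pids[i]) {
                hung[i] = 0;
                pending--;
            }
        }
        if (pending > 0)
            nanosleep(&tick, NULL);
    }

    /* Signal whatever is still running, without blocking on it. */
    for (size_t i = 0; i < n; i++) {
        if (hung[i]) {
            kill(pids[i], SIGKILL);
            waitpid(pids[i], NULL, WNOHANG);
        }
    }
    return (int)pending;
}
```

The overhead concern is visible here: every check costs one fork() per mount plus the polling loop, which is why a single shared deadline is used instead of a separate timer per child.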

Nagios Support Forum
