NRPE checks randomly error out

Posted: Fri Jun 07, 2013 12:13 pm
by theace18
Hey Everyone,

I'm experiencing some weirdness on a particular host. About 3-4 times a day, all of its NRPE checks suddenly go critical.

I've gone in and restarted the NRPE daemon, but that doesn't seem to fix it. I don't see any weirdness regarding NRPE in /var/log/messages. Any thoughts?

Re: NRPE checks randomly error out

Posted: Fri Jun 07, 2013 12:59 pm
by slansing
What is the error code that is returned when they go critical?

Re: NRPE checks randomly error out

Posted: Fri Jun 07, 2013 3:08 pm
by theace18
It'll say:

CHECK_NRPE: Socket timeout after 10 seconds.

What's funny is that the server isn't being taxed, and it never loses ping.

Re: NRPE checks randomly error out

Posted: Fri Jun 07, 2013 3:45 pm
by slansing
So this indicates there could be a network-related issue, either with the port forwarding and/or with the actual connection. Does your host go down at this time as well? Either that, or something is keeping the check from returning its data within 10 seconds. To test this, try adding:

-t 30

to the service definition after the $HOSTADDRESS$ or IP. This will increase the timeout to 30 seconds. Then verify the config and see if you have this issue again.
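
For illustration, here is roughly where the flag would sit if your check goes through a check_nrpe command definition. This is only a sketch: the command name, the $USER1$ plugin path macro, and the use of $ARG1$ for the remote command are assumptions based on a typical Nagios setup, not details from this thread, so adjust them to match your own config.

```
# Hypothetical command definition; names and paths are assumptions
define command {
    command_name    check_nrpe
    # -t 30 raises the check_nrpe socket timeout from the 10 seconds seen in the error to 30
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
```

If the checks stop timing out with the longer value, that would point to slow responses rather than a dropped connection.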

Re: NRPE checks randomly error out

Posted: Mon Jun 10, 2013 10:25 am
by theace18
Well, it turns out that the hard disk on my hypervisor is going bad. The hypervisor would randomly eat up disk I/O, which caused only the NRPE checks to error out, even though the VM itself showed no load. Weird.

Anyhow. Thanks for all the help!

Re: NRPE checks randomly error out

Posted: Mon Jun 10, 2013 10:53 am
by slansing
No problem, at least you were able to figure out what it was! Closing.