
nrpe - log_on_failure options

Posted: Tue Aug 14, 2018 9:23 am
by thanstra
Can someone point me to information on how log_on_failure works and what options there might be for it?

Every piece of documentation which I see uses this line:

log_on_failure += USERID

But I don't know what that means, especially the USERID part.

The underlying reason I'm looking into this is that NRPE sometimes loses its connection with my Nagios server, and it seems to be related in some way to authentication on the server itself. I don't follow how authentication errors can affect NRPE connections except, perhaps, through this log_on_failure piece of the puzzle.

Anyone have ideas?

Re: nrpe - log_on_failure options

Posted: Wed Aug 15, 2018 2:48 pm
by cdienger
https://linux.die.net/man/5/xinetd:
...
log_on_failure

determines what information is logged when a server cannot be started (either because of a lack of resources or because of access control restrictions). The service id is always included in the log entry along with the reason for failure. Any combination of the following values may be specified:

...

USERID

logs the user id of the remote user using the RFC 1413 identification protocol. This option is available only for multi-threaded stream services.
...
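
To put that line in context, here is a rough sketch of what an xinetd service file for NRPE typically looks like. Treat it as an illustration rather than your actual config: the file location (commonly /etc/xinetd.d/nrpe), the nrpe.cfg path, and the only_from address are assumptions, and yours may differ. The "+=" operator adds USERID to whatever log_on_failure values are already set in the defaults section instead of replacing them. The xinetd.conf man page also lists HOST and ATTEMPT as possible values.

```
# Sketch of an xinetd service entry for NRPE (paths and addresses are assumed)
service nrpe
{
    flags           = REUSE
    socket_type     = stream
    port            = 5666
    wait            = no
    user            = nagios
    group           = nagios
    server          = /usr/local/nagios/bin/nrpe
    # Assumed config path; adjust to your install
    server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
    # Add USERID (RFC 1413 ident lookup of the remote user) to the default set
    log_on_failure  += USERID
    disable         = no
    # Placeholder; list the address of your Nagios server here
    only_from       = 127.0.0.1
}
```

Note that log_on_failure only controls what xinetd writes to its log when it cannot start the service. It does not decide whether a connection is allowed; that comes from access control settings such as only_from.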

It would seem that the server cannot get started. Is there anything of interest in /var/log/messages during times of failure?

Re: nrpe - log_on_failure options

Posted: Fri Aug 17, 2018 2:27 pm
by thanstra
All I get on failure is the usual "cannot complete handshake" errors. Nothing more to help me understand what is going on.

Re: nrpe - log_on_failure options

Posted: Fri Aug 17, 2018 4:17 pm
by cdienger
What versions of the agent and the check plugin are you running? On the XI side:

/usr/local/nagios/libexec/check_nrpe -V

On the agent:

/usr/local/nagios/bin/nrpe -V

Updating the agent and/or the check plugin may help if you suspect either of them is involved. Otherwise, the system logs in the client's /var/log may have other useful details. Is the system experiencing other issues (high load, slowness, errors, etc.) when this occurs?