(Service Check Timed Out)
I have an on-prem instance of Nagios XI in Northern California, and I am monitoring a Debian instance in AWS East. I am getting intermittent errors on service checks. Initially they were "CHECK_NRPE: Socket timeout after 30 seconds" errors, occurring every few hours. When I changed the command from "-t 30" to "-t 60", the error message changed to "(Service Check Timed Out)". I think this is a latency issue. What can I do to prevent false positives?
Re: (Service Check Timed Out)
If NRPE is running as a standalone daemon on the client machine, make sure you have the Nagios XI server's IP address added to the nrpe.cfg:
Code:
allowed_hosts=127.0.0.1,<nagios server ip>
If NRPE is running under xinetd, make sure you add the Nagios XI server's IP to the "/etc/xinetd.d/nrpe":
Code:
only_from = 127.0.0.1 <Nagios server ip>
Restart the daemon/xinetd so that changes can take effect.
What is the output of the following command, run on the Nagios XI server?
Code:
/usr/local/nagios/libexec/check_nrpe -H <client ip>
Be sure to check out our Knowledgebase for helpful articles and solutions!
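As a rough way to double-check the allowed_hosts line described above, the following sketch tests whether a given server IP matches one of its entries. It is only an illustration: `is_allowed` is a hypothetical helper, 192.0.2.10 is a placeholder address, and the sketch assumes entries are literal IPs or CIDR ranges (hostname entries are skipped rather than resolved).

```python
import ipaddress


def is_allowed(allowed_hosts_line, server_ip):
    """Return True if server_ip matches an entry in an allowed_hosts line.

    Only literal IP addresses and CIDR ranges are compared; hostname
    entries are skipped, since resolving them is outside this sketch.
    """
    # Take everything after the first "=" as the comma-separated list.
    _, _, value = allowed_hosts_line.partition("=")
    ip = ipaddress.ip_address(server_ip)
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            # A bare address such as 127.0.0.1 becomes a single-host network.
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # hostname or malformed entry: not handled here
        if ip in network:
            return True
    return False


# Placeholder values, not taken from the thread.
print(is_allowed("allowed_hosts=127.0.0.1,192.0.2.10", "192.0.2.10"))
```

A check like this only confirms the configuration text; running check_nrpe from the Nagios XI server, as asked above, remains the real test.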
Re: (Service Check Timed Out)
allowed_hosts is fine, as the check works 90% of the time; I don't need to adjust that. My question is more about increasing the timeout to account for potential network latency.
Re: (Service Check Timed Out)
Changing the timeout value on the server side alone is not going to be enough; you will also need to change it on the client side. Read section IV in this document:
http://assets.nagios.com/downloads/nagi ... utions.pdf
Re: (Service Check Timed Out)
On the server side, is it sufficient to change the check_nrpe -t value on the command, or is there another setting that needs to be updated as well?
I will take a look at that attachment. Thanks.
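One possible explanation for the message changing when -t went from 30 to 60 is that more than one timeout applies on the server side, and whichever is shortest decides which error appears. The sketch below is only an illustration of that reasoning: it assumes Nagios Core's `service_check_timeout` setting in nagios.cfg is at 60 seconds (not confirmed anywhere in this thread, so check your own value), and `first_timeout` is a hypothetical helper name.

```python
def first_timeout(check_nrpe_t, service_check_timeout=60):
    """Return the error expected when a remote check hangs.

    check_nrpe_t          -- value passed to check_nrpe with -t (seconds)
    service_check_timeout -- assumed nagios.cfg setting (seconds); the 60
                             used here is an assumption, verify it locally
    """
    if check_nrpe_t < service_check_timeout:
        # check_nrpe gives up first and reports its own socket timeout.
        return "CHECK_NRPE: Socket timeout after %d seconds" % check_nrpe_t
    # Nagios stops the plugin before check_nrpe can report anything.
    # With equal values this matches what the thread observed, although
    # a tie could in principle go either way.
    return "(Service Check Timed Out)"


print(first_timeout(30))
print(first_timeout(60))
```

Under these assumptions, keeping -t a few seconds below `service_check_timeout`, and raising both together, would preserve the more specific check_nrpe message.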
Re: (Service Check Timed Out)
There is a max command and connection timeout in the remote host's nrpe.cfg that may need to be altered:
Code:
command_timeout=<timeout>
connection_timeout=<timeout>
Connection timeout should be a little larger than the command timeout. Additionally, do not forget to restart the daemon after making these changes:
Code:
service xinetd restart
Or:
Code:
service nrpe restart
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
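The advice in the last reply, that connection_timeout should be a little larger than command_timeout, can be checked mechanically. The sketch below parses nrpe.cfg-style text and flags the case where that ordering does not hold. The function names and the sample values (90 and 100) are hypothetical; real values depend on your latency and plugins.

```python
def read_timeouts(cfg_text):
    """Extract integer command_timeout / connection_timeout settings."""
    values = {}
    for line in cfg_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Placeholders such as <timeout> are ignored rather than parsed.
        if key in ("command_timeout", "connection_timeout") and value.isdigit():
            values[key] = int(value)
    return values


def timeout_warnings(cfg_text):
    """Return a list of warnings about the two nrpe.cfg timeouts."""
    timeouts = read_timeouts(cfg_text)
    warnings = []
    for key in ("command_timeout", "connection_timeout"):
        if key not in timeouts:
            warnings.append(key + " is not set to a number")
    cmd = timeouts.get("command_timeout")
    conn = timeouts.get("connection_timeout")
    if cmd is not None and conn is not None and conn <= cmd:
        warnings.append("connection_timeout should be larger than command_timeout")
    return warnings


# Hypothetical sample values, not taken from the thread.
sample = "command_timeout=90\nconnection_timeout=100\n"
print(read_timeouts(sample))
print(timeout_warnings(sample))
```

Pointing the same functions at the contents of the real nrpe.cfg (for example via `open(path).read()`) would work the same way; the daemon still needs a restart after any change.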