Service check timed out
Posted: Thu Jul 22, 2021 4:34 am
by preethu.d
Hi,
I am getting multiple service check timeout errors ("Service check timed out after 60.01 seconds") on a node in Nagios, and they auto-resolve after some time. It is a Windows server, and the issue occurs only on this server. What is the reason for this issue, and how can it be resolved?
Regards,
Preethu
Re: Service check timed out
Posted: Thu Jul 22, 2021 1:21 pm
by benjaminsmith
Hi Preethu,
From your description, it sounds like the problem comes and goes. In this type of situation, the usual cause is a slow or congested network path to the remote host.
You can run a large number of ping checks to verify this, for example:
ping -c 100 192.168.23.111 > /tmp/ping_log_23.111.txt
Let me know what the results of the test are.
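If it helps to compare runs, the summary at the end of the log can be pulled out with a short script. This is only a minimal sketch, assuming the log was produced by the Linux iputils `ping` (the two summary line formats matched below are what that version prints). `summarize_ping` is a hypothetical helper name, and the sample text is made up for illustration, not taken from this thread.

```python
import re
from typing import Dict, Optional

# Illustrative sample only (not real data from this thread).
SAMPLE = """--- 192.168.23.111 ping statistics ---
100 packets transmitted, 98 received, 2% packet loss, time 99150ms
rtt min/avg/max/mdev = 0.412/1.305/48.220/4.871 ms
"""

def summarize_ping(text: str) -> Dict[str, Optional[float]]:
    """Extract packet loss and round-trip times from a ping summary."""
    loss = re.search(r"(\d+(?:\.\d+)?)% packet loss", text)
    rtt = re.search(r"=\s*([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)\s*ms", text)
    # The rtt line is absent when every packet was lost, so allow None.
    values = [float(v) for v in rtt.groups()] if rtt else [None] * 4
    return {
        "loss_percent": float(loss.group(1)) if loss else None,
        "rtt_min_ms": values[0],
        "rtt_avg_ms": values[1],
        "rtt_max_ms": values[2],
        "rtt_mdev_ms": values[3],
    }

print(summarize_ping(SAMPLE))
```

A high maximum or mdev next to a low average would be consistent with intermittent congestion, although a clean ping run does not rule out slowness in the check itself.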
Regards,
Benjamin
Re: Service check timed out
Posted: Fri Jul 23, 2021 1:25 am
by preethu.d
Hi Benjamin,
I have sent you the ping check results. Please check.
Regards,
Preethu
Re: Service check timed out
Posted: Fri Jul 23, 2021 10:18 am
by benjaminsmith
Hi Preethu,
Thanks for sending that over. Those numbers look okay, so I think the check must be failing only on occasion.
I would recommend increasing the max check attempts to help smooth out occasional network congestion. This will help reduce false positives (getting an alert or notification when the service or host is okay).
(Attached screenshot: max-checks.png)
--Benjamin
Re: Service check timed out
Posted: Mon Jul 26, 2021 1:06 am
by preethu.d
Hi Benjamin,
Currently it is set to 5,1,1. If I change it to 5,1,5, how frequently will Nagios check the status?
Regards,
Preethu
Re: Service check timed out
Posted: Mon Jul 26, 2021 12:08 pm
by ssax
5,1,1
check_interval - checked every 5 minutes
retry_interval - if a problem is detected, recheck every 1 minute until max_check_attempts is reached
max_check_attempts - 1 time
5,1,5
check_interval - checked every 5 minutes
retry_interval - if a problem is detected, recheck every 1 minute until max_check_attempts is reached
max_check_attempts - 5 times
Benjamin is having you increase the max_check_attempts to try to get around false positives.
With 5,1,1, if a problem is detected it immediately goes to a hard state and the notification is sent.
With 5,1,5, if a problem is detected it is rechecked every 1 minute until it has been checked 5 times in total (the initial failing check counts as the first attempt). If the service is still in a problem state on the last attempt, it is then set to a hard state, and only then is the notification sent. This helps stop false positives from notifying, because the problems that resolve themselves stay in SOFT states, where notifications are not sent. Notifications are only sent on hard states.
See here:
https://assets.nagios.com/downloads/nag ... types.html
https://assets.nagios.com/downloads/nag ... tions.html
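To make the difference between 5,1,1 and 5,1,5 concrete, the soft/hard logic described above can be sketched as a small simulation. This is a simplified model and not Nagios code: `simulate` is a hypothetical function, times are in minutes, each check is assumed to return instantly, and re-notifications and recovery notifications are ignored.

```python
from typing import Iterable, List, Tuple

def simulate(check_interval: int, retry_interval: int, max_check_attempts: int,
             results: Iterable[bool]) -> List[Tuple[int, str, str, bool]]:
    """Return (minute, status, state_type, problem_notification) per check.

    results: one entry per check, True = OK, False = problem.
    """
    timeline = []
    minute = 0
    attempt = 0           # consecutive problem results while still SOFT
    hard_problem = False  # True once the problem has reached a HARD state
    for ok in results:
        if ok:
            # An OK right after SOFT problems is a soft recovery.
            state_type = "SOFT" if attempt > 0 and not hard_problem else "HARD"
            timeline.append((minute, "OK", state_type, False))
            attempt, hard_problem = 0, False
            step = check_interval
        elif hard_problem:
            # Already HARD: keep checking at the normal interval.
            timeline.append((minute, "PROBLEM", "HARD", False))
            step = check_interval
        else:
            attempt += 1  # the first failing check counts as attempt 1
            if attempt >= max_check_attempts:
                hard_problem = True
                timeline.append((minute, "PROBLEM", "HARD", True))
                step = check_interval
            else:
                timeline.append((minute, "PROBLEM", "SOFT", False))
                step = retry_interval
        minute += step
    return timeline

# A short blip with 5,1,1 versus 5,1,5 (True = OK, False = timeout).
for row in simulate(5, 1, 1, [True, False, True]):
    print("5,1,1", row)
for row in simulate(5, 1, 5, [True, False, False, True]):
    print("5,1,5", row)
```

In this model a blip that clears within a couple of retries notifies under 5,1,1 but stays SOFT under 5,1,5. The trade-off is that a real outage is reported later: with 5,1,5 the fifth attempt lands 4 minutes after the first failing check.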
Re: Service check timed out
Posted: Wed Jul 28, 2021 12:29 am
by preethu.d
Thanks for the help.
Regards,
Preethu
Re: Service check timed out
Posted: Wed Jul 28, 2021 1:11 pm
by benjaminsmith
Hi Preethu,
Let us know if you have further questions or if it's okay to close this topic.
-Benjamin
Re: Service check timed out
Posted: Wed Jul 28, 2021 11:16 pm
by preethu.d
Hi Benjamin,
You can close the topic.
Thank you
Regards,
Preethu
Re: Service check timed out
Posted: Thu Jul 29, 2021 10:20 am
by benjaminsmith
Hi Preethu,
Great. Thanks for the update.