Service check timed out after 60.01 seconds
Posted: Mon Sep 06, 2021 5:41 am
by Mahesh786
Hi Team,
We are getting Service check timed out errors frequently on NCPA agent server.
Please check and let us know how we can resolve the issue.
Alerts: (Service check timed out after 60.01 seconds) on Log Keyword for ucprs4apprd05 ucprs4apprd05 is CRITICAL
Regards,
Venkata Reddy
Re: Service check timed out after 60.01 seconds
Posted: Tue Sep 07, 2021 1:53 pm
by benjaminsmith
Hi Venkata,
Greetings! Thanks for contacting the support team at Nagios.
In most cases, this is the result of a firewall or access issue. Are you seeing the timeout on just one service or on multiple services?
Is the timeout intermittent? If so, it is likely caused by network congestion.
Another possibility is that the plugin is taking too long to return data, which can be caused by a number of factors (e.g. a slow or unresponsive server).
Can you post the check command or share the system profile, and let us know the exact name of the service that is timing out?
--Benjamin
### To Download a System Profile
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Re: Service check timed out after 60.01 seconds
Posted: Wed Sep 08, 2021 12:17 am
by Mahesh786
Hi,
Please find the below:
In most cases, this is the result of a firewall or access issue. Are you seeing the timeout on just one service or multiple services? - Yes, we are getting it on multiple services.
Is the timeout intermittent? If so then it's likely caused by network congestion? - Yes, the timeout is intermittent.
Can you post the check command or share the system profile and let us know the exact name of the service that is timing out. - The profile has been attached, and it covers all the services on the ucprnwcsprd02 and ucprs4apprd05 servers.
Regards,
Venkata Reddy
Moderator note: removed attached profile and placed it on local shared drive
Re: Service check timed out after 60.01 seconds
Posted: Wed Sep 08, 2021 11:04 am
by pbroste
Hello @Mahesh786
Thanks for following up. After reviewing the System Profile, we are not seeing anything that stands out as a clear pain point.
There are a couple of things we want to do to pin this down. First, let's increase the service check timeout by editing /usr/local/nagios/etc/nagios.cfg, changing:
- service_check_timeout=60
to
- service_check_timeout=120
Save the file and restart Nagios by running:
- service nagios restart
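The edit above can also be scripted. The following is only a minimal sketch, assuming GNU sed (for `-i`) and an uncommented `service_check_timeout=` line in the file; `set_service_check_timeout` is a hypothetical helper name, not something shipped with Nagios.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): set service_check_timeout in a config file.
# Assumes GNU sed and that the directive already exists uncommented.
set_service_check_timeout() {
    cfg="$1"      # e.g. /usr/local/nagios/etc/nagios.cfg
    seconds="$2"  # e.g. 120
    cp -p "$cfg" "$cfg.bak" || return 1   # keep a backup before editing
    # Rewrite only the active directive; commented-out lines are left alone.
    sed -i "s/^service_check_timeout=.*/service_check_timeout=$seconds/" "$cfg"
    grep '^service_check_timeout=' "$cfg" # print the resulting value
}

# Usage on the Nagios XI server (as root), followed by the restart described above:
#   set_service_check_timeout /usr/local/nagios/etc/nagios.cfg 120
```

The backup copy ends up next to the original with a `.bak` suffix, so the change is easy to roll back.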
Let's also see what the check command returns when it is run manually from the Nagios server:
/usr/local/nagios/libexec/check_ncpa.py -H <ipaddressorhostnameofucprs4apprd05> -t 'UltraTechXi' -P 5693 -M 'memory/virtual/percent' --verbose
Optionally, add a timeout (in seconds) to the command:
/usr/local/nagios/libexec/check_ncpa.py -H <ipaddressorhostnameofucprs4apprd05> -t 'UltraTechXi' -P 5693 -M 'memory/virtual/percent' --verbose --timeout=xxx
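Since the timeouts are intermittent, it may help to record how long each manual run takes. This is a rough sketch only; `time_check` is a hypothetical wrapper (not a Nagios tool) that runs whatever command it is given, discards the output, and reports the exit code and elapsed whole seconds.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, discard its output, print exit code and duration.
time_check() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    echo "exit=$rc elapsed=$((end - start))s"
}

# Usage (placeholders in <> must be filled in), repeated to catch a slow run:
#   for i in 1 2 3 4 5; do
#       time_check /usr/local/nagios/libexec/check_ncpa.py -H <host> -t '<token>' -P 5693 -M 'memory/virtual/percent'
#       sleep 10
#   done
```

Elapsed times that approach the configured timeout would point at a slow agent or network rather than at the Nagios configuration.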
Please let me know the results,
Perry
Re: Service check timed out after 60.01 seconds
Posted: Thu Sep 09, 2021 2:12 am
by Mahesh786
Hi,
In the ncpa.cfg file we are unable to find service_check_timeout=60.
Please find the attached ncpa.cfg file and let us know exactly what changes we need to make.
Regards,
Venkata Reddy
Re: Service check timed out after 60.01 seconds
Posted: Thu Sep 09, 2021 1:04 pm
by pbroste
Hello @Mahesh786
Thanks for following up. You are correct that there is a timeout setting in ncpa.cfg, but let's increase the host and service timeout in nagios.cfg.
To increase the service check timeout, edit
/usr/local/nagios/etc/nagios.cfg, changing:
service_check_timeout=60
to
service_check_timeout=120
Save the file and restart Nagios by running:
service nagios restart
You also have the option to increase the timeout for the NCPA check in
/usr/local/ncpa/etc/ncpa.cfg:
Change the line:
# plugin_timeout = 60
To:
plugin_timeout = 120
Then restart nagios.service
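To confirm which values are actually active after editing, something like the following could be used. It is a sketch under assumptions: `show_setting` is a hypothetical helper, and it treats any line starting with `#` as commented out.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) line that sets a key,
# or a note when the key is only present as a comment or not present at all.
show_setting() {
    cfg="$1"
    key="$2"
    grep -E "^[[:space:]]*$key[[:space:]]*=" "$cfg" \
        || echo "$key: not set (default applies)"
}

# Usage:
#   show_setting /usr/local/nagios/etc/nagios.cfg service_check_timeout   # on the Nagios server
#   show_setting /usr/local/ncpa/etc/ncpa.cfg plugin_timeout              # on the NCPA agent host
```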
Thanks,
Perry
Re: Service check timed out after 60.01 seconds
Posted: Mon Sep 13, 2021 9:18 pm
by Mahesh786
Hi Team,
We have increased plugin_timeout to 120 in the ncpa.cfg file and restarted the Nagios services, but timeout alerts are still being generated.
Please check and suggest whether any further changes are needed.
Regards,
Venkata Reddy
Re: Service check timed out after 60.01 seconds
Posted: Tue Sep 14, 2021 1:16 pm
by pbroste
Hello Venkata,
Thanks for following up. It sounds like two hosts are timing out, so let's capture "tailed" log details that give us an event timeline:
while :; do find /usr/local/nagios/var/ -name "*.*" -not -path "/usr/local/nagios/var/rw/*" | xargs tail -F | grep -Ei "warn|error|fail|unknown|critical|ucprs4apprd05" >> /tmp/loggingit.txt; sleep 1; done
Please let this run until a timeout has been logged (press Ctrl-C to stop), then send '/tmp/loggingit.txt' via Private Message [PM].
Thanks,
Perry
Re: Service check timed out after 60.01 seconds
Posted: Tue Sep 21, 2021 2:59 am
by Mahesh786
Hi Team,
We are unable to execute the command and are getting an error.
Please find the attached template.
Regards,
Venkata Reddy
Re: Service check timed out after 60.01 seconds
Posted: Tue Sep 21, 2021 1:34 pm
by pbroste
Let's go with this one:
tail -Fn0 /var/log/httpd/* /var/log/apache2/* /usr/local/nagios/var/* /usr/local/nagiosxi/tmp/* /usr/local/nagiosxi/var/* /var/log/syslog /var/log/messages /usr/local/nagios/var/spool/* /usr/local/nagiosxi/var/components/* | grep --line-buffered -Ei "warn|error|fail|unknown|critical|ucprs4apprd05" >> /tmp/results.txt
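Once the capture has been stopped, a quick per-keyword count can show whether anything useful was recorded before sending the file. This is an illustrative sketch only; `summarize_capture` is a hypothetical helper, and the "timed out" keyword is an assumption about how the timeouts are worded in the logs.

```shell
#!/bin/sh
# Hypothetical helper: count case-insensitive matches per keyword in a capture file.
summarize_capture() {
    file="$1"
    for kw in warn error fail unknown critical "timed out"; do
        # grep -c prints the number of matching lines (0 when there are none).
        printf '%s: %s\n' "$kw" "$(grep -ci -- "$kw" "$file")"
    done
}

# Usage:
#   summarize_capture /tmp/results.txt
```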
Thanks,
Perry