Extend service_check_timeout
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Extend service_check_timeout
Hi,
I am looking at the nagios.cfg and would like to extend my service_check_timeout from 180s to 240s.
Here is the current info:
service_check_timeout=180
host_check_timeout=60
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
What issues could I run into accomplishing this? I am trying to cut down on false positives, but do not want to create an overlap (someone else cautioned me about this).
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Extend service_check_timeout
The biggest problem you are going to run into is that if something doesn't return in a timely manner because a packet was dropped (such as a UDP packet from an SNMP check), your server is going to leave the process open and wait the full 240 seconds before closing it.
Multiply that by many checks, and your server could hit a process limit and start killing off processes (not good).
If you legitimately have checks that take 4 minutes to process, it would be better to think of a different way to get those results to your Nagios server (think passive checks) instead of allowing every check to run for up to 4 minutes when usually they should be completing in the sub-second timeframe.
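As a rough sketch of what a passive setup could look like (the host name, service name, template, threshold, and the check_dummy command below are all placeholders or assumptions, not taken from this thread): the long-running job runs on its own schedule and hands its result to Nagios, so Nagios never has to wait on it.

```
# Hypothetical service that only accepts passively submitted results.
define service {
    use                     generic-service     ; assumed template name
    host_name               backup-server       ; placeholder host
    service_description     Long Running Job    ; placeholder name
    active_checks_enabled   0                   ; Nagios does not schedule this check itself
    passive_checks_enabled  1                   ; accept externally submitted results
    check_freshness         1                   ; notice when results stop arriving
    freshness_threshold     900                 ; seconds; placeholder value
    check_command           check_dummy!3       ; assumed command, run only when results go stale
}

# The job reports in by writing one line to the external command file:
# [<unix_timestamp>] PROCESS_SERVICE_CHECK_RESULT;backup-server;Long Running Job;0;OK - job finished
```

The location of the external command file is set by command_file in nagios.cfg, so it may differ between installs.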
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Re: Extend service_check_timeout
I guess I am confused as to what to do then. What do you suggest? I am fairly new to Nagios and am just learning how to poke around the files with WinSCP and how everything fits together.
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Re: Extend service_check_timeout
I have timed some of the SNMP requests and they take 20-25 seconds (generally when we aren't getting timeouts). I have been told this is incredibly slow. Are there any performance settings I should look at? I just lack the knowledge to know where to start resolving these issues, even after spending hours reading about Nagios.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Extend service_check_timeout
chris1337c wrote: are there any performance settings I should look at?
The delay would be in the equipment you are polling. I would check whether this equipment is experiencing heavy CPU utilization.
If these are SNMP checks, I would strongly suggest not increasing the service timeout. Instead, consider increasing the max check attempts on the services, so that an occasional timeout on some of the checks does not trigger a notification email.
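A minimal sketch of a service definition with more check attempts, assuming a generic-service template and a check_snmp command that takes its options as $ARG1$; the host, service name, OID, and numbers are placeholders:

```
# Service definitions live in the object config files that nagios.cfg
# loads through its cfg_file= / cfg_dir= lines, not in nagios.cfg itself.
define service {
    use                   generic-service                       ; assumed template name
    host_name             storage-01                            ; placeholder host
    service_description   SNMP Uptime                           ; placeholder name
    check_command         check_snmp!-C public -o sysUpTime.0   ; placeholder arguments
    max_check_attempts    5    ; consecutive failed checks before a HARD state and notification
    check_interval        5    ; time units between normal checks (default unit is 60 seconds)
    retry_interval        1    ; time units between rechecks while in a SOFT state
}
```

With values like these, a single timeout only puts the service into a SOFT state, and a notification goes out only if the rechecks keep failing.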
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Re: Extend service_check_timeout
The equipment is absolutely experiencing high CPU loads when our VEEAM backups are running.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Extend service_check_timeout
chris1337c wrote: The equipment is absolutely experiencing high CPU loads when our VEEAM backups are running.
This could be causing the delay in receiving results. I would suggest increasing the max check attempts on the affected services to help curb false positive notifications.
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Re: Extend service_check_timeout
I will give this a try; I hadn't considered doing this. Thank you for the pointer.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Extend service_check_timeout
chris1337c wrote: I will give this a try, I didn't consider doing this. Thank you for the pointer.
No problem.
-
- Posts: 75
- Joined: Wed Dec 26, 2018 2:31 pm
Re: Extend service_check_timeout
When it times out, I believe it retries 3 times. Is this specified in the main nagios.cfg, or is there a service.cfg file somewhere that houses it? I have scrolled through the config a few times and am not seeing this setting. I know what it does, as I have seen it in the GUI.