Service Check Timed Out for one Service

amitgupta19
Posts: 286
Joined: Fri Sep 08, 2017 5:53 am

Service Check Timed Out for one Service

Post by amitgupta19 »

I have suddenly started getting the alert "Service Check Timed Out" for one of my services.

This is occurring very frequently.

I would like to know what could be the reason for it.

I have read various threads about this problem, and they suggest increasing the timeout value. Currently the value is set to 420.
How much should I increase it?

And is it even recommended to increase the timeout value?
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Post by mcapra »

Many plugins include their own specific flags and methods for handling timeouts, but they do not take precedence over Nagios Core's internal workings when checks are executed.

"Service check timed out after x seconds" means the plugin's execution time exceeded the Nagios configuration's service_check_timeout setting, so Nagios stopped it. This is a global configuration directive that affects all service checks.

You could certainly increase this value, but depending on your check interval this can lead to overlapping checks and generally bad things. Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively.

Can you share the service definition, its corresponding command definition, and the plugin your command definition is using?
Former Nagios employee
https://www.mcapra.com/
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Service Check Timed Out for one Service

Post by npolovenko »

Agreed with @mcapra. 420 seconds already seems like a lot to me. Please send us all the information he requested so that we can identify what's going on.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
amitgupta19
Posts: 286
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Post by amitgupta19 »

Please find the requested data attached.

I just want to identify what is suddenly causing this error, so that I can correct the problem.
Attachments
QueueHealth.txt
Plugin
(2.7 KiB) Downloaded 540 times
Service Check Timed Out.docx
Service Definition and Command Definition
(39.39 KiB) Downloaded 284 times
amitgupta19
Posts: 286
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Post by amitgupta19 »

Did I miss any information/data?

Can someone please look into it?
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Post by mcapra »

The -t 420 you've added to your service check is technically correct, but as I said, it won't override Nagios Core's service_check_timeout setting. Increasing that setting has implications for each and every one of your checks, so if you do change it, I would suggest you change it with care and diligence.
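To make the precedence concrete, here is a minimal Python sketch, not anything Nagios itself runs. It assumes the check is bounded by two independent limits, the -t 420 on the command and the service_check_timeout of 110 reported later in this thread, and that whichever is shorter ends the check. The function name first_timeout is made up for illustration.

```python
def first_timeout(limits):
    """Return (name, seconds) for the limit that expires first.

    `limits` maps a human-readable limit name to its value in seconds.
    """
    name = min(limits, key=limits.get)
    return name, limits[name]


# Values taken from this thread; the dictionary itself is illustrative.
limits = {
    "check_nrpe -t": 420,
    "service_check_timeout": 110,
}

name, seconds = first_timeout(limits)
print(f"{name} expires first, after {seconds} seconds")
```

With these numbers the plugin never gets anywhere near its own 420 seconds, which matches the "Service check timed out" message coming from Nagios Core rather than from the plugin.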

From the plugin's notes, it admits that the snapin itself takes a while to load:


# On the check_nrpe command include the -t 30, since it takes some time to load the Exchange cmdlet's.
I'm going to again suggest this be set up as a passive check due to its long runtime:
mcapra wrote: Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively.
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Service Check Timed Out for one Service

Post by dwhitfield »

mcapra wrote: this won't override Nagios Core's service_check_timeout setting.
This is in the nagios.cfg. Can you attach that for review?
amitgupta19
Posts: 286
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Post by amitgupta19 »

In nagios.cfg, service_check_timeout is set to 110.

But my question is: what suddenly changed so that the service check started timing out?

I would also like to inform you that I am no longer receiving any service check timeout errors.

So again, what caused the sudden error?

Kindly suggest how I can identify this.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Post by mcapra »

Based on my *very limited* reading on the subject, that particular snapin (Microsoft.Exchange.Management.PowerShell.E2010) is known to be slow. There are several different solutions scattered around Google with things you can do to speed it up. I can't vouch for their success.

Simply put, the check_queue_health plugin has room for improvement. Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively. This would allow you to work around the long run-time without worrying about Nagios Core internals.
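As a rough illustration of the passive approach, the sketch below builds a PROCESS_SERVICE_CHECK_RESULT external command line and writes it to the Nagios command file. The host name, service description, and command file path are assumptions (the path shown is the usual source-install location; check command_file in your nagios.cfg). It also assumes the script runs on the Nagios server itself and that the service accepts passive checks; a result produced on the Windows host would first have to be transported there, for example with NSCA.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code uses the plugin convention:
    0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the Nagios command file (a named pipe).

    The default path is an assumption, not taken from this thread.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical host and service names, for illustration only.
line = format_passive_result("exch01", "Queue Health", 0, "OK - queues normal")
print(line, end="")
# submit(line)  # uncomment on a host where the command file exists
```

A cron job (or a scheduled task on the Windows side) would run the slow plugin on its own schedule and hand the result to something like this, so the long load time of the snapin never counts against service_check_timeout.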
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Service Check Timed Out for one Service

Post by dwhitfield »

amitgupta19 wrote:In the nagios.cfg service_check_timeout is mentioned as 110 .
110 only shows up once on this page, which is really beside the point, but thanks for giving the value. If you want the timeout to be 420, you need to change service_check_timeout in nagios.cfg. Of course, as @mcapra mentioned, that change applies to every service check.

As far as why this is occurring now, and wasn't before, I would suspect more load on the Windows side considering that is known to be a slow snapin, but if you added more checks on the Nagios side, that could be an issue as well.
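One way to look for evidence of either cause is to compare execution time and latency across services. The sketch below is only a starting point: it assumes the servicestatus blocks in Nagios Core's status.dat carry check_execution_time and check_latency fields, and the sample text embedded in it is made up for illustration. A high execution time points at the plugin or the Windows side; a high latency suggests Nagios is running checks later than scheduled.

```python
def parse_service_status(text):
    """Collect key=value pairs from each 'servicestatus { ... }' block."""
    services, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            current = {}
        elif line == "}":
            if current is not None:
                services.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return services


def slowest(services, top=5):
    """Return the services with the longest last execution time."""
    return sorted(
        services,
        key=lambda s: float(s.get("check_execution_time", 0)),
        reverse=True,
    )[:top]


# Made-up sample in the shape of status.dat; real files have many more fields.
SAMPLE = """
hoststatus {
    host_name=exch01
    }
servicestatus {
    host_name=exch01
    service_description=Queue Health
    check_execution_time=108.512
    check_latency=0.310
    }
servicestatus {
    host_name=exch01
    service_description=PING
    check_execution_time=4.012
    check_latency=0.120
    }
"""

for svc in slowest(parse_service_status(SAMPLE)):
    print(svc["service_description"], svc["check_execution_time"], svc["check_latency"])
```

To use it on a live system, read the file named by status_file in nagios.cfg and pass its contents to parse_service_status instead of SAMPLE.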