Service Check Timed Out for one Service


Postby amitgupta19 » Thu Jan 18, 2018 7:50 am

I have suddenly started getting the alert "Service Check Timed Out" for one of my services.

This is occurring very frequently.

What could be the reason for it?

I have seen various threads about this problem, and they suggest increasing the timeout value. Currently the value is set to 420.
How much should I increase it?

And is increasing the timeout value even recommended?
amitgupta19
 
Posts: 61
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Postby mcapra » Thu Jan 18, 2018 1:05 pm

Many plugins include their own specific flags and methods for handling timeouts, but they do not take precedence over Nagios Core's internal workings when checks are executed.

"Service check timed out after x seconds" means the plugin's execution time exceeded the Nagios configuration's service_check_timeout setting, so Nagios stopped it. This is a global configuration directive that affects all service checks.

You could certainly increase this value, but depending on your check interval this can lead to overlapping checks and generally bad things. Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively.
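
A minimal sketch of what the passive submission step could look like, assuming Python is available on the Nagios host and that external commands are enabled. The line format is Nagios Core's PROCESS_SERVICE_CHECK_RESULT external command; the host name, service description, and command file path used here are placeholders, not values from this thread. It only covers handing a finished result to Nagios Core, not running the slow check itself or getting the result from the remote machine to the Nagios host.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code uses the usual plugin codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit_passive_result(command_file, line):
    """Write the line to the external command file (a named pipe on a real install)."""
    with open(command_file, "w") as handle:
        handle.write(line)


# Example usage from a cron-driven wrapper (placeholder values, adjust to your setup):
#   line = format_passive_result("exchange01", "Queue Health", 0, "OK - queues normal")
#   submit_passive_result("/usr/local/nagios/var/rw/nagios.cmd", line)
```

The service on the Nagios side would then need passive checks enabled, and its host name and service description must match the ones in the submitted line exactly.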

Can you share the service definition, its corresponding command definition, and the plugin your command definition is using?
Former Nagios employee
http://www.mcapra.com/
mcapra
 
Posts: 3083
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Postby npolovenko » Fri Jan 19, 2018 4:28 pm

Agreed with @mcapra. 420 seconds already seems like a lot to me. Please send us all the information he requested so that we can identify what's going on.
npolovenko
Support Tech
 
Posts: 1319
Joined: Mon May 15, 2017 5:00 pm

Re: Service Check Timed Out for one Service

Postby amitgupta19 » Mon Jan 22, 2018 4:24 am

Please find the requested data attached.

I just want to identify what is suddenly causing this error so that I can correct the problem.
Attachments
QueueHealth.txt (plugin, 2.7 KiB)
Service Check Timed Out.docx (service definition and command definition, 39.39 KiB)
amitgupta19
 
Posts: 61
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Postby amitgupta19 » Tue Jan 23, 2018 7:21 am

Did I miss any information or data?

Can someone please look into it?
amitgupta19
 
Posts: 61
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Postby mcapra » Tue Jan 23, 2018 2:23 pm

The -t 420 you've added to your service check is technically correct, but as I said, this won't override Nagios Core's service_check_timeout setting. Increasing this setting has implications for each and every one of your checks, so if you do change it, I would suggest you change it with care and diligence.

From the plugin's notes, it admits that the snapin itself takes a while to load:
# On the check_nrpe command include the -t 30, since it takes some time to load the Exchange cmdlet's.


I'm going to again suggest that this be set up as a passive check due to its long runtime:
mcapra wrote: Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively.
Former Nagios employee
http://www.mcapra.com/
mcapra
 
Posts: 3083
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Postby dwhitfield » Tue Jan 23, 2018 4:13 pm

mcapra wrote: this won't override Nagios Core's service_check_timeout setting.

This setting is in nagios.cfg. Can you attach that file for review?
dwhitfield
Former Nagios Staff
 
Posts: 4568
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Service Check Timed Out for one Service

Postby amitgupta19 » Thu Jan 25, 2018 8:59 am

In nagios.cfg, service_check_timeout is set to 110.

But my question is: what suddenly changed so that the service check started timing out?

I would also like to inform you that I am no longer receiving any service check timeout errors.

So again, what caused the sudden error?

Kindly suggest how I can identify this.
amitgupta19
 
Posts: 61
Joined: Fri Sep 08, 2017 5:53 am

Re: Service Check Timed Out for one Service

Postby mcapra » Thu Jan 25, 2018 10:17 am

Based on my *very limited* reading on the subject, that particular snapin (Microsoft.Exchange.Management.PowerShell.E2010) is known to be slow. There are several different solutions scattered around Google with things you can do to speed it up. I can't vouch for their success.

Simply put, the check_queue_health plugin has room for improvement. Historically, it's been recommended that checks which run for an exceptionally long time be scheduled as cron jobs which submit their results to Nagios Core passively. This would allow you to work around the long run-time without worrying about Nagios Core internals.
Former Nagios employee
http://www.mcapra.com/
mcapra
 
Posts: 3083
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timed Out for one Service

Postby dwhitfield » Thu Jan 25, 2018 11:17 am

amitgupta19 wrote: In the nagios.cfg service_check_timeout is mentioned as 110 .


110 only shows up once on this page (which is really beside the point; thanks for giving the value), but if you want the timeout to be 420, you need to change service_check_timeout in nagios.cfg. Of course, as @mcapra mentioned, this changes the timeout for every service check.
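
To make the interaction between the two values concrete, here is a small sketch in Python. It is a deliberately simplified model, not Nagios Core's actual code: it assumes the check is stopped by whichever limit is reached first, and it ignores any timeouts configured on the remote agent. The function name is made up for illustration.

```python
def effective_timeout(plugin_timeout, service_check_timeout):
    """Return the limit (in seconds) that is reached first.

    Simplified assumption: Nagios Core stops the check at service_check_timeout
    even if the plugin's own -t value is larger.
    """
    return min(plugin_timeout, service_check_timeout)


# Values from this thread: -t 420 on the command, service_check_timeout=110 in nagios.cfg.
print(effective_timeout(420, 110))  # 110
```

Under that assumption, the -t 420 on the command never comes into play while service_check_timeout stays at 110.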

As for why this is occurring now and wasn't before, I would suspect more load on the Windows side, considering that this is known to be a slow snapin. If you added more checks on the Nagios side, that could be a factor as well.
dwhitfield
Former Nagios Staff
 
Posts: 4568
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

