Service Check Timed Out

plakshmi
Posts: 68
Joined: Thu Aug 30, 2012 12:32 pm

Service Check Timed Out

Post by plakshmi »

We are frequently getting "Service Check Timed Out" alerts for our DB checks. Please let us know how to resolve this issue.

***** Nagios Monitor XI Alert *****

Notification Type: PROBLEM

Service: TSPACE - XXXXX
Host: XXXXXXXXXXX
Address: XXXXXXXX
State: CRITICAL

Date/Time: Mon Dec 01 07:20:51 EST 2013

Additional Info:

(Service Check Timed Out)
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Service Check Timed Out

Post by abrist »

Can you run the checks from the CLI?
plakshmi
Posts: 68
Joined: Thu Aug 30, 2012 12:32 pm

Re: Service Check Timed Out

Post by plakshmi »

We are able to run the checks from the CLI. The service reports "Service Check Timed Out" for a short period, around 4-5 minutes, and then returns to its normal state. These errors occur quite frequently. Please help us get this resolved.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Service Check Timed Out

Post by tmcdonald »

Can you post an example of the command you run and its output, wrapped in code tags?
plakshmi
Posts: 68
Joined: Thu Aug 30, 2012 12:32 pm

Re: Service Check Timed Out

Post by plakshmi »

Below is the command. I have intentionally left out the actual argument values.

$USER1$/check_oracle_basic --tablespace $ARG1$ $ARG2$ $ARG3$ $USER40$ $USER41$


Here is the output:

01-30-2014 02:21:06 01-30-2014 02:30:26 0d 0h 9m 20s SERVICE CRITICAL (HARD) (Service Check Timed Out)
01-30-2014 02:30:26 01-30-2014 02:56:06 0d 0h 25m 40s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
01-30-2014 02:56:06 01-30-2014 03:00:08 0d 0h 4m 2s SERVICE CRITICAL (HARD) (Service Check Timed Out)
455157
Posts: 47
Joined: Mon Sep 10, 2012 7:35 pm

Re: Service Check Timed Out

Post by 455157 »

I wonder:

How is the health of the machine hosting the DB, and the network connections between your XI server and the DB?

If you go to Reports -> Alert Heatmap, for example, do you see other, broader problems happening alongside the periods when the checks time out?

Or, are any of the other checks on the DB server/VM (beyond the tablespace checks) timing out, or in problem states?

Just a couple of thoughts, I'm probably on the wrong track.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Service Check Timed Out

Post by sreinhardt »

I would tend to agree: network health, and potentially Oracle server health, may be the bigger issue. Also, how long is the timeout currently configured for that plugin?
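To compare how long the check really takes against the configured timeout, something like the rough sketch below could help. It only wraps a command, measures the elapsed time, and gives up after a limit. The function name `timed_check` and the placeholder command are my own assumptions for illustration, not part of Nagios or the plugin; substitute the real plugin command line you run by hand.

```python
import subprocess
import sys
import time


def timed_check(argv, timeout_s=60.0):
    """Run a command and return (exit_code, elapsed_seconds, output).

    exit_code is None if the command did not finish within timeout_s.
    This is a hypothetical helper, not a Nagios API.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
        code, out = proc.returncode, proc.stdout.strip()
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run when the limit is hit.
        code, out = None, "(timed out)"
    elapsed = time.monotonic() - start
    return code, elapsed, out


if __name__ == "__main__":
    # Placeholder command; pass the real plugin command line as arguments.
    cmd = sys.argv[1:] or [sys.executable, "-c", "print('OK - placeholder')"]
    code, elapsed, out = timed_check(cmd, timeout_s=60.0)
    print(f"exit={code} elapsed={elapsed:.1f}s output={out}")
```

Running this repeatedly around the times the alerts fire should show whether the plugin itself gets close to the limit or whether it normally returns quickly.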
plakshmi
Posts: 68
Joined: Thu Aug 30, 2012 12:32 pm

Re: Service Check Timed Out

Post by plakshmi »

We are unable to open the heatmap for this server. The timeout is set to 60 seconds.

Beyond the DB checks, it also happens randomly for the disk checks. We monitor around 90 services on this server, and we see this issue regularly for the tspace checks.

We observed that the service goes to a WARNING state and then changes to "Service Check Timed Out", as shown below.


02-02-2014 20:04:31 02-03-2014 00:00:00 0d 3h 55m 29s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
02-03-2014 00:00:00 02-03-2014 02:15:33 0d 2h 15m 33s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
02-03-2014 02:15:33 02-03-2014 02:34:53 0d 0h 19m 20s SERVICE CRITICAL (HARD) (Service Check Timed Out)
02-03-2014 02:34:53 02-03-2014 06:52:31 0d 4h 17m 38s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
02-03-2014 17:35:38 02-03-2014 17:39:45 0d 0h 4m 7s SERVICE CRITICAL (HARD) (Service Check Timed Out)
02-03-2014 17:39:45 02-03-2014 20:00:37 0d 2h 20m 52s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
02-03-2014 20:00:37 02-03-2014 20:04:48 0d 0h 4m 11s SERVICE CRITICAL (HARD) (Service Check Timed Out)
02-03-2014 20:04:48 02-04-2014 00:00:00 0d 3h 55m 12s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
02-04-2014 00:00:00 02-04-2014 00:31:19 0d 0h 31m 19s SERVICE WARNING (HARD) WARNING - Oracle tablespace TBS_FTD_2011_Q4 - 95% used [ 451 of 8900 MB available ]
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Service Check Timed Out

Post by slansing »

Do you have some sort of query limiting set for the user or listener you are using in your Oracle checks? That would be set on the Oracle system itself.
reinaldo.gomes
Posts: 59
Joined: Wed Apr 02, 2014 9:29 am

Re: Service Check Timed Out

Post by reinaldo.gomes »

Where do I set the timeout value that triggers the "(Service Check Timed Out)" message?
Also, how can I set a generic timeout in Perl?