a long running service

smcracraft
Posts: 35
Joined: Sat Sep 25, 2010 12:53 pm

a long running service

Post by smcracraft »

A service I wrote takes about 8 minutes to run.

It doesn't use NRPE; it is a standalone check run on a few boxes that polls some queues
over time.

I don't want to raise service_check_timeout in the cfg from its current 60 seconds
to 8+ minutes, so the check gets killed by the default 60-second service check timeout.

What are my options other than:

+ not writing the monitor (that's not possible)
+ making it a standalone cron/at/daemon of some kind

The reasons for keeping it in Nagios are obvious.

I can't believe there is no per-service service_check_timeout, only a global setting. NRPE
is not needed in this case, so I can't use its -t option: the check runs on the monitoring server itself, not remotely,
and I don't want to "loop back".
smcracraft
Posts: 35
Joined: Sat Sep 25, 2010 12:53 pm

Re: a long running service

Post by smcracraft »

That's strange. I bumped up service_check_timeout in the cfg and restarted
but the service is still getting terminated within 60 seconds.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: a long running service

Post by abrist »

Have you considered running the check from cron and having it write its results to the command pipe, or submitting a passive check some other way?

Leaving Nagios forked for 8+ minutes is generally a bad idea.
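
For illustration, here is a minimal sketch of what a cron-driven wrapper could hand to Nagios as a passive result. It uses the PROCESS_SERVICE_CHECK_RESULT external command; the command file path, host name and service description are assumptions and must match your own nagios.cfg and object definitions.

```python
import time

# Assumed location; check the command_file setting in your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # External commands are one line each, so flatten any newlines.
    output = " ".join(str(output).splitlines())
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, int(return_code), output
    )


def submit_passive_result(line, command_file=COMMAND_FILE):
    """Write the command line to the Nagios command pipe."""
    # The command file is a named pipe; opening it blocks until Nagios reads.
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical usage at the end of the cron job, after the 8-minute poll:
#   line = format_passive_result("mon01", "Queue Depth", 0, "OK - queues drained")
#   submit_passive_result(line)
```

The service would need passive checks enabled, and the host and service names in the command must match the configured ones exactly.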
Former Nagios employee
smcracraft
Posts: 35
Joined: Sat Sep 25, 2010 12:53 pm

Re: a long running service

Post by smcracraft »

Yes - we have.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: a long running service

Post by tmcdonald »

smcracraft wrote: Yes - we have.
What objections do you have to running it as a cron? This seems like the easiest way to keep the check on your box without having to edit it and add in a -t flag.

Timeouts are usually handled by the plugin itself and tend to default to 10 seconds, so the global Nagios timeout of 60 is a sort of failsafe.
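
As a rough sketch of what "handled by the plugin itself" can look like, a plugin written in Python could arm an alarm before doing its work. The function names here are hypothetical, returning UNKNOWN (3) on timeout is a choice made for this sketch, and signal.alarm only works on Unix in the main thread.

```python
import signal

UNKNOWN = 3  # Nagios plugin return code for UNKNOWN


class CheckTimeout(Exception):
    """Raised inside the check when the alarm fires."""


def _on_alarm(signum, frame):
    raise CheckTimeout()


def run_with_timeout(check, timeout_seconds=10):
    """Run check() -> (code, message); give up after timeout_seconds."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_seconds)
    try:
        return check()
    except CheckTimeout:
        return UNKNOWN, "UNKNOWN - check timed out after %ds" % timeout_seconds
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

A real plugin would print the message and exit with the code; the Nagios-side service_check_timeout still applies on top of whatever the plugin does.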

One other option is to make the checks passive, stick them on the remote server, and then they can have whatever timeout they need.
Former Nagios employee
smcracraft
Posts: 35
Joined: Sat Sep 25, 2010 12:53 pm

Re: a long running service

Post by smcracraft »

No objections to cron, although it puts a few too many eggs in one basket, but no matter. Running the checks remotely is also
a no-no here.

In my case, I took the long-running, complex monitor and simply made it record its runs in logfiles, and then on the Nth attempt parse the prior runs
to measure long-running behavior over longer intervals than Nagios likes. It runs fine.
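
A guess at how that log-and-parse approach might look, not the actual monitor: each short run appends one timestamped sample to a logfile, then evaluates the samples from the last few minutes. The file format, function names and thresholds are all assumptions made for this sketch.

```python
import time


def record_sample(path, value, now=None):
    """Append one 'timestamp value' line to the logfile."""
    timestamp = int(time.time() if now is None else now)
    with open(path, "a") as log:
        log.write("%d %s\n" % (timestamp, value))


def recent_samples(path, window_seconds, now=None):
    """Return the values logged within the last window_seconds."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    samples = []
    try:
        with open(path) as log:
            for line in log:
                parts = line.split()
                if len(parts) != 2:
                    continue  # skip malformed lines
                if int(parts[0]) >= cutoff:
                    samples.append(float(parts[1]))
    except FileNotFoundError:
        pass  # first run: nothing logged yet
    return samples


def evaluate(samples, warn, crit):
    """Map the window average to a Nagios (return code, message) pair."""
    if not samples:
        return 3, "UNKNOWN - no samples in window"
    average = sum(samples) / len(samples)
    detail = "average %.1f over %d samples" % (average, len(samples))
    if average >= crit:
        return 2, "CRITICAL - " + detail
    if average >= warn:
        return 1, "WARNING - " + detail
    return 0, "OK - " + detail
```

Each invocation stays well inside the 60-second timeout because the long observation window lives in the logfile rather than in a single process.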

Hooray for space over time!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: a long running service

Post by tmcdonald »

If you have similar needs in the future (monitoring an average over time, it sounds like) you can take a look at a third-party addon called bischeck:

http://assets.nagios.com/downloads/nagi ... ios-XI.pdf

It can be a bit difficult to learn the syntax at first, but once you know it you can monitor things like change in averages over time (for disk usage increase rates) or varying thresholds throughout the day.
Former Nagios employee