A service I wrote takes about 8 minutes to run.
It doesn't go through NRPE; it is a standalone check, run on a few boxes, that polls some queues
over time.
I don't want to change the current setting of service_check_timeout in the cfg, which is 60 seconds,
to 8+ minutes. As it stands, the check gets killed by the default 60-second service check timeout.
What are my options other than:
+ not writing the monitor (that's not possible)
+ making it a standalone cron/at/daemon of some kind
The reasons for keeping it in Nagios are obvious.
I can't believe there is not a per-service service_check_timeout rather than only a global setting. NRPE
is not needed in this case, so I can't use its -t option: the check runs on the monitoring server itself, not remotely,
and I don't want to "loop back".
a long running service
-
- Posts: 35
- Joined: Sat Sep 25, 2010 12:53 pm
Re: a long running service
That's strange. I bumped up service_check_timeout in the cfg and restarted
but the service is still getting terminated within 60 seconds.
Re: a long running service
Have you considered making the check a cron and then having its results write to the command pipe or submit a passive check another way?
Leaving nagios forked for 8+ minutes is generally a bad idea.
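A minimal sketch of the cron-plus-passive-check idea, in Python. It builds a PROCESS_SERVICE_CHECK_RESULT external command and writes it to the Nagios command file. The command file path, host name, and service description here are assumptions for illustration; the real path is whatever command_file is set to in your nagios.cfg, and the service has to be defined to accept passive checks.

```python
import time

# Assumed location; use the command_file value from your own nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, return_code, output
    )


def submit_passive_result(host, service, return_code, output,
                          command_file=COMMAND_FILE):
    """Write the result line to the Nagios command pipe."""
    line = format_passive_result(host, service, return_code, output)
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

The long-running script, started from cron, would call submit_passive_result("monhost", "queue_poll", 0, "OK - queues draining") when it finishes, so Nagios never has to wait on it. Both names in that call are hypothetical.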
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
- Posts: 35
- Joined: Sat Sep 25, 2010 12:53 pm
Re: a long running service
Yes - we have.
Re: a long running service
smcracraft wrote: Yes - we have.
What objections do you have to running it as a cron? This seems like the easiest way to keep the check on your box without having to edit it and add in a -t flag.
Timeouts are usually handled by the plugin itself and tend to default to 10 seconds, so the global Nagios timeout of 60 seconds is a sort of failsafe.
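To illustrate a plugin enforcing its own limit, here is a rough Python sketch of a wrapper that runs a command with a timeout and reports UNKNOWN if it overruns. The function name is hypothetical, and mapping a timeout to UNKNOWN (exit code 3) is an assumption; some plugins report CRITICAL instead. The wrapper's own limit still has to stay below service_check_timeout, or Nagios will kill it first.

```python
import subprocess

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN


def run_with_timeout(argv, timeout_seconds):
    """Run a command; return (exit_code, first output line), or UNKNOWN on timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        message = "UNKNOWN - check timed out after {0}s".format(timeout_seconds)
        return UNKNOWN, message
    lines = done.stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    return done.returncode, first_line
```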
One other option is to make the checks passive, stick them on the remote server, and then they can have whatever timeout they need.
Former Nagios employee
-
- Posts: 35
- Joined: Sat Sep 25, 2010 12:53 pm
Re: a long running service
No objections to cron, although it is a few too many eggs in one basket, but no matter. Remote runs are also
a no-no here.
In my case, I took the long-running complex monitor and simply made it record its runs in logfiles and then parse the prior runs
at every Nth attempt, to measure long-running things over longer intervals than Nagios likes. It runs fine.
Hooray for space over time!
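A rough sketch of that record-then-parse approach, in Python; this is not the actual monitor. Each short run appends one sample to a logfile, and the verdict comes from the last few samples rather than from one long poll. The JSON-lines log format, the "queue depth" measurement, and the threshold are all assumptions made for illustration.

```python
import json
import time

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def record_sample(log_path, value, now=None):
    """Append one timestamped measurement as a JSON line."""
    entry = {"time": time.time() if now is None else now, "value": value}
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")


def evaluate_window(log_path, window, critical_threshold):
    """Judge the mean of the last `window` samples; UNKNOWN until enough exist."""
    with open(log_path) as log:
        values = [json.loads(line)["value"] for line in log if line.strip()]
    recent = values[-window:]
    if len(recent) < window:
        return UNKNOWN, "UNKNOWN - only {0} of {1} samples so far".format(
            len(recent), window)
    mean = sum(recent) / len(recent)
    if mean >= critical_threshold:
        return CRITICAL, "CRITICAL - mean queue depth {0:.1f}".format(mean)
    return OK, "OK - mean queue depth {0:.1f}".format(mean)
```

Each invocation calls record_sample once and then evaluate_window, so every run finishes well inside the 60-second limit while the history on disk covers the longer interval.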
Re: a long running service
If you have similar needs in the future (monitoring an average over time, it sounds like) you can take a look at a third-party addon called bischeck:
http://assets.nagios.com/downloads/nagi ... ios-XI.pdf
It can be a bit difficult to learn the syntax at first, but once you know it you can monitor things like change in averages over time (for disk usage increase rates) or varying thresholds throughout the day.
Former Nagios employee