Actually he just got back to me that this is resolved.
https://github.com/NagiosEnterprises/na ... ca3b89a8c6
One thing to note is that Core gets very angry about time changes, because they severely disrupt the internal scheduler, especially for tasks that have already been scheduled but not yet run. There really isn't a good way to resolve this without a large rewrite, and it is not really considered a use case for Core. A better alternative is to use an agent on those systems to push passive checks back to your main instance. This would resolve all sleep issues.
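As a rough illustration of the passive-check idea, the sketch below formats a service result as a Nagios external command (PROCESS_SERVICE_CHECK_RESULT) in Python. The helper names, the example host and service, and the command-file path are assumptions for illustration only. Getting the result from the laptop to the main instance (for example through NSCA or NRDP) is not shown.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit_passive_result(command_file, line):
    """Write a command line to the Nagios external command file.

    command_file is an assumption; a common default is
    /usr/local/nagios/var/rw/nagios.cmd, but check your nagios.cfg.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


if __name__ == "__main__":
    # Hypothetical host and service names, printed instead of submitted.
    print(format_passive_result("localhost", "HTTP", 0, "OK - served in 0.01s"), end="")
```

Because the result carries its own timestamp and is submitted from outside, the monitored laptop does not need a running scheduler of its own, which is why sleep and wake stop mattering on that side.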
Nagios 4 Load issues - OS X 10.9
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Nagios 4 Load issues - OS X 10.9
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: Nagios 4 Load issues - OS X 10.9
Thanks!
The issue we're seeing is on single nodes -- Nagios running on OS X (on a dev laptop) checking services only on the same node -- e.g. all localhost.
Re: Nagios 4 Load issues - OS X 10.9
jpotter wrote: Thanks! I'm happy to send in a bug report / patch, just didn't see where to do it.
You can open issues or send pull requests to us on GitHub: https://github.com/NagiosEnterprises/nagioscore
We also have our Mantis bug-tracker at http://tracker.nagios.org
Hate mail can go to emislivec@nagios.com
The forums here are a good place to start since we do have multiple projects/products, not all issues are strictly problems with the code, and some issues may have already been resolved in a newer version.
Let us know if you have other pain points on OS X and we can start working to get those resolved.
Re: Nagios 4 Load issues - OS X 10.9
jpotter wrote: The issue we're seeing is on single nodes -- Nagios running on OS X (on a dev laptop) checking services only on the same node -- e.g. all localhost.
Adding a hook to stop Nagios on sleep and start it on wake could be a good solution. It's not running checks while sleeping anyway, and a clean start/stop would prevent time-related issues like this, or others like file ages going off. I'm not sure how to implement this on modern OS X though.
Re: Nagios 4 Load issues - OS X 10.9
Hmm. OS X API stuff is above my pay grade, but presumably there must be an API for registering for a call-back for system suspension/sleep. Seems like overkill.
I don't think the underlying bug is caused by the system having been in hibernation, just that resuming from sleep happens to trigger a condition. Just a wild-ass-guess, but this seems like a parent process/thread spinning on reading from something that's closed / EOF, e.g. where the reader is trying to read until it gets something but failing to do an EOF check on the file handle.
-J
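To make the guess above concrete, here is a minimal Python sketch of a reader that tells the two cases apart: "no data right now" on a non-blocking descriptor (EAGAIN, which Python raises as BlockingIOError) and a real EOF (an empty read). This is not Nagios code; the function name `drain` and its parameters are hypothetical. A loop that treats both cases as "try again immediately" would spin the way described.

```python
import os
import select


def drain(fd, timeout=1.0, chunk=4096):
    """Read what is available from a non-blocking fd without spinning.

    Returns (data, eof). eof is True once the writing side has closed.
    """
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk)
        except BlockingIOError:
            # EAGAIN: nothing to read yet. Sleep in select() instead of
            # retrying in a tight loop.
            ready, _, _ = select.select([fd], [], [], timeout)
            if not ready:
                return b"".join(chunks), False
            continue
        if data == b"":
            # Zero-length read means EOF; retrying would loop forever.
            return b"".join(chunks), True
        chunks.append(data)


if __name__ == "__main__":
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.write(w, b"result")
    print(drain(r, timeout=0.1))  # data is returned, writer still open
    os.close(w)
    print(drain(r, timeout=0.1))  # writer closed, so EOF is reported
    os.close(r)
```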
Re: Nagios 4 Load issues - OS X 10.9
jpotter wrote: Hmm. OS X API stuff is above my pay grade, but presumably there must be an API for registering for a call-back for system suspension/sleep. Seems like overkill.
It looks like there are some tools out there; SleepWatcher seems to be a popular recommendation. The general idea isn't that different from startup/shutdown: we just need a way to get the init scripts to run on sleep/wake.
jpotter wrote: I don't think the underlying bug is caused by the system having been in hibernation, just that resuming from sleep happens to trigger a condition. Just a wild-ass-guess, but this seems like a parent process/thread spinning on reading from something that's closed / EOF, e.g. where the reader is trying to read until it gets something but failing to do an EOF check on the file handle.
There are a few issues from the time change when waking (timeouts on running jobs, timers, waiting for I/O), and the
Code:
read(0x12, "\0", 0x1000) = -1 Err#35
How long are you putting the machines to sleep? Do you see this behavior consistently when waking?
Re: Nagios 4 Load issues - OS X 10.9
Thanks for the suggestion of SleepWatcher -- hadn't heard of it before; looks like a nifty tool for side-stepping the problem.
Re: sleep duration, our developers are typically seeing it when they close the laptop and go home or come back into the office -- so at least a ~20 minute sleep period, but presumably sometimes overnight. I've been able to reproduce it, but not 100% of the time. I'll see if I can get more data and report back.
Re: Nagios 4 Load issues - OS X 10.9
jpotter wrote: I've been able to reproduce it, but not 100% of the time. I'll see if I can get more data and report back.
Sure. Let us know when you have more info.