Search found 13 matches
- Thu Apr 02, 2015 3:47 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
Sure, my problem is fixed since we're not using the command pipe anymore. I still think it would be good to patch nagios to fix this command pipe behavior.
- Mon Mar 30, 2015 1:46 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
I just wanted to post an update. We have been running since Thursday using the new check_results strategy and have not seen any invalid alarms. It seems that the issue was related to that command pipe corruption that I found. I think it would be prudent to clear out the contents of the buf variable ...
- Thu Mar 26, 2015 11:23 am
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
After some spelunking in the nagios source, I think I have discovered how this was happening, if not necessarily why. I set debug_verbosity=2, which adds "Raw command entry:" logging. From the source, nagios logs Raw Commands before it processes any of the command entries. I noticed that t...
- Wed Mar 25, 2015 9:54 am
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
We do have NTP configured and monitored with the built-in check_mk agent checks. None of the hosts are off more than a few milliseconds. Even if they were, that would not explain checks that have a 0 timestamp or the repeated checks.
- Mon Mar 23, 2015 12:45 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
I have modified my script to tail the nagios.debug log and watch for old (> 150 seconds) commands being pushed in. It has turned up lots of stuff: [1427128588.816659] [128.1] [pid=9556] Command Entry Time: 1427128385 large diff : 203 - 1427128588 1427128385 [1427131034.299819] [128.1] [pid=9556] C...
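For anyone wanting to reproduce this kind of check, here is a minimal Python sketch of the idea, not the actual script from the post. It assumes debug lines shaped like the one quoted above (a bracketed epoch timestamp followed later by "Command Entry Time: <epoch>"). The function name find_stale_commands and the second sample line are made up for illustration; the 150-second threshold is the one mentioned in the post.

```python
import re

# Assumed line shape, taken from the excerpt above:
# [1427128588.816659] [128.1] [pid=9556] Command Entry Time: 1427128385
ENTRY_RE = re.compile(r"^\[(\d+)\.\d+\].*Command Entry Time: (\d+)")

def find_stale_commands(lines, max_age=150):
    """Yield (logged_at, entry_time, diff) for entries older than max_age seconds."""
    for line in lines:
        m = ENTRY_RE.search(line)
        if not m:
            continue  # not a command entry line
        logged_at, entry_time = int(m.group(1)), int(m.group(2))
        diff = logged_at - entry_time
        if diff > max_age:
            yield logged_at, entry_time, diff

sample = [
    # First line is from the post; the second is a hypothetical "fresh" entry.
    "[1427128588.816659] [128.1] [pid=9556] Command Entry Time: 1427128385",
    "[1427128590.000000] [128.1] [pid=9556] Command Entry Time: 1427128589",
]
stale = list(find_stale_commands(sample))
for logged_at, entry_time, diff in stale:
    print("large diff :", diff, "-", logged_at, entry_time)
```

To watch a live log, the same function could be fed lines from something like tail -F on nagios.debug instead of the sample list.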
- Mon Mar 23, 2015 10:37 am
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
I have been keeping an eye on stuff in check_result_path. There is generally nothing in there that's more than a minute or so old. A few of the servers did have some very old files - weeks or months old. I suspect these were from crashes - we previously had issues with livestatus causing nagios to c...
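A rough Python sketch of how a directory like check_result_path could be scanned for leftover files. This is an illustration only, not what was actually run. The function name old_result_files is hypothetical, and the 3600-second default simply mirrors the max_check_result_file_age value mentioned elsewhere in this thread; it makes no claim about how Nagios itself applies that setting.

```python
import os
import time

def old_result_files(path, max_age=3600, now=None):
    """Return sorted (name, age_in_seconds) for regular files older than max_age."""
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(path):
        if not entry.is_file():
            continue  # skip subdirectories and other non-files
        age = now - entry.stat().st_mtime
        if age > max_age:
            stale.append((entry.name, int(age)))
    return sorted(stale)
```

Call it with the directory your check_result_path setting points to; a non-empty result would be the weeks-old or months-old files described above.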
- Fri Mar 20, 2015 4:23 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
And one more note - we have the default max_check_result_file_age value (3600). Shouldn't this prevent these really old check results from being processed?
- Fri Mar 20, 2015 3:44 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
One more note on the 12-hour old command - it does not show up in nagios.debug at the time that it was sent in by gostatus. So it's not as if it is an old result that got accepted once and then replayed somehow. It appears to have been submitted, then sat on the cmd file for 12+ hours, then picked u...
- Fri Mar 20, 2015 3:20 pm
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
Ok, got another alarm today and got a bunch of very interesting info. https://gist.githubusercontent.com/enichols/b298954839ed81eb913b/raw/8ccfc6a5cfb75e3257d33926da4a2f55e0370935/gistfile1.txt As you can see from the nagios.debug log, somehow gostatus pushed a command 12 hours ago and it just showe...
- Thu Mar 19, 2015 11:15 am
- Forum: Open Source Nagios Projects
- Topic: Invalid Freshness Alarms
- Replies: 23
- Views: 9779
Re: Invalid Freshness Alarms
FYI, we had another bogus alarm last night, so the ramdisk did not help. Unfortunately, that server did not have a Nagios debug_level set, so I don't have full debugging info, just cron, nagios.log and gostatus logs: https://gist.githubusercontent.com/enichols/899822f3f56df77f448b/raw/abc697328ad08...