Re: NDO2DB Issue out of the blue

Posted: Fri Aug 28, 2015 8:25 pm
by jfrickson
rseiwert wrote:Bandit, I did see this behavior in the unpatched version as well, but with Nagios XI also choking in the exact same way you were experiencing. When it was borderline, XI would work past its choke state and then I would notice the invalid results.
So the bogus check data is not new with my patch? That will make a big difference in where I look for that.

Does the bogus data ever work itself out, or is it fixed only by restarting ndo2db?

Re: NDO2DB Issue out of the blue

Posted: Sat Aug 29, 2015 9:55 am
by rseiwert
Again, educated guesses here. It is related to oversized results. The bogus data persists until the result size reduces to normal. Right now I'm using brute force to make it crash, but with some effort I'm sure we could figure out the breaking point. As for where to look, you might already have enhanced the ndo2db debug logging to show that, but in my mind there is no such thing as too much debug logging.

As soon as I change my test back to listing only the errors in the last hour the results return to normal.

Re: NDO2DB Issue out of the blue

Posted: Mon Aug 31, 2015 12:15 pm
by tmcdonald
BanditBBS wrote:Well, this was the first night all week that it hasn't crashed 2-3 times between 10pm and 8:10am. I have high hopes, but not calling this completed/fixed until it goes the weekend with no issues as well....but looking good :)
Are we still looking good? I know @rseiwert and @jfrickson are discussing some possible related issues, but I wanted to check on yours.

Re: NDO2DB Issue out of the blue

Posted: Mon Aug 31, 2015 12:50 pm
by BanditBBS
tmcdonald wrote:
BanditBBS wrote:Well, this was the first night all week that it hasn't crashed 2-3 times between 10pm and 8:10am. I have high hopes, but not calling this completed/fixed until it goes the weekend with no issues as well....but looking good :)
Are we still looking good? I know @rseiwert and @jfrickson are discussing some possible related issues, but I wanted to check on yours.
Yeah Trevor, I think the issue can be marked closed, and maybe this fixes the weird issue so many others have sometimes had with ndo2db. Feel free to keep this open so they can finish the discussion related to the other bug.

Re: NDO2DB Issue out of the blue

Posted: Mon Aug 31, 2015 12:59 pm
by jfrickson
rseiwert wrote:Again, educated guesses here. It is related to oversized results. The bogus data persists until the result size reduces to normal.
The attached patch does away with all realloc()s and calloc()s by limiting output to ~64K, so if there's a memory corruption issue, this might take care of it. Try this when you get a chance and let us know if your bogus data issue is still there or not.
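
For readers without the attachment, the general idea of capping output instead of growing a buffer can be sketched roughly as below. This is not the attached patch: the names `copy_bounded` and `MAX_OUTPUT_LEN` are hypothetical, and the exact limit and call sites in ndo2db.c are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cap, mirroring the "~64K" limit mentioned above. */
#define MAX_OUTPUT_LEN 65536

/*
 * Copy src into a fixed-size buffer, truncating instead of reallocating.
 * The result is always NUL-terminated when dst_size > 0.
 * Returns 1 if the text had to be truncated, 0 otherwise.
 */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t src_len, n;

    if (dst == NULL || dst_size == 0)
        return 0;
    if (src == NULL) {
        dst[0] = '\0';
        return 0;
    }

    src_len = strlen(src);
    /* Leave one byte for the terminating NUL. */
    n = (src_len < dst_size - 1) ? src_len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';

    return src_len > n;
}
```

A caller would declare something like `char out[MAX_OUTPUT_LEN];` and call `copy_bounded(out, sizeof out, plugin_output)`, so an oversized result is cut short rather than triggering another allocation. The return value lets the caller log that truncation happened.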

Re: NDO2DB Issue out of the blue

Posted: Tue Sep 01, 2015 12:32 pm
by rseiwert

Code:

--- ndo2db.c.orig	2015-08-31 12:20:39.433892447 -0500
+++ ndo2db.c	2015-08-31 13:02:08.089690711 -0500
Is this a patch against the already-patched file or against the original source? The timestamps have me worried.

Re: NDO2DB Issue out of the blue

Posted: Tue Sep 01, 2015 12:46 pm
by jfrickson
rseiwert wrote:Is this a patch against the already-patched file or against the original source? The timestamps have me worried.
Patch to the original.

Re: NDO2DB Issue out of the blue

Posted: Tue Sep 01, 2015 4:50 pm
by rseiwert
I can still get critical results to show green by overloading the results. This doesn't matter that much to me. In my case the problem was checkwmiplus, which added a flag to limit the output in version 1.6. It is still my humble opinion that a runaway critical event with too much to say should not go green. I do agree that checks should not be returning half a megabyte of results in a perfect world, but if it were a perfect world, we would not need monitoring.
Check WMI Plus Version 1.6
•Added --forcetruncateoutput so you can restrict the maximum length of the plugin output. Does not affect debug mode. Default value set to 8192 bytes.

Re: NDO2DB Issue out of the blue

Posted: Wed Sep 02, 2015 2:33 pm
by tmcdonald
It's going to take some time, but I definitely agree that this is not expected or desirable behavior. It's a side-effect of the testing process, and I can't imagine we will call this bug squashed until the overflow issue is resolved. Like the heads of a hydra, sometimes when fixing one bug you create another. 'tis the nature of the beast.

Re: NDO2DB Issue out of the blue

Posted: Thu Sep 03, 2015 11:56 am
by jfrickson
rseiwert wrote:I can still get critical results to show green by overloading the results. This doesn't matter that much to me. In my case the problem was checkwmiplus, which added a flag to limit the output in version 1.6. It is still my humble opinion that a runaway critical event with too much to say should not go green. I do agree that checks should not be returning half a megabyte of results in a perfect world, but if it were a perfect world, we would not need monitoring.
When you get a chance, apply the attached patch to the original source. It adds some debugging info to the ndo2db.debug file. Then change the ndo2db.cfg file. Set debug_level=-1 and debug_verbosity=2. Maybe bump up the size of max_debug_file_size while you're in there.

When you get critical results to show green, turn off debugging (debug_level=0) and send me the output. Hopefully that will tell me where the problem lies.
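
For reference, the debug settings described above might look like this in ndo2db.cfg. The debug_level and debug_verbosity values are the ones given above. The max_debug_file_size value is only an assumed example of "bumping it up", so pick whatever suits your disk space.

```
# ndo2db.cfg (fragment) - enable full debug output while reproducing the problem
debug_level=-1
debug_verbosity=2
# Assumed example value; the point is only to make it larger than your current setting
max_debug_file_size=100000000
```

Once the bad result has been captured, setting debug_level=0 again turns the extra logging back off.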