
Re: Check did not exit properly / Failed to register iobroke

Posted: Wed Sep 13, 2017 2:32 am
by alexpeeters
@dwhitfield thank you.

The io-wait times in top are 0, so I guess a ramdisk would not be able to lower that?
Since we're on Nagios Core, the system isn't using ndo2db out of the box, and ours is currently not using that module.

Is there any way to let Nagios handle this error? Just by discarding the check altogether and rescheduling a new check according to the retry interval? Or could there be a better strategy?

Re: Check did not exit properly / Failed to register iobroke

Posted: Wed Sep 13, 2017 4:54 pm
by dwhitfield
Yeah, I wouldn't have gone to ndo2db, except you brought up operating system limits and that's where we have our limited documentation on that.

The ramdisk was just a different thought about resource limitation.

How many hosts and services do you have? Doing a top or whatever will show us a point in time, but the hosts and services count will give us some idea of the maximum load the server could be under.
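
One way to get those counts is sketched below. It is only a minimal sketch, assuming a Nagios Core status file in which each host and service is recorded as a block opening with `hoststatus {` or `servicestatus {`. The function names are hypothetical and not part of Nagios.

```python
import re
from collections import Counter

# Assumption: status blocks open with a line such as "hoststatus {" or
# "servicestatus {", as in a Nagios Core status.dat file.
BLOCK_START = re.compile(r"^\s*(hoststatus|servicestatus)\s*\{\s*$")


def count_status_blocks(text):
    """Return (host_count, service_count) for the given status file text."""
    counts = Counter()
    for line in text.splitlines():
        match = BLOCK_START.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts["hoststatus"], counts["servicestatus"]


def count_from_file(path):
    """Read a status file from disk and count its host and service blocks."""
    # errors="replace" keeps odd bytes in plugin output from aborting the read.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_status_blocks(handle.read())
```

Calling `count_from_file` with the path configured as `status_file` in nagios.cfg should return the two counts as a tuple, hosts first.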

Re: Check did not exit properly / Failed to register iobroke

Posted: Thu Sep 14, 2017 2:32 am
by alexpeeters
Maybe you can get an idea of our workload based on the performance information?

Re: Check did not exit properly / Failed to register iobroke

Posted: Thu Sep 14, 2017 1:51 pm
by bheden
Eeesh. I was hoping for way more stuff in there. That isn't a lot at all.

This may sound weird, but are you able to maybe compile the same version on say CentOS or Debian or something else? And migrate your entire infrastructure over to it and see if the bug persists?

We may have to move this to a github issue. It could very well be a bug, but there's still quite a lot of information I don't have.

Re: Check did not exit properly / Failed to register iobroke

Posted: Fri Sep 15, 2017 2:34 am
by alexpeeters
Great idea! I'm curious if the bug persists on another OS. The weekend would be a great test.

I have chosen CentOS 7 x64 for this test. I'll get on it right away.

Update 14:15 CEST: It's not easy. Lots of checks are broken because of NRPE version conflicts, missing Perl modules and whatnot. I'm fixing them one by one. Meanwhile Nagios is running, and the dreaded error has not shown itself yet.

Re: Check did not exit properly / Failed to register iobroke

Posted: Fri Sep 15, 2017 3:26 pm
by dwhitfield
alexpeeters wrote:
This Nagios instance has 2852 services on 358 hosts to check.
Found the above going back through. If you do want to move this to github, here's the link: https://github.com/NagiosEnterprises/na ... issues/new

I think it is clear due to the discussion about 4.3.3, but this install has always been 4.3.4 until the 4.3.3 test, correct?

Re: Check did not exit properly / Failed to register iobroke

Posted: Mon Sep 18, 2017 2:25 am
by alexpeeters
We were on 4.3.3 only for a day or two, just for the test. I moved us back to 4.3.4 after the conclusion was drawn.

I'm not familiar with the standard procedures. Why would I want to move this to GitHub? I will of course, if that is required to get to a solution ;)

Edit: The setup has been running on CentOS the entire weekend without the error in the logfiles. I have destroyed the VM and I'm now building another on SUSE Linux Enterprise Server 12 SP 3 (x86_64) instead of the earlier openSUSE 42.3 (x86_64) which had the problem. I'm curious if the paid version will show the error.

Re: Check did not exit properly / Failed to register iobroke

Posted: Mon Sep 18, 2017 4:21 pm
by bheden
The reason for moving it to github is purely related to the scope of the problem.

This is pretty definitively a bug, and there isn't much more that can be done on the support forum.

You can open an issue here: https://github.com/NagiosEnterprises/nagioscore/issues

Re: Check did not exit properly / Failed to register iobroke

Posted: Tue Sep 19, 2017 2:24 am
by alexpeeters
We, the Dutch, would say: "Now breaks my wooden shoe!"
I am very surprised. Yesterday I installed the server on SLES12 SP3 x64, and the feared error did not show itself all night.

My assumption is that the bug is confined to openSUSE 42.3 (x86_64) only.

To be honest, for now my problem is solved. But since I have been able to reproduce the bug on two openSUSE 42.3 (x86_64) machines (of which the latest has the same install as our SLES box), I moved this to GitHub: https://github.com/NagiosEnterprises/na ... issues/433

Re: Check did not exit properly / Failed to register iobroke

Posted: Tue Sep 19, 2017 6:45 am
by dwhitfield
Locking this up to move discussion to github.