ndo2db Errors.

Posted: Wed Jun 06, 2012 3:50 pm
by krw
I am seeing a lot of errors in /var/log/messages:

Jun 4 22:34:52 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:00 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:01 lonagiosxi nagios: ndomod: Error writing to data sink! Some output may get lost...
Jun 4 22:35:01 lonagiosxi nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters
Jun 4 22:35:05 lonagiosxi nagios: Caught SIGTERM, shutting down...
Jun 4 22:35:05 lonagiosxi nagios: Successfully shutdown... (PID=7097)
Jun 4 22:35:05 lonagiosxi nagios: ndomod: Shutdown complete.
Jun 4 22:35:05 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jun 4 22:35:06 lonagiosxi nagios: Nagios 3.4.1 starting... (PID=10664)
Jun 4 22:35:06 lonagiosxi nagios: Local time is Mon Jun 04 22:35:06 PDT 2012
Jun 4 22:35:06 lonagiosxi nagios: LOG VERSION: 2.0
Jun 4 22:35:06 lonagiosxi nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully connected to data sink. 69 queued items to flush.
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully flushed 69 queued items to data sink.

Jun 4 22:35:07 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:07 lonagiosxi nagios: Finished daemonizing... (New PID=10669)
Jun 4 22:35:08 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:08 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:09 lonagiosxi ndo2db: Message sent to queue

Jun 4 22:35:28 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:28 lonagiosxi nagios: Caught SIGTERM, shutting down...
Jun 4 22:35:28 lonagiosxi nagios: Successfully shutdown... (PID=10669)
Jun 4 22:35:28 lonagiosxi nagios: ndomod: Shutdown complete.
Jun 4 22:35:29 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jun 4 22:35:29 lonagiosxi nagios: Nagios 3.4.1 starting... (PID=10986)
Jun 4 22:35:29 lonagiosxi nagios: Local time is Mon Jun 04 22:35:29 PDT 2012
Jun 4 22:35:29 lonagiosxi nagios: LOG VERSION: 2.0
Jun 4 22:35:29 lonagiosxi nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jun 4 22:35:29 lonagiosxi nagios: ndomod: Successfully connected to data sink. 0 queued items to flush.
Jun 4 22:35:29 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.

Edit: Added more from log file:

Jun 6 13:51:02 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 13:51:03 lonagiosxi ndo2db: Message sent to queue
Jun 6 13:51:03 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 13:51:04 lonagiosxi ndo2db: Message sent to queue
Jun 6 13:51:05 lonagiosxi xinetd[29328]: FAIL: nrpe address from=10.2.1.116
Jun 6 13:51:05 lonagiosxi xinetd[2525]: START: nrpe pid=29328 from=10.2.1.116
Jun 6 13:51:05 lonagiosxi xinetd[2525]: EXIT: nrpe status=0 pid=29328 duration=0(sec)
Jun 6 13:51:21 lonagiosxi ndo2db: Error: queue send error, retrying...

I'm seeing a lot of queue send errors apparently from ndo2db.

I have searched the forums and see some mentions of this problem, but I'm not familiar enough
with Nagios yet to know what to look for. I have repaired the nagios database in MySQL, and shut down
and restarted nagios and ndo2db, but the errors still appear.

Any clue here?

Thanks.

Re: ndo2db Errors.

Posted: Wed Jun 06, 2012 4:14 pm
by scottwilkerson
I believe this was one of the items that was fixed in the new 2011R3.0 release.

http://library.nagios.com/library/produ ... -nagios-xi

Re: ndo2db Errors.

Posted: Wed Jun 06, 2012 4:20 pm
by krw
We are on the latest:

Nagios XI 2011R3.0 Copyright © 2008-2012 Nagios Enterprises, LLC.

Clicking on Check for Updates:

Up To Date

Your installation of Nagios XI (2011R3.0) is up-to-date, so no upgrade is required. The latest version of Nagios XI is 2011R3.0, which was released on 2012-06-04.

Thanks,
Keith

Re: ndo2db Errors.

Posted: Wed Jun 06, 2012 4:30 pm
by scottwilkerson
Actually, looking at the logs, you only seem to be getting the data sink write errors right when Nagios is restarting. That is normal behavior; once it is up and running again, you get the following:

Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully connected to data sink. 69 queued items to flush.
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully flushed 69 queued items to data sink.
This is all expected behavior.

If you were getting a lot of "Error writing to data sink" messages without the "Successfully connected" ones, that would be something to worry about.

Re: ndo2db Errors.

Posted: Wed Jun 06, 2012 5:00 pm
by krw
tail -5 /var/log/messages:

Jun 6 14:49:28 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:49:47 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:49:48 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:50:17 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:50:18 lonagiosxi ndo2db: Message sent to queue

They just keep coming. I can run tail -f /var/log/messages and watch these errors being appended.
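To judge whether the errors are steady or tapering off, it can help to count them per hour instead of watching the tail. A rough sketch, assuming the syslog line layout shown above (month, day, time, host, then the message); the function name `errors_per_hour` is made up for illustration:

```shell
#!/bin/sh
# errors_per_hour [LOGFILE]
# Hypothetical helper: count ndo2db "queue send error" lines per hour.
# Assumes syslog lines start with "Mon D HH:MM:SS" and are in time order.
errors_per_hour() {
    awk '/ndo2db: Error: queue send error/ {
        # Build a bucket label such as "Jun 6 14:00" from the timestamp.
        hour = $1 " " $2 " " substr($3, 1, 2) ":00"
        if (hour != current) {
            # A new hour started: print the finished bucket first.
            if (current != "") print current, count
            current = hour
            count = 0
        }
        count++
    }
    END { if (current != "") print current, count }' "${1:-/var/log/messages}"
}
```

Running `errors_per_hour /var/log/messages` prints one line per hour that contained at least one error, so a change after a restart or a sysctl reload is easy to spot.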

Then, when I try to stop ndo2db, I sometimes see this error:

service ndo2db stop
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done

When a stop does happen:

Jun 6 14:51:28 lonagiosxi nagios: ndomod: Error writing to data sink! Some output may get lost...
Jun 6 14:51:28 lonagiosxi nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters

But then after a restart:

Jun 6 14:54:59 lonagiosxi nagios: ndomod: Successfully reconnected to data sink! 0 items lost, 323 queued items to flush.
Jun 6 14:54:59 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:55:00 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:55:00 lonagiosxi nagios: ndomod: Successfully flushed 323 queued items to data sink.

But the only way things get flushed is after a stop and a start of ndo2db.

Otherwise I just see the queue send errors pop up.

Thanks.

Re: ndo2db Errors.

Posted: Thu Jun 07, 2012 11:26 am
by scottwilkerson
You will need to increase the kernel message queue parameters for your system. I do not know the optimal values for your system, but in my case I increased them substantially (in /etc/sysctl.conf) to:

kernel.msgmax = 131072000
kernel.msgmnb = 131072000


After updating this in the conf file, run:

/sbin/sysctl -p
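
To confirm that the running kernel actually picked up the new limits, the live values can be read back from /proc. This is only a minimal sketch, assuming a Linux system with /proc mounted; the helper name `show_msg_limits` and its optional directory argument are hypothetical:

```shell
#!/bin/sh
# show_msg_limits [DIR]
# Hypothetical helper: print the message queue limits the kernel is using now.
# Reads the files under /proc/sys/kernel by default, so it reflects the live
# values, not what is written in /etc/sysctl.conf.
show_msg_limits() {
    dir=${1:-/proc/sys/kernel}
    for name in msgmax msgmnb; do
        printf 'kernel.%s = %s\n' "$name" "$(cat "$dir/$name")"
    done
}

show_msg_limits
```

If the numbers printed here do not match the file, the settings were never loaded, and running /sbin/sysctl -p (or rebooting) should bring them in line.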

Re: ndo2db Errors.

Posted: Fri Jun 08, 2012 11:44 am
by krw
I checked this yesterday at about 3pm and those parameters were already set as above.
I ran sysctl -p anyway, and after about 30 minutes of tailing the messages log I saw that those
errors had gone away.
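
One possible explanation is that the values were in the file but had not been loaded into the running kernel. A sketch of how that could be checked, assuming plain `key = value` lines in /etc/sysctl.conf; the function names `conf_value` and `compare_limit` are invented for this example:

```shell
#!/bin/sh
# conf_value FILE KEY
# Print the value assigned to KEY in a sysctl-style file.
# If KEY appears more than once, the last assignment is printed.
conf_value() {
    awk -F= -v want="$2" '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == want { val = $2; gsub(/[ \t]/, "", val); found = val }
        END { print found }
    ' "$1"
}

# compare_limit NAME [CONF] [PROCDIR]
# Compare kernel.NAME in the config file with the value the kernel is using.
compare_limit() {
    key=$1
    conf=${2:-/etc/sysctl.conf}
    procdir=${3:-/proc/sys/kernel}
    wanted=$(conf_value "$conf" "kernel.$key")
    live=$(cat "$procdir/$key")
    if [ "$wanted" = "$live" ]; then
        echo "kernel.$key: ok ($live)"
    else
        echo "kernel.$key: file says '$wanted', kernel is using '$live'"
    fi
}
```

Calling `compare_limit msgmax` and `compare_limit msgmnb` before and after sysctl -p would show whether the reload changed anything.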

Checking this morning I don't see those log error messages any longer.

More RAM was added to the system the day before I noticed these log messages showing up.
Would that have anything to do with it? I don't see how it would, though.

I've been running BSD-type machines 99% of the time for the last ten+ years, so I'm still 'learning' Linux and
the differences between the two.

Thanks.

Re: ndo2db Errors.

Posted: Fri Jun 08, 2012 1:51 pm
by mguthrie
Go ahead and keep an eye on that log. If you see any of those errors reappearing, we'll dive into this further.

Re: ndo2db Errors.

Posted: Thu Jun 14, 2012 10:09 am
by srosenst
I was having the same issue and made the suggested changes. It took a restart of ndo2db for the messages to stop (I also restarted nagios, nagiosxi, npcd and httpd for good measure). Thanks!