ndo2db Errors.

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
krw
Posts: 71
Joined: Tue May 29, 2012 2:01 pm

ndo2db Errors.

Post by krw »

I am seeing a lot of errors in /var/log/messages:

Jun 4 22:34:52 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:00 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:01 lonagiosxi nagios: ndomod: Error writing to data sink! Some output may get lost...
Jun 4 22:35:01 lonagiosxi nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters
Jun 4 22:35:05 lonagiosxi nagios: Caught SIGTERM, shutting down...
Jun 4 22:35:05 lonagiosxi nagios: Successfully shutdown... (PID=7097)
Jun 4 22:35:05 lonagiosxi nagios: ndomod: Shutdown complete.
Jun 4 22:35:05 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jun 4 22:35:06 lonagiosxi nagios: Nagios 3.4.1 starting... (PID=10664)
Jun 4 22:35:06 lonagiosxi nagios: Local time is Mon Jun 04 22:35:06 PDT 2012
Jun 4 22:35:06 lonagiosxi nagios: LOG VERSION: 2.0
Jun 4 22:35:06 lonagiosxi nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully connected to data sink. 69 queued items to flush.
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully flushed 69 queued items to data sink.

Jun 4 22:35:07 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:07 lonagiosxi nagios: Finished daemonizing... (New PID=10669)
Jun 4 22:35:08 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:08 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 4 22:35:09 lonagiosxi ndo2db: Message sent to queue

Jun 4 22:35:28 lonagiosxi ndo2db: Message sent to queue
Jun 4 22:35:28 lonagiosxi nagios: Caught SIGTERM, shutting down...
Jun 4 22:35:28 lonagiosxi nagios: Successfully shutdown... (PID=10669)
Jun 4 22:35:28 lonagiosxi nagios: ndomod: Shutdown complete.
Jun 4 22:35:29 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jun 4 22:35:29 lonagiosxi nagios: Nagios 3.4.1 starting... (PID=10986)
Jun 4 22:35:29 lonagiosxi nagios: Local time is Mon Jun 04 22:35:29 PDT 2012
Jun 4 22:35:29 lonagiosxi nagios: LOG VERSION: 2.0
Jun 4 22:35:29 lonagiosxi nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jun 4 22:35:29 lonagiosxi nagios: ndomod: Successfully connected to data sink. 0 queued items to flush.
Jun 4 22:35:29 lonagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.

Edit: Added more from log file:

Jun 6 13:51:02 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 13:51:03 lonagiosxi ndo2db: Message sent to queue
Jun 6 13:51:03 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 13:51:04 lonagiosxi ndo2db: Message sent to queue
Jun 6 13:51:05 lonagiosxi xinetd[29328]: FAIL: nrpe address from=10.2.1.116
Jun 6 13:51:05 lonagiosxi xinetd[2525]: START: nrpe pid=29328 from=10.2.1.116
Jun 6 13:51:05 lonagiosxi xinetd[2525]: EXIT: nrpe status=0 pid=29328 duration=0(sec)
Jun 6 13:51:21 lonagiosxi ndo2db: Error: queue send error, retrying...

I'm seeing a lot of queue send errors, apparently from ndo2db.

I have searched the forums and see some mentions of this problem, but I'm not familiar enough
with Nagios yet to know what to look for. I have repaired the nagios database in MySQL, and shut down
and restarted nagios and ndo2db, but the errors still appear.

Any clue here?

Thanks.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: ndo2db Errors.

Post by scottwilkerson »

I believe this was one of the items that was fixed in the new 2011R3.0 release.

http://library.nagios.com/library/produ ... -nagios-xi
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
krw
Posts: 71
Joined: Tue May 29, 2012 2:01 pm

Re: ndo2db Errors.

Post by krw »

We are on the latest:

Nagios XI 2011R3.0 Copyright © 2008-2012 Nagios Enterprises, LLC.

Clicking on Check for Updates:

Up To Date

Your installation of Nagios XI (2011R3.0) is up-to-date, so no upgrade is required. The latest version of Nagios XI is 2011R3.0, which was released on 2012-06-04.

Thanks,
Keith
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: ndo2db Errors.

Post by scottwilkerson »

Actually, looking at the logs, you seem to be getting the "Error writing to data sink" message only right when Nagios is restarting. That is normal behavior; once it is up and running you get the following:

Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully connected to data sink. 69 queued items to flush.
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully flushed 69 queued items to data sink.
This is all expected behavior.

If you were getting a lot of data sink errors without a "Successfully connected" message afterwards, that would be something to worry about.
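
As a rough way to apply that rule of thumb, the sketch below reports the most recent ndomod data sink state recorded in a syslog file. The function name `sink_state` is hypothetical; the patterns are taken from the log lines quoted in this thread, and the wording may differ in other NDOUtils versions.

```shell
#!/bin/sh
# Report the last ndomod data sink state seen in a syslog-format file.
# Prints "error", "connected", or "unknown" (no matching lines found).
# Usage: sink_state /var/log/messages
sink_state() {
    awk '
        /ndomod: Error writing to data sink/               { state = "error" }
        /ndomod: Successfully (re)?connected to data sink/ { state = "connected" }
        END {
            if (state == "") state = "unknown"
            print state
        }
    ' "$1"
}
```

If this prints "error" long after a restart has finished, the write failure was not followed by a successful reconnect, which is the case described above as worth investigating.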
krw
Posts: 71
Joined: Tue May 29, 2012 2:01 pm

Re: ndo2db Errors.

Post by krw »

scottwilkerson wrote: Actually, looking at the logs, you seem to be getting the "Error writing to data sink" message only right when Nagios is restarting. That is normal behavior; once it is up and running you get the following:

Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully connected to data sink. 69 queued items to flush.
Jun 4 22:35:06 lonagiosxi nagios: ndomod: Successfully flushed 69 queued items to data sink.
This is all expected behavior.

If you were getting a lot of data sink errors without a "Successfully connected" message afterwards, that would be something to worry about.

tail -5 /var/log/messages:

Jun 6 14:49:28 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:49:47 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:49:48 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:50:17 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:50:18 lonagiosxi ndo2db: Message sent to queue

They just keep coming. I can tail -f /var/log/messages and watch these errors being appended.
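
To put a number on "they just keep coming", a small sketch like the following could tally the ndo2db queue send errors per hour. The function name `queue_errors_by_hour` is made up for illustration, and it assumes the standard syslog layout shown above (month, day, time as the first three fields).

```shell
#!/bin/sh
# Count ndo2db "queue send error" lines per hour in a syslog-format file.
# Output: one line per hour, e.g. "<month> <day> <HH>:00 <count>".
# Note: the final sort is lexical, so months are not in calendar order.
# Usage: queue_errors_by_hour /var/log/messages
queue_errors_by_hour() {
    grep 'ndo2db: Error: queue send error' "$1" |
        awk '{
                 split($3, t, ":")                  # t[1] is the hour
                 n[$1 " " $2 " " t[1] ":00"]++
             }
             END { for (k in n) print k, n[k] }' |
        sort
}
```

Running it before and after a configuration change gives a simple way to confirm whether the error rate actually dropped, rather than judging from a live tail.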

Then, when I try to stop ndo2db, I sometimes see this error:

service ndo2db stop
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done
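
The `head` error suggests the init script reads the daemon's PID from the lock file. Assuming that is the case (the thread does not show the script itself), a minimal sketch for checking the lock file by hand might look like this. The function name `lock_state` is hypothetical.

```shell
#!/bin/sh
# Describe the state of a PID lock file such as ndo2db.lock.
# Assumption: the first line of the file holds the daemon's PID.
# Run as root, since kill -0 fails for processes owned by other users.
# Usage: lock_state [LOCKFILE]
lock_state() {
    lock=${1:-/usr/local/nagios/var/ndo2db.lock}
    if [ ! -r "$lock" ]; then
        echo "no lock file"
        return 1
    fi
    pid=$(head -n 1 "$lock")
    # kill -0 sends no signal; it only tests whether the PID exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running (PID $pid)"
    else
        echo "stale lock (PID $pid)"
        return 2
    fi
}
```

A "no lock file" result while ndo2db is still listed by `ps` would match the symptom above, where the stop command cannot find the PID to signal.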

When a stop does happen:

Jun 6 14:51:28 lonagiosxi nagios: ndomod: Error writing to data sink! Some output may get lost...
Jun 6 14:51:28 lonagiosxi nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters

But then after a restart:

Jun 6 14:54:59 lonagiosxi nagios: ndomod: Successfully reconnected to data sink! 0 items lost, 323 queued items to flush.
Jun 6 14:54:59 lonagiosxi ndo2db: Error: queue send error, retrying...
Jun 6 14:55:00 lonagiosxi ndo2db: Message sent to queue
Jun 6 14:55:00 lonagiosxi nagios: ndomod: Successfully flushed 323 queued items to data sink.

But the only way things get flushed is by stopping and starting ndo2db.

Otherwise I just see the queue send errors pop up.

Thanks.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: ndo2db Errors.

Post by scottwilkerson »

You will need to increase the kernel message queue parameters for your system. I do not know the optimal values for your system, but in my case I increased them substantially (in /etc/sysctl.conf) to:

kernel.msgmax = 131072000
kernel.msgmnb = 131072000


After updating this in the conf file, run:

/sbin/sysctl -p
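
Editing the file does not change the running kernel until the values are loaded, so it can help to compare the two. The following is a sketch only: `conf_value` and `check_msgq` are hypothetical helper names, and the sketch assumes a Linux system where the live values are exposed under /proc/sys/kernel.

```shell
#!/bin/sh
# Print the value configured for KEY in a sysctl.conf-style file.
# Usage: conf_value KEY FILE
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }                       # skip comment lines
        { k = $1; gsub(/[ \t]/, "", k) }             # strip spaces from key
        k == key { v = $2; gsub(/[ \t]/, "", v); val = v }
        END { print val }
    ' "$2"
}

# Compare configured and running values of kernel.msgmax and kernel.msgmnb.
# Usage: check_msgq [CONF_FILE] [PROC_DIR]
check_msgq() {
    conf=${1:-/etc/sysctl.conf}
    proc=${2:-/proc/sys/kernel}
    for name in msgmax msgmnb; do
        want=$(conf_value "kernel.$name" "$conf")
        have=$(cat "$proc/$name")
        if [ "$want" = "$have" ]; then
            echo "kernel.$name ok ($have)"
        else
            echo "kernel.$name differs: configured=${want:-unset} running=$have"
        fi
    done
}
```

Calling `check_msgq` with no arguments uses the default paths. A "differs" line means the file and the running kernel disagree, in which case loading the file with /sbin/sysctl -p should bring them in line.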
krw
Posts: 71
Joined: Tue May 29, 2012 2:01 pm

Re: ndo2db Errors.

Post by krw »

scottwilkerson wrote: You will need to increase the kernel message queue parameters for your system. I do not know the optimal values for your system, but in my case I increased them substantially (in /etc/sysctl.conf) to:

kernel.msgmax = 131072000
kernel.msgmnb = 131072000


After updating this in the conf file, run:

/sbin/sysctl -p

I checked this yesterday at about 3pm and those parameters were already set as above.
I ran sysctl -p anyway, and after about 30 minutes of tailing the messages log I saw that those
errors had gone away.

Checking this morning I don't see those log error messages any longer.

More RAM was added to the system the day before I noticed these log messages showing up.
Would that have anything to do with it? I don't see how it would, though.

I've been running BSD-type machines 99% of the time for the last ten-plus years, so I'm still learning Linux and
the differences between the two.

Thanks.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: ndo2db Errors.

Post by mguthrie »

Go ahead and keep an eye on that log. If you see any of those errors reappearing, we'll dive into this further.
srosenst
Posts: 3
Joined: Thu Feb 02, 2012 5:32 pm

Re: ndo2db Errors.

Post by srosenst »

I was having the same issue and made the suggested changes. It took a restart of ndo2db for the messages to stop (I also restarted nagios, nagiosxi, npcd and httpd for good measure). Thanks!