NDO2DB errors in messages (queue recv error)

CFT6Server · Post by **CFT6Server** » Thu Jul 09, 2015 12:10 pm

I was checking the error logs and found a lot of these entries:

Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.
Jul  9 10:08:55 kdcnagxi01 ndo2db: Error: queue recv error.

Looking at top, ndo2db is pretty much pinned at over 90% CPU. What are these errors, and should I be concerned?

tmcdonald · Post by **tmcdonald** » Thu Jul 09, 2015 12:25 pm

What ndo and core versions are you running?

Code: Select all

/usr/local/nagios/bin/ndo2db --version
/usr/local/nagios/bin/nagios --version

CFT6Server · Post by **CFT6Server** » Thu Jul 09, 2015 1:31 pm

Code: Select all

NDO2DB 2.0.0
Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Copyright (c) 2005-2008 Ethan Galstad
Last Modified: 02-28-2014
License: GPL v2

Code: Select all

Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL

tmcdonald · Post by **tmcdonald** » Thu Jul 09, 2015 2:13 pm

Pretty standard. Can you run the following please and post the output?

Code: Select all

ipcs -q

In addition, have you noticed any issues with your XI server aside from these messages? Notifications not being sent, check statuses not being updated, etc?

CFT6Server · Post by **CFT6Server** » Thu Jul 09, 2015 3:13 pm

Code: Select all

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x6d000002 7929856    nagios     600        0            0

Not that I know of... I checked around because ndo2db is using most of the CPU:

Code: Select all

Cpu(s): 33.1%us, 15.8%sy,  0.0%ni, 49.1%id,  1.7%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   5992380k total,  4889040k used,  1103340k free,    54108k buffers
Swap:  2064380k total,   153756k used,  1910624k free,  1463940k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29570 nagios    20   0 55676 2980  868 R 93.3  0.0   9318:08 ndo2db
 1452 root      20   0  243m 5520  816 S 83.7  0.1   8350:52 rsyslogd

tmcdonald · Post by **tmcdonald** » Thu Jul 09, 2015 3:21 pm

Hmmm. We've seen similar behavior but usually in those cases the kernel message queue fills up. Yours looks to be processing just fine.
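For anyone who wants to check the "queue fills up" case on their own box, here is a minimal sketch that totals the used bytes and pending messages from `ipcs -q` output. The function name `summarize_queues` is made up for illustration, and it assumes the six-column layout shown in the output above, where data rows start with a hex key.

```shell
# Summarize `ipcs -q` output read from stdin: queue count, total bytes, total messages.
# Assumption: data rows are the ones whose first field (the key) starts with "0x",
# with used-bytes in column 5 and messages in column 6.
summarize_queues() {
    awk '$1 ~ /^0x/ { q++; bytes += $5; msgs += $6 }
         END { printf "queues=%d used_bytes=%d messages=%d\n", q, bytes, msgs }'
}

# Usage on a live system, with the kernel limits printed for comparison:
#   ipcs -q | summarize_queues
#   sysctl kernel.msgmnb kernel.msgmni kernel.msgmax
```

If used_bytes sits close to kernel.msgmnb for long stretches, the queue is probably backing up; zeros like the ones posted above suggest it is draining normally.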

What sort of hardware do you have? It's possible that if you are in a VM, you are hitting an issue where having too many cores/CPUs actually is detrimental, because the hypervisor has to wait longer to decide which VMs can get a slice of compute time:

https://lonesysadmin.net/2008/04/22/why ... m-is-slow/

CFT6Server · Post by **CFT6Server** » Thu Jul 09, 2015 3:30 pm

This is in a VM environment, but so far I think the resources are set correctly for what's needed. I started with 4 vCPUs and they seemed to be used quite a bit. We are still populating the instance, so I've added 2 additional cores since then. This is part of a bigger environment, and resourcing and CPUs seem to be fine. We have hosts with a total of 16 cores (2 sockets x 8 cores) per host, so I do not think this would be a contention issue at this point.

I posted the resource usage in this thread: https://support.nagios.com/forum/viewto ... 16&t=33673

I just wanted to make sure that these error messages aren't causing any harm or having potential impacts. Thanks.

tmcdonald · Post by **tmcdonald** » Thu Jul 09, 2015 3:34 pm

Can you grep your syslog for "mysql_query() failed" please? There might be some failed db inserts/updates/deletes that are causing the message.

CFT6Server · Post by **CFT6Server** » Thu Jul 09, 2015 3:56 pm

Code: Select all

# cat /var/log/messages | grep "mysql_query() failed"
Jul  6 10:34:22 <Server> ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''

Jul  7 09:57:15 <Server> ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''

Jul  9 10:17:52 <Server> ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0''

jdalrymple · Post by **jdalrymple** » Fri Jul 10, 2015 10:57 am

CFT6Server wrote: Looking at top, ndo2db is pretty much pinned at over 90% CPU. What are these errors, and should I be concerned?

The great news is that nothing seems to be broken, but yes, to me this is cause for concern. Was the string of errors you blatted out isolated, or is ndo2db spraying that error nonstop?
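One rough way to answer that is to count the errors per minute. The sketch below is only an illustration: the function name `count_recv_errors_per_minute` is made up, and it assumes the traditional syslog timestamp layout ("Mon DD HH:MM:SS host ...") seen in the lines posted earlier.

```shell
# Count "queue recv error" lines per minute from syslog text on stdin.
# Assumption: fields 1-3 are month, day and HH:MM:SS, as in classic syslog output.
count_recv_errors_per_minute() {
    awk 'index($0, "ndo2db: Error: queue recv error") {
             key = $1 " " $2 " " substr($3, 1, 5)   # e.g. "Jul 9 10:08"
             if (!(key in count)) order[++n] = key   # remember first-seen order
             count[key]++
         }
         END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }'
}

# Usage on a live system:
#   count_recv_errors_per_minute < /var/log/messages
```

A handful of minutes with counts would point to an isolated burst; a row for nearly every minute would mean the error is being logged more or less continuously.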

Also, is your MySQL offloaded? I kind of hope not, so that we can tell you to DO IT and hopefully it will solve all your problems. I'm afraid you're going to tell me it already is, though.

Nagios Support Forum
