Issues with ndo2db
Hi team,
We are seeing issues with ndo2db. Below are the messages from the messages file.
We need immediate assistance with this issue.
Warning: Retrying message send. This can occur because you have too few messages allowed or too few total bytes allowed in message queues. You are currently using 256000 of 5120000 messages and 262144000 of 262144000 bytes in the queue. See README for kernel tuning options.
Oct 4 11:13:46 nagmonus1 ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may need to be tuned. See README.
Oct 4 11:13:46 nagmonus1 ndo2db: Warning: queue send error, retrying...
Oct 4 11:14:06 nagmonus1 ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may need to be tuned. See README.
Oct 4 11:14:06 nagmonus1 ndo2db: Warning: queue send error, retrying...
Oct 4 11:14:26 nagmonus1 ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may need to be tuned. See README.
Oct 4 11:14:26 nagmonus1 ndo2db: Warning: queue send error, retrying...
# cat /etc/sysctl.conf | grep -v ^#
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 262144000
kernel.msgmax = 2621440000
kernel.shmmax = 42949672950
kernel.shmall = 2684354560
kernel.msgmni = 5120000
net.core.somaxconn = 40960
# ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xffffffff 2359296 nagios 600 262144000 256000
# ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 41943039
max total shared memory (kbytes) = 10737418240
min seg size (bytes) = 1
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 5120000
max size of message (bytes) = -1673527296
default max size of queue (bytes) = 262144000
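A note on the ipcs -l output above: the negative "max size of message" value is consistent with kernel.msgmax = 2621440000 being printed as a signed 32-bit integer, since that setting is larger than 2147483647. The sketch below only reproduces that arithmetic. The function name wrap_int32 is made up for illustration, and it assumes a shell with 64-bit arithmetic such as bash. It does not show whether the kernel accepted the value.

```bash
#!/usr/bin/env bash
# wrap_int32 is a hypothetical helper, not part of ipcs or NDOUtils.
# It reinterprets a non-negative integer as a signed 32-bit value.
wrap_int32() {
    local v=$1
    echo $(( (v + 2147483648) % 4294967296 - 2147483648 ))
}

wrap_int32 2621440000   # the kernel.msgmax value from sysctl.conf
wrap_int32 262144000    # the kernel.msgmnb value, which fits in 32 bits
```

The first call prints -1673527296, the same number ipcs -l reports, and the second prints 262144000 unchanged.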
-
dwasswa
Re: Issues with ndo2db
Hi @boscorp,
NDOUtils uses the operating system's kernel message queue. As the number of messages increases, the kernel settings need to be tuned to allow more messages to be queued and processed.
First identify the values you are currently using:
Code:
grep 'kernel.msgmnb' /etc/sysctl.conf
grep 'kernel.msgmax' /etc/sysctl.conf
grep 'kernel.msgmni' /etc/sysctl.conf
The following output is produced (or similar):
Code:
kernel.msgmnb = 131072000
kernel.msgmax = 131072000
kernel.msgmni = 256000
If a setting is not already defined, its grep command displays no output and the setting will need to be added to the /etc/sysctl.conf file.
For msgmnb and msgmax the same value should be used for both. Recommended values are:
Code:
131072000
262144000
For msgmni the recommended value is:
Code:
512000
Values higher than these may not be the solution to your problem unless you have a high performance server.
For msgmnb and msgmax the following commands will update /etc/sysctl.conf with increased values. This example will increase them to 262144000.
Code:
sed -i 's/^kernel\.msgmnb.*/kernel\.msgmnb = 262144000/g' /etc/sysctl.conf
sed -i 's/^kernel\.msgmax.*/kernel\.msgmax = 262144000/g' /etc/sysctl.conf
The following commands are for the msgmni option. For the grep command you executed previously:
If it did not return output, this command will add the setting to the /etc/sysctl.conf file:
Code:
echo 'kernel.msgmni = 512000' >> /etc/sysctl.conf
If it did return output, this command will update the setting in the /etc/sysctl.conf file:
Code:
sed -i 's/^kernel\.msgmni.*/kernel\.msgmni = 512000/g' /etc/sysctl.conf
After making those changes, execute the following command:
Code:
sysctl -p
The following output is produced (or similar):
Code:
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000
You can see the increased values have been applied to the kernel.
Finally execute the following commands:
Code:
service nagios stop
service ndo2db restart
service nagios start
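The separate "add it if missing, update it if present" steps for /etc/sysctl.conf described earlier can be folded into one helper. This is only a sketch under a few assumptions: the function name set_sysctl_key is made up, it relies on GNU sed for the -i option (as the sed commands in this post do), and it expects the file to end with a newline. It takes the file path as an argument so it can be tried on a copy first.

```bash
#!/usr/bin/env bash
# set_sysctl_key is a hypothetical helper, not part of NDOUtils or sysctl.
# Usage: set_sysctl_key FILE KEY VALUE
# Updates the "KEY = ..." line in FILE if one exists, otherwise appends it.
set_sysctl_key() {
    local file=$1 key=$2 value=$3
    local pattern
    # Escape the dots so "kernel.msgmni" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}[[:space:]]*=" "$file"; then
        sed -i "s/^${pattern}[[:space:]]*=.*/${key} = ${value}/" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root, then apply with: sysctl -p):
#   set_sysctl_key /etc/sysctl.conf kernel.msgmnb 262144000
#   set_sysctl_key /etc/sysctl.conf kernel.msgmax 262144000
#   set_sysctl_key /etc/sysctl.conf kernel.msgmni 512000
```

Running it twice with the same arguments leaves a single line for the key, so it is safe to repeat.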
Once you have completed these steps you should check the message queues by executing the following command:
Code:
ipcs -q
If you see more than one queue for the user nagios execute the following command to clear the queues:
Code:
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
You can then watch the queues for 10-15 minutes to ensure they are being processed:
Code:
watch ipcs -q
To stop watching the queues press Ctrl + C on the keyboard.
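While watching the queues it can help to see how full each one is relative to the byte limit. The sketch below is one way to do that. The function name queue_fill is made up, and it assumes the column layout of ipcs -q shown in this thread (key, msqid, owner, perms, used-bytes, messages).

```bash
#!/usr/bin/env bash
# queue_fill is a hypothetical helper. It reads `ipcs -q` output on stdin
# and prints "msqid owner percent" for each queue, where percent is
# used-bytes relative to the byte limit passed as the first argument.
queue_fill() {
    local limit=$1
    awk -v limit="$limit" '
        # Data rows start with a hex key such as 0xffffffff.
        $1 ~ /^0x/ && limit > 0 {
            printf "%s %s %d\n", $2, $3, ($5 * 100) / limit
        }'
}

# Example using the queue line posted earlier in this thread:
queue_fill 262144000 <<'EOF'
------ Message Queues --------
key msqid owner perms used-bytes messages
0xffffffff 2359296 nagios 600 262144000 256000
EOF
```

For that sample it prints "2359296 nagios 100", meaning the queue is completely full. On a live system the limit can be read from the kernel, for example: ipcs -q | queue_fill "$(cat /proc/sys/kernel/msgmnb)"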
Please let us know if you have any questions
Re: Issues with ndo2db
We have implemented the changes, and the output is below. We have done this in the past as well, but we still faced issues.
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000
net.core.somaxconn = 40960
-
dwasswa
Re: Issues with ndo2db
The problem may be related to MySQL / MariaDB. Make sure that the DB server has enough CPU and memory resources.
If the DB server is on the same server as the Nagios server you should look at offloading the DB to a dedicated server.
Re: Issues with ndo2db
Earlier we were hosting the DB on a dedicated server. On the recommendation of one of the Nagios support team members, we moved it back to the local server.
The server is not running out of CPU or memory. Can you suggest what other options we have here?
Re: Issues with ndo2db
Edit the sysctl.conf file and double the following entries by changing them from
Code:
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
to
Code:
kernel.msgmnb = 524288000
kernel.msgmax = 524288000
Run the following to activate the changes.
Code:
sysctl -p
Then run the following to stop the processes and clear out the message queue.
Code:
service nagios stop
service ndo2db stop
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
service ndo2db start
service nagios start
See if that stops the message queue from filling up.
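One caveat with the clearing loop: grep nagios matches the text anywhere on the line, so it would also pick up a queue owned by a user whose name merely contains "nagios". A stricter variant compares the owner column exactly. This is a sketch, not the supported procedure. The function name queue_ids_for is made up, and it assumes the ipcs -q column layout shown earlier in this thread.

```bash
#!/usr/bin/env bash
# queue_ids_for is a hypothetical helper. It reads `ipcs -q` output on
# stdin and prints the msqid of every queue whose owner column equals
# the given user exactly.
queue_ids_for() {
    local user=$1
    awk -v user="$user" '$1 ~ /^0x/ && $3 == user { print $2 }'
}

# Example using the queue line posted earlier in this thread:
queue_ids_for nagios <<'EOF'
------ Message Queues --------
key msqid owner perms used-bytes messages
0xffffffff 2359296 nagios 600 262144000 256000
EOF
```

For that sample it prints 2359296. With nagios and ndo2db both stopped, the IDs could then be removed with: ipcs -q | queue_ids_for nagios | while read -r id; do ipcrm -q "$id"; done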