
NDOUtils - Message Queue Exceeded

Problem Description

In Nagios you experience the following symptoms:

  • Hosts, services, or status data are missing
  • The Nagios process takes a very long time to restart
  • CPU load is unusually high
  • /var/log/messages is flooded with ndo2db messages such as:
    ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may neeed to be tuned. See README.
    ndo2db: Warning: queue send error, retrying...

 

In addition to this, you may see multiple queues for the nagios user when executing the following command:

ipcs -q

 

The following output is produced:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0xee070002 1409024    nagios     600        100672512    98313
0x50070002 1441793    nagios     600        0            0
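
As a rough aid for reading this output, the sketch below totals the queues, bytes, and messages for one owner. It assumes the column order shown above (owner in column 3, used-bytes in column 5, messages in column 6); the function name summarize_queues is hypothetical and not part of NDOUtils.

```shell
#!/bin/sh
# Summarise the message queues owned by one user.
# Reads `ipcs -q` output on stdin; assumes the column order
# key msqid owner perms used-bytes messages (as in the sample above).
summarize_queues() {
    owner="${1:-nagios}"
    awk -v owner="$owner" '
        $3 == owner { queues++; bytes += $5; msgs += $6 }
        END { printf "queues=%d bytes=%d messages=%d\n", queues, bytes, msgs }
    '
}

# Usage (hypothetical helper, run on the Nagios server):
#   ipcs -q | summarize_queues nagios
```

More than one queue, or a message count that keeps growing, matches the symptoms described in this article.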

 

 

Explanation

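NDOUtils uses the operating system kernel message queue. As the number of messages increases, the kernel settings need to be tuned to allow more messages to be queued and processed. Three settings are involved: kernel.msgmnb is the maximum number of bytes a single queue can hold, kernel.msgmax is the maximum size of a single message, and kernel.msgmni is the maximum number of message queues on the system.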

 

 

Resolving The Problem

First identify the values you are currently using:

grep 'kernel.msgmnb' /etc/sysctl.conf
grep 'kernel.msgmax' /etc/sysctl.conf
grep 'kernel.msgmni' /etc/sysctl.conf

 

The following output is produced (or similar):

kernel.msgmnb = 131072000
kernel.msgmax = 131072000
kernel.msgmni = 256000

If a setting is not already defined, its grep command displays no output, and the setting will need to be added to the /etc/sysctl.conf file.
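
The grep commands only show what is written in /etc/sysctl.conf. The values the kernel is actually using can be read from /proc/sys/kernel, as in this minimal sketch. The function name show_running and its optional directory argument are hypothetical and exist only to make the example easy to try out.

```shell
#!/bin/sh
# Print the running values of the three message queue settings.
# The optional argument overrides the directory to read from
# (default: /proc/sys/kernel on Linux).
show_running() {
    procdir="${1:-/proc/sys/kernel}"
    for name in msgmnb msgmax msgmni; do
        if [ -r "$procdir/$name" ]; then
            printf 'kernel.%s = %s\n' "$name" "$(cat "$procdir/$name")"
        else
            printf 'kernel.%s = (unavailable)\n' "$name"
        fi
    done
}

# Usage:
#   show_running
```

The same values can also be queried with sysctl, for example sysctl kernel.msgmnb.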

 

Use the same value for both msgmnb and msgmax. Recommended values are:

  • 131072000
  • 262144000

 

For msgmni the recommended value is:

  • 512000

 

Values higher than these are unlikely to solve your problem unless you have a high-performance server.
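
To put the recommended msgmnb values in perspective, the sketch below estimates how many messages fit in a single queue. The 1024-byte message size is an assumption derived from the sample ipcs output in this article (100672512 used-bytes divided by 98313 messages); it is not a documented NDOUtils constant, and queue_capacity is a hypothetical helper.

```shell
#!/bin/sh
# Rough capacity estimate: how many messages fit in one queue.
queue_capacity() {
    msgmnb="$1"      # maximum bytes per queue (kernel.msgmnb)
    msg_size="$2"    # assumed bytes per message
    echo $(( msgmnb / msg_size ))
}

# Average message size in the sample ipcs output: used-bytes / messages
echo $(( 100672512 / 98313 ))        # 1024

queue_capacity 131072000 1024        # 128000
queue_capacity 262144000 1024        # 256000
```

Under that assumption, doubling msgmnb from 131072000 to 262144000 doubles the backlog a queue can absorb from 128000 to 256000 messages before sends start failing.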

 

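For msgmnb and msgmax, the following commands update the existing entries in /etc/sysctl.conf with increased values. This example increases them to 262144000. These commands only change lines that already exist; if the earlier grep commands returned no output for these settings, the settings must be added to the file instead.

sed -i 's/^kernel\.msgmnb.*/kernel.msgmnb = 262144000/' /etc/sysctl.conf
sed -i 's/^kernel\.msgmax.*/kernel.msgmax = 262144000/' /etc/sysctl.conf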

 

The following commands are for the msgmni setting. Which one to use depends on the result of the grep command you executed previously:

If it did not return output, this command will add the setting to the /etc/sysctl.conf file:

echo 'kernel.msgmni = 512000' >> /etc/sysctl.conf

If it did return output, this command will update the setting in the /etc/sysctl.conf file:

sed -i 's/^kernel\.msgmni.*/kernel\.msgmni = 512000/g' /etc/sysctl.conf
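
The update-or-append decision can also be wrapped in one small helper. This is a sketch, not part of NDOUtils: the function name set_sysctl and its optional file argument are hypothetical, and it assumes GNU sed (for sed -i) and a file that ends with a newline.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style file: update the line if the key
# is already defined, otherwise append it.
set_sysctl() {
    key="$1"; value="$2"; file="${3:-/etc/sysctl.conf}"
    # Escape the dots so they match literally in the regular expression.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}[[:space:]]*=" "$file"; then
        sed -i "s/^${pattern}[[:space:]]*=.*/${key} = ${value}/" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   set_sysctl kernel.msgmnb 262144000
#   set_sysctl kernel.msgmax 262144000
#   set_sysctl kernel.msgmni 512000
```

Trying it against a copy of the file first (by passing the copy as the third argument) is a cheap way to confirm the result before touching /etc/sysctl.conf.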

 

 

After making those changes, execute the following command:

sysctl -p

 

The following output is produced (or similar):

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000

You can see the increased values have been applied to the kernel.
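
For a scripted confirmation instead of reading the output by eye, a check along these lines compares one running value with the value you expect. The function name check_sysctl and its optional root argument are hypothetical; by default it reads from /proc/sys.

```shell
#!/bin/sh
# Compare the running value of one setting with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the value cannot be read.
check_sysctl() {
    key="$1"; expected="$2"; root="${3:-/proc/sys}"
    # kernel.msgmnb -> /proc/sys/kernel/msgmnb
    path="$root/$(printf '%s' "$key" | tr '.' '/')"
    actual=$(cat "$path" 2>/dev/null) || { echo "$key: cannot read $path"; return 2; }
    if [ "$actual" = "$expected" ]; then
        echo "$key: OK ($actual)"
    else
        echo "$key: expected $expected, running $actual"
        return 1
    fi
}

# Usage:
#   check_sysctl kernel.msgmnb 262144000
#   check_sysctl kernel.msgmax 262144000
#   check_sysctl kernel.msgmni 512000
```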

 

Restart the services using the commands for your distribution:

RHEL 6 | CentOS 6 | Oracle Linux 6 | Ubuntu 14

service nagios stop
service ndo2db restart
service nagios start

 

RHEL 7 | CentOS 7 | Oracle Linux 7 | Debian | Ubuntu 16/18

systemctl stop nagios.service
systemctl restart ndo2db.service
systemctl start nagios.service

 

Once you have completed these steps you should check the message queues by executing the following command:

ipcs -q

 

If you see more than one queue for the user nagios, execute the following command to clear the queues:

for i in $(ipcs -q | awk '$3 == "nagios" {print $2}'); do ipcrm -q "$i"; done
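
Because ipcrm deletes a queue together with any messages still in it, it can be worth listing the queue IDs first. The sketch below only prints the IDs; it assumes the ipcs -q column order shown earlier (msqid in column 2, owner in column 3), and queue_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Print the msqid of every queue whose owner column matches exactly.
# Reads `ipcs -q` output on stdin.
queue_ids() {
    owner="${1:-nagios}"
    awk -v owner="$owner" '$3 == owner { print $2 }'
}

# Dry run: show what would be removed without removing anything.
#   ipcs -q | queue_ids nagios | while read -r id; do
#       echo "would run: ipcrm -q $id"
#   done
```

Matching on the owner column, rather than on any line containing the text nagios, avoids picking up unrelated queues.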

 

You can then watch the queues for 10-15 minutes to ensure they are being processed:

watch ipcs -q

 

To stop watching the queues press Ctrl + C on the keyboard.

 

 

Other Recommendations

If you find the message queue is not being processed quickly enough, the problem may be related to MySQL / MariaDB. Make sure that the DB server has enough CPU and memory resources. If the database runs on the same machine as Nagios, consider offloading it to a dedicated server.

 

 

Final Thoughts

For any support related questions please visit the Nagios Support Forums at:

http://support.nagios.com/forum/


