ndo2DB errors and IPC

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
vazudevan
Posts: 36
Joined: Fri Oct 21, 2016 4:52 am

ndo2DB errors and IPC

Post by vazudevan »

Hey,

We frequently see ndo2db errors in /var/log/messages:

Code: Select all

ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may neeed to be tuned. See README.
The IPC queue is full at this point:

Code: Select all

ipcs -q

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    
0x97010080 1212416    nagios     600        262144000    256000    
Here are the IPC settings in /etc/sysctl.conf:

Code: Select all

kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000
Things return to normal when we clear the IPC queues with the loop

for i in `ipcs -q | grep nagios | awk '{print $2}'`; do ipcrm -q $i; done

and then stop and start nagios and ndo2db.

This recurs about every 6 hours. How do we handle it?
FYI: Our setup is federated, with 3603 hosts and 15053 services, all passive. MariaDB is hosted on a separate server, and there is no load, CPU, or memory contention on the DB server.
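
The clearing loop above selects queues with grep, which matches "nagios" anywhere on the line. A minimal sketch of a stricter variant follows, assuming the ipcs -q column layout shown earlier in this post (msqid in column 2, owner in column 3). The function names nagios_queue_ids and clear_nagios_queues are hypothetical helpers, not part of Nagios or NDOUtils.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -q` output on stdin and print the msqid
# of every queue whose owner column is exactly "nagios".
# Assumes the column order shown in this thread: key msqid owner perms ...
nagios_queue_ids() {
    awk '$3 == "nagios" { print $2 }'
}

# Hypothetical helper: remove every nagios-owned queue. Needs sufficient
# privileges, and discards any messages still waiting in those queues.
clear_nagios_queues() {
    ipcs -q | nagios_queue_ids | while read -r id; do
        ipcrm -q "$id"
    done
}
```

Calling clear_nagios_queues does the same job as the one-liner, and the stop/start of nagios and ndo2db described above still applies.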
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: ndo2DB errors and IPC

Post by cdienger »

Have any adjustments been made to the check_result_reaper_frequency or max_check_result_reaper_time options in nagios.cfg? You can check these options in the GUI under Configure > CCM Admin > Core Configs > General. The defaults are 10 and 30 respectively. Try setting them to 3 and 10 instead so that check results are processed more frequently.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
vazudevan
Posts: 36
Joined: Fri Oct 21, 2016 4:52 am

Re: ndo2DB errors and IPC

Post by vazudevan »

It was already set to the higher frequency. The problem occurs in spite of that setting.

Code: Select all

[root@phlprcnagnxi001 etc]# grep reaper nagios.cfg 
# check_result_reaper_frequency=10
check_result_reaper_frequency=3
# max_check_result_reaper_time=30
max_check_result_reaper_time=10
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: ndo2DB errors and IPC

Post by cdienger »

Does the queue fluctuate at all, or does it stay full nearly all the time once it fills up? Sometimes it is necessary to increase the queue size beyond even what the KB article recommends. This shouldn't be a problem as long as the system isn't showing other symptoms, such as check information not being updated. I would try doubling the current kernel.msgmnb: https://support.nagios.com/kb/article/n ... d-139.html
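
Both suggestions can be checked from the shell. The sketch below assumes the ipcs -q column layout posted earlier in the thread (owner in column 3, used-bytes in column 5). The function names queue_fill_percent and doubled are hypothetical helpers, not part of Nagios or NDOUtils.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -q` output on stdin and print
# "msqid percent-full" for each nagios-owned queue.
# $1 is the per-queue byte limit, e.g. the value of kernel.msgmnb.
queue_fill_percent() {
    awk -v limit="$1" '$3 == "nagios" && limit > 0 {
        printf "%s %d\n", $2, ($5 * 100) / limit
    }'
}

# Hypothetical helper: print twice the given value.
doubled() {
    echo $(( $1 * 2 ))
}

# Example usage (not run here):
#   Sample once a minute to see whether the queue fluctuates:
#     while sleep 60; do
#         ipcs -q | queue_fill_percent "$(sysctl -n kernel.msgmnb)"
#     done
#   Apply the doubled limit to the running kernel (as root):
#     sysctl -w kernel.msgmnb="$(doubled "$(sysctl -n kernel.msgmnb)")"
```

With the 262144000 posted above, doubled gives 524288000. Note that sysctl -w only changes the running kernel. To keep the value across reboots it also has to go into the sysctl configuration file. It is probably also necessary to restart ndo2db and nagios so the queue is recreated with the new limit, though that is an assumption and is not confirmed in this thread.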