Hi Team,
We have a strange issue related to the kernel message queue. We are using NDO2DB and have increased the message limits, but the issue persists. There are no other errors or service issues, but monitoring suddenly stops. Please suggest a fix. (NDO3 had the same issue, so we downgraded to NDO2DB 6 months ago.)
Error:
ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may need to be tuned. See README.
ndo2db: Warning: queue send error, retrying...
Present limits (we have also tested with other values):
kernel.msgmnb = 99662144000
kernel.msgmax = 99662144000
kernel.shmmax = 42949672950
kernel.shmall = 2684354560
kernel.msgmni = 91120000
------ Message Queues --------
key msqid owner perms used-bytes messages
0xb8000002 7897089 nagios 600 877895680 857320
Nagios user limit:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 160158
max locked memory (kbytes, -l) 128
max memory size (kbytes, -m) unlimited
open files (-n) 10000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 20480
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Kernel:
3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.7 (Maipo)
PHP 5.4.16 (cli) (built: Jun 19 2018 13:09:01)
IO:
avg-cpu: %user %nice %system %iowait %steal %idle
10.16 0.00 3.42 0.07 0.00 86.35
XI Details:
Version 5.8.5, hosts=899, services=9658, DB=offloaded, HW=no resource crunch, No other OS/DB log/error
Thanks,
Vaibhav
NDO2DB Kernel Queue issue
-
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NDO2DB Kernel Queue issue
Hi Vaibhav,
Thanks for contacting the support team at Nagios.
Normally, increasing those limits is very helpful. What is the total check load of this server, host + service checks?
Try performing a full restart and clearing the message queues using the commands below. If that does not resolve the issue, please send us a current system profile to review.
Code:
systemctl stop nagios
systemctl stop ndo2db
systemctl stop crond
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then systemctl stop postgresql; fi;
systemctl restart mariadb
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagios/var/ndo2db.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
rm -f /usr/local/nagiosxi/tmp/*.lock
for i in $(ipcs -q | awk '/nagios/ {print $2}'); do ipcrm -q "$i"; done
pkill python
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then systemctl start postgresql; fi;
systemctl restart httpd
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
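After the restart, it may be worth confirming that the old queue is gone and watching how quickly a new one fills. The following is only a minimal sketch, assuming the `ipcs -q` column layout shown earlier in this thread (key, msqid, owner, perms, used-bytes, messages); the function name `queue_usage` is made up for illustration and is not part of Nagios XI or NDOUtils.

```shell
# Summarise `ipcs -q` output read from stdin: for every queue owned by the
# given user, print its id, size, message count and average message size.
# Assumes the column order: key msqid owner perms used-bytes messages.
queue_usage() {
    user="$1"
    awk -v user="$user" '
        # Header and separator lines are skipped because their third field
        # is not the user name or their sixth field is not numeric.
        $3 == user && $6 ~ /^[0-9]+$/ {
            avg = ($6 > 0) ? $5 / $6 : 0
            printf "msqid=%s used_bytes=%s messages=%s avg_bytes=%d\n", $2, $5, $6, avg
        }'
}

# Usage on a live system:
#   ipcs -q | queue_usage nagios
```

Run against the queue listing from the first post, this reports an average of 1024 bytes per message. If the message count keeps climbing after a clean restart, that would suggest ndo2db is not draining the queue as fast as Nagios fills it.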
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Thanks,
Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: NDO2DB Kernel Queue issue
Hi,
We have already performed these steps and also rebooted the servers, but the issue remains. We cannot share the profile due to security restrictions.
Thanks,
Vaibhav
Re: NDO2DB Kernel Queue issue
Hi Vaibhav,
Due to security restrictions, let's open a ticket for this issue as it will make it easier to privately share log files.
To open a ticket, please visit:
https://support.nagios.com/tickets
Please provide the following information in the ticket:
Code:
ipcs -a
top -b -n 1
tail -n 500 /usr/local/nagios/var/nagios.log
# Database log (attach whichever of these files exists on your system)
/var/log/mariadb/mariadb.log
/var/log/mysqld.log
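The items above could be gathered into a single file for attaching to the ticket. This is just a sketch under a few assumptions: the helper names `collect` and `gather_diagnostics` and the output path are made up for illustration, and taking only the last 500 lines of the database log is my choice to mirror the nagios.log command, not something requested above.

```shell
# Print a titled section followed by the output of the given command.
# If the command fails, note that instead of aborting the whole run.
collect() {
    title="$1"
    shift
    printf '===== %s =====\n' "$title"
    "$@" 2>&1 || printf '(command failed: %s)\n' "$*"
    printf '\n'
}

# Gather the requested information. Defined only; nothing runs until called.
gather_diagnostics() {
    collect "ipcs -a" ipcs -a
    collect "top -b -n 1" top -b -n 1
    collect "nagios.log (last 500 lines)" tail -n 500 /usr/local/nagios/var/nagios.log
    # Usually only one of the database logs exists, depending on the install.
    for f in /var/log/mariadb/mariadb.log /var/log/mysqld.log; do
        if [ -r "$f" ]; then
            collect "$f (last 500 lines)" tail -n 500 "$f"
        fi
    done
}

# Usage (run as root so the logs are readable; the path is an example):
#   gather_diagnostics > /tmp/ndo2db-diagnostics.txt
```

Please review the resulting file for anything sensitive before attaching it.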
Benjamin
Re: NDO2DB Kernel Queue issue
A ticket has already been opened for this issue by Luca.
Re: NDO2DB Kernel Queue issue
Hi,
Okay, would that be ticket #382026?
Re: NDO2DB Kernel Queue issue
Hi,
Okay, we'll close out this forum post then, and we'll move this over to that ticket to avoid duplicates.
Thanks,
Benjamin