NDO2DB Stopped

Posted: Wed Jan 02, 2019 10:22 am
by Maxwellb99
Hi,

NDO2DB stopped on one of my Nagios instances. I'd like to do some root cause analysis. What are the first steps, and what's the likely cause here?

I grabbed this before restarting ndo2db
Capture.PNG
Thanks,
Maxwell Ramirez

Re: NDO2DB Stopped

Posted: Wed Jan 02, 2019 11:30 am
by lmiltchev
NDOUtils uses the operating system's kernel message queue. As the number of messages increases, the kernel settings need to be tuned to allow more messages to be queued and processed. For more information on the topic, read the KB article below:

https://support.nagios.com/kb/article.php?id=139
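
Before changing anything, it can help to see what the limits currently are. The following is a minimal sketch, assuming a Linux host where the System V message queue limits (the queues that ipcs -q lists) are exposed as kernel.msgmax, kernel.msgmnb and kernel.msgmni under /proc/sys/kernel. The function name show_msg_limits and its optional directory argument are hypothetical helpers for illustration; the values to set should come from the KB article, not from this sketch.

```shell
# Print the current System V message queue limits.
# Assumption: a Linux host exposing these limits under /proc/sys/kernel.
# The optional directory argument is a hypothetical convenience so the
# function can be pointed at a copy of those files.
show_msg_limits() {
    dir="${1:-/proc/sys/kernel}"
    for name in msgmax msgmnb msgmni; do
        if [ -r "$dir/$name" ]; then
            # msgmax: max bytes per message; msgmnb: max bytes per queue;
            # msgmni: max number of queues on the system
            printf 'kernel.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'kernel.%s = (not readable)\n' "$name"
        fi
    done
}
```

Running show_msg_limits with no argument reads the live values. Comparing them before and after applying the article's changes is a quick way to confirm the new settings took effect.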

Re: NDO2DB Stopped

Posted: Wed Jan 02, 2019 3:23 pm
by Maxwellb99
Alright, thanks. I'm going to do a little more research. I'm not a sysadmin by training, so I'm not 100% sure ... The man page says "POSIX message queues allow processes to exchange data in the form of messages." Is this error a function of the number of checks I have running?

Re: NDO2DB Stopped

Posted: Wed Jan 02, 2019 4:36 pm
by npolovenko
@Maxwellb99, please run the following command and show me the output:
ipcs -q
Sometimes NDO crashes, leaving multiple message queues open.
But sometimes just running a lot of service and host checks requires an increase in the message queue limits.
Have you increased the message queue limits as suggested in the article?
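
To make the "multiple queues" check quicker to repeat, here is a rough sketch that counts the queues in the ipcs -q output. It assumes the usual Linux (util-linux) layout, where each queue row starts with a hex key such as 0x... and the third column is the owner. The function name count_queues is hypothetical, and the owner name nagios in the usage comment is an assumption about which user runs NDO on your system.

```shell
# Count message queues listed on stdin in `ipcs -q` format.
# Assumption: util-linux layout, queue rows start with a hex key (0x...)
# and the owner is the third column.
# Optional argument: only count queues owned by that user.
count_queues() {
    awk -v owner="${1:-}" '
        /^0x/ && (owner == "" || $3 == owner) { n++ }
        END { print n + 0 }
    '
}

# Usage (the owner name is an assumption about your setup):
#   ipcs -q | count_queues
#   ipcs -q | count_queues nagios
```

A count above one for the NDO user suggests stale queues left behind by an earlier crash or restart, rather than a limit problem.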

Re: NDO2DB Stopped

Posted: Wed Jan 02, 2019 4:48 pm
by Maxwellb99
foo.PNG
Yeah, before I followed the KB article I had three queues. I cleared them and I'm down to one.
I hope the DB crash was operator error. Would the DB crash if I stopped ndo2db, restarted the nagios service, and then started ndo2db?

I did it in this order:
sudo systemctl stop ndo2db
sudo systemctl restart nagios
sudo systemctl start ndo2db

as opposed to this (from the doc):
systemctl stop nagios.service
systemctl restart ndo2db.service
systemctl start nagios.service

Thanks.

Re: NDO2DB Stopped

Posted: Wed Jan 02, 2019 5:17 pm
by npolovenko
@Maxwellb99, it's better to stop the Nagios service before you stop NDO2DB. Sometimes restarting the processes in the wrong order can leave multiple NDO queues open.
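
One way to avoid getting the order wrong is to wrap the documented sequence in a small function. This is only a sketch: it reuses the unit names quoted from the doc earlier in the thread (nagios.service and ndo2db.service), the function name restart_ndo_stack is hypothetical, and the optional first argument is an assumed convenience for previewing the commands without running them. Depending on your setup, the real run may need root or sudo.

```shell
# Restart Nagios and NDO2DB in the documented order, stopping at the first failure.
# Pass "echo" as the first argument to print the commands instead of running them.
restart_ndo_stack() {
    run="${1:-}"
    # Stop Nagios first so nothing writes to the queue while ndo2db restarts.
    $run systemctl stop nagios.service &&
        $run systemctl restart ndo2db.service &&
        $run systemctl start nagios.service
}

# Preview only:
#   restart_ndo_stack echo
```

Chaining with && means Nagios is not started again if the ndo2db restart fails, which makes the failure easier to spot.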