Nagios Portal Issue
Posted: Mon Feb 01, 2021 9:28 am
by mejokj
Hi Team,
The Nagios dashboard is showing blank details. A screenshot is attached for reference.
We have tried the steps below, but the issue persists.
============================================================
systemctl stop crond
systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
killall -9 ndo2db
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
===============================================
Note: If we refresh the page several times, some host/service summaries appear. While the issue is occurring, the details for specific hosts and host groups are also not shown.
Thanks,
Re: Nagios Portal Issue
Posted: Mon Feb 01, 2021 1:23 pm
by mejokj
Removed
Re: Nagios Portal Issue
Posted: Mon Feb 01, 2021 6:07 pm
by dchurch
If you recently added a bunch of servers being monitored, ndo2db might be choking on the amount of traffic in the form of perf data it's sending to the database.
This was in your system profile (output from "ipcs") - the numbers are a little high:
Code:
------ Message Queues --------
key msqid owner perms used-bytes messages
0x94000080 294957 nagios 600 12527616 12234
0xa8000080 294966 nagios 600 13968384 13641
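To keep an eye on these numbers over time, a minimal sketch like the following could total the queue usage for one owner. It assumes the column layout shown in the listing above (key, msqid, owner, perms, used-bytes, messages); `summarize_queues` is a hypothetical helper name, not part of Nagios XI.

```shell
#!/usr/bin/env bash
# Summarise `ipcs -q` output for one owner: queue count, total bytes, total messages.
# Assumes the columns are: key msqid owner perms used-bytes messages.
summarize_queues() {
    local owner="${1:-nagios}"
    awk -v owner="$owner" '
        $3 == owner { queues++; bytes += $5; msgs += $6 }
        END { printf "queues=%d bytes=%d messages=%d\n", queues, bytes, msgs }
    '
}

# Usage on a live system:
#   ipcs -q | summarize_queues nagios
```

Fed the two queues listed above, it would print `queues=2 bytes=26496000 messages=25875`.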
If you upgrade Nagios XI to at least version 5.7.0, this problem may go away.
ndo2db (sometimes called the Database Backend) is no longer needed in Nagios XI.
ndo2db is our older technology that listens on a UNIX socket for database inserts, then handles the actual insertion into the database. Its limitation is that it runs into problems when it tries to insert more than the database can handle. In newer versions (Nagios XI 5.7.0 and later), it was replaced by writing directly to the database from the Nagios worker threads. In addition to handling more database inserts, this resulted in an overall performance boost.
If you absolutely don't want to upgrade, or can't, another solution would be increasing the message queue size via a sysctl.conf entry.
Re: Nagios Portal Issue
Posted: Tue Feb 02, 2021 10:45 am
by mejokj
Thanks for the reply.
We will plan for an upgrade.
We could see that the kernel parameters kernel.msgmnb and kernel.msgmax are already set to the recommended value of 262144000.
Should we increase the values further as a temporary fix, and if so, what is the next recommended value?
Thanks
Re: Nagios Portal Issue
Posted: Wed Feb 03, 2021 12:23 pm
by benjaminsmith
Hi,
"We could see that the kernel parameters kernel.msgmnb and kernel.msgmax are already set to the recommended value of 262144000."
I would try doubling those values, for example (CentOS 7 commands):
Code:
sed -i 's/^kernel\.msgmnb.*/kernel.msgmnb = 524288000/' /etc/sysctl.conf
sed -i 's/^kernel\.msgmax.*/kernel.msgmax = 524288000/' /etc/sysctl.conf
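Note that the sed commands only rewrite a line that already exists, and editing /etc/sysctl.conf does not change the running kernel until the file is reloaded. A minimal sketch that covers both cases, assuming GNU sed and a writable file, might look like this; `set_sysctl_key` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Set "key = value" in a sysctl-style file: rewrite the line if the key
# already exists, append it otherwise. Hypothetical helper, assumes GNU sed.
set_sysctl_key() {
    local file="$1" key="$2" value="$3"
    # Escape dots so the key is matched literally in the regular expression.
    local pattern="^[[:space:]]*${key//./\\.}[[:space:]]*="
    if grep -Eq "$pattern" "$file"; then
        sed -i -E "s|${pattern}.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   set_sysctl_key /etc/sysctl.conf kernel.msgmnb 524288000
#   set_sysctl_key /etc/sysctl.conf kernel.msgmax 524288000
#   sysctl -p    # reload /etc/sysctl.conf into the running kernel
```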
One of the best things you can do, however, is to increase the check intervals so the Nagios process has to schedule fewer checks every minute. Increasing the interval from 5 minutes to 10 minutes will really help.
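As a rough illustration of why the interval matters, the sketch below estimates how many checks have to start per minute, assuming checks are spread evenly across the interval. The figure of 6000 service checks is purely hypothetical, and `checks_per_minute` is an illustrative helper name.

```shell
#!/usr/bin/env bash
# Average number of active checks that must start each minute,
# assuming checks are spread evenly across the check interval.
checks_per_minute() {
    local checks="$1" interval_minutes="$2"
    echo $(( checks / interval_minutes ))
}

# Hypothetical installation with 6000 service checks:
#   checks_per_minute 6000 5     # prints 1200
#   checks_per_minute 6000 10    # prints 600
```

Doubling the interval halves the number of checks per minute, and with it the volume of results written to the database.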
--Benjamin
Re: Nagios Portal Issue
Posted: Mon Feb 15, 2021 8:33 am
by mejokj
Hi benjaminsmith,
We have updated kernel parameters kernel.msgmnb and kernel.msgmax to 524288000.
Nagios looks better now. However, Apply Configuration is taking a very long time and does not show any error. A screenshot is attached for your reference.
We did not see any configuration errors.
Is this because of ndo2db?
Thanks
Re: Nagios Portal Issue
Posted: Mon Feb 15, 2021 4:12 pm
by benjaminsmith
Hi,
Go ahead and run the database repair script, and then try to Apply Configuration again.
Code:
/usr/local/nagiosxi/scripts/repair_databases.sh
If it's still taking a long time, please run the following tail command, run Apply Configuration again, and post the full output to the thread.
Code:
tail -f /usr/local/nagiosxi/var/cmdsubsys.log
Thanks,
Benjamin
Re: Nagios Portal Issue
Posted: Wed Mar 03, 2021 3:10 am
by mejokj
Hi Benjaminsmith,
We have increased the message queue size via a sysctl.conf entry to the higher value of 524288000, but the issue persists (blank details on the Nagios dashboard). No errors were found during the database repair.
So we have decided to upgrade Nagios as suggested. We have a few questions about the upgrade.
1) Do we need to upgrade to the latest version or exactly 5.7.0? The existing version is Nagios XI 5.6.11.
2) Please provide the steps to downgrade in case any issues occur during the upgrade.
3) Are there any common issues and fixes for upgrades from 5.6 to 5.7?
Thanks
Re: Nagios Portal Issue
Posted: Thu Mar 04, 2021 11:05 am
by benjaminsmith
Hi,
"1) Do we need to upgrade to the latest version or exactly 5.7.0? The existing version is Nagios XI 5.6.11."
I would recommend upgrading to the latest version, 5.8.2, as we continue to make improvements to ndo3, and that version will have the latest updates.
Upgrading Nagios XI
"2) Please provide the steps to downgrade in case any issues occur during the upgrade."
Here are the steps for a standard downgrade (i.e. local database host) of NDO. If the database is offloaded, let me know and I will provide the additional steps for that setup.
Code:
systemctl stop nagios
cd /tmp
rm -rf /tmp/nagiosxi
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.6.14.tar.gz
tar zxf xi-5.6.14.tar.gz
cd /tmp/nagiosxi/subcomponents/ndoutils
./install
systemctl enable ndo2db
Then edit your /usr/local/nagios/etc/nagios.cfg and make sure this line is uncommented:
Code:
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
Make sure this line is commented:
Code:
#broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
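If you would rather script the two nagios.cfg edits than make them by hand, a minimal sketch could look like the following. It assumes GNU sed and that both broker_module lines already exist in the file (commented or not); `use_ndomod_broker` is a hypothetical helper name, and the original file is kept as a `.bak` copy.

```shell
#!/usr/bin/env bash
# Switch nagios.cfg back to the ndomod.o broker module:
#   - uncomment any broker_module line that loads ndomod.o
#   - comment out any active broker_module line that loads ndo.so
# The original file is preserved as <cfg>.bak.
use_ndomod_broker() {
    local cfg="$1"
    sed -i.bak -E \
        -e 's|^[[:space:]]*#+[[:space:]]*(broker_module=.*/ndomod\.o.*)$|\1|' \
        -e 's|^[[:space:]]*(broker_module=.*/ndo\.so.*)$|#\1|' \
        "$cfg"
}

# Usage (as root, with nagios stopped):
#   use_ndomod_broker /usr/local/nagios/etc/nagios.cfg
#   grep broker_module /usr/local/nagios/etc/nagios.cfg    # review the result
```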
Then start the nagios service with systemctl start nagios.
"3) Are there any common issues and fixes for upgrades from 5.6 to 5.7?"
If you're running CentOS 6, then upgrading is not going to be possible. You'll need to migrate to a newer, supported distribution.
See:
https://support.nagios.com/kb/article/m ... r-892.html
Be sure to take a full backup, or upgrade your test server first, before proceeding with the upgrade on the production server.
Backing Up And Restoring Your Nagios XI System
Re: Nagios Portal Issue
Posted: Fri Mar 12, 2021 1:39 pm
by mejokj
Hi Benjamin,
We have upgraded Nagios XI to the latest version, 5.8.2, but the issue persists.
We have sent you the system profile by PM.
[root@nagiosxi store]# cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
kernel.msgmnb = 524288000
kernel.msgmax = 524288000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
[root@nagiosxi store]#
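The file only shows what is configured, not what the running kernel is using. A minimal sketch for comparing the two, assuming a standard `key = value` layout, is below; `conf_value` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the value configured for a key in a sysctl-style file.
# Comment lines are skipped; if the key appears more than once, the last one wins.
conf_value() {
    local file="$1" key="$2"
    awk -F= -v key="$key" '
        /^[ \t]*#/ { next }
        {
            k = $1; v = $2
            gsub(/[ \t]/, "", k)            # strip whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim the value
            if (k == key) found = v
        }
        END { if (found != "") print found }
    ' "$file"
}

# Usage: compare the configured and running values.
#   for key in kernel.msgmnb kernel.msgmax; do
#       echo "$key file=$(conf_value /etc/sysctl.conf "$key") running=$(sysctl -n "$key")"
#   done
```

If the two values differ, the file has probably not been reloaded with `sysctl -p` since it was edited.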
Please help us with this, as the issue has been pending for a long time.
Thanks