Nagios XI component status

DFaught · Post by **DFaught** » Tue May 14, 2019 10:46 am

Hi,
On Nagios XI 5.5.5, I am using the RESTful API to get the System Status Detail, which I assume returns data similar to
what feeds the Core Component Status/XI System Component Status dashlet. Can you tell me what criteria cause
a component status to turn red, or where I might find this logic in the PHP files?
Thanks for any help you can provide.

npolovenko · Post by **npolovenko** » Tue May 14, 2019 3:11 pm

Hello, @DFaught. The Monitoring Engine is the nagios process itself.

Monitoring Engine -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status nagios

The Performance Grapher is the npcd daemon.

Performance Grapher -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status npcd

The Database Backend is NDO2DB, which moves data from Nagios into the database for processing.

Database Backend -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
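
For anyone who wants to check all three mappings above in one pass on a single XI server, here is a minimal sketch. It is not an official tool: `check_components` and the `STATUS_CMD` override are hypothetical names, and it assumes the status command exits non-zero when a service is down, which should be verified on your XI version.

```shell
# Sketch: report the three daemon-backed components named in this thread.
# STATUS_CMD is a hypothetical override so the loop can be tried off-box.
STATUS_CMD="${STATUS_CMD:-sudo /usr/local/nagiosxi/scripts/manage_services.sh status}"

check_components() {
    local failed=0 entry label service
    for entry in "Monitoring Engine:nagios" "Performance Grapher:npcd" "Database Backend:ndo2db"; do
        label="${entry%%:*}"      # dashlet name, text before the colon
        service="${entry##*:}"    # service name, text after the colon
        # Assumption: the status command exits non-zero when the service is down.
        if $STATUS_CMD "$service" >/dev/null 2>&1; then
            echo "OK   $label ($service)"
        else
            echo "DOWN $label ($service)"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_components || echo "at least one component is down"` prints one line per component and returns non-zero if any of the three checks failed.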

The other components correspond to various cron jobs used by Nagios XI. They are listed here:

https://support.nagios.com/kb/article.php?id=60

Which components are turning red in your dashboard?

DFaught · Post by **DFaught** » Wed May 15, 2019 10:13 am

We currently have 18 Nagios XI servers split across 2 Fusion servers. Lately we have had a rash of incidents where the Database Backend goes red.
It is a pain to log into each XI server to see whether this condition is occurring. Ideally there would be a Fusion dashlet that shows,
for all of the "fused" servers, something like the single System Status icon at the top right of the XI screens next to my login. I have suggested this in the Fusion forum. The Fused Server Status dashlet that is available does NOT do this, although its name suggests that it does.

My next best option is external scripting that steps through all of the "fused" servers, probably using the RESTful API.
I already have this script, but I don't know how to interpret the query results to decide whether or not there is a problem.
For this option, I would much prefer the RESTful API over command-line or other internal checks on each server, which
are more difficult to script.
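
For readers following along, the kind of loop described here might look like the sketch below. It only gathers the raw JSON from each server; which fields mark a red component is exactly the open question in this post, so no interpretation is attempted. The inventory file `xi_servers.txt`, the function names, and the endpoint path with its `apikey` and `pretty` parameters are assumptions to check against the API help pages on your own XI server.

```shell
# Hypothetical inventory file: one "hostname apikey" pair per line.
SERVERS_FILE="${SERVERS_FILE:-xi_servers.txt}"

# Build the System Status Detail URL for one server.
# Assumption: endpoint path api/v1/system/statusdetail with apikey/pretty parameters.
status_url() {
    printf 'https://%s/nagiosxi/api/v1/system/statusdetail?apikey=%s&pretty=1\n' "$1" "$2"
}

# Step through every server in the inventory and print its raw response.
poll_servers() {
    local host key
    while read -r host key; do
        [ -z "$host" ] && continue    # skip blank lines
        echo "== $host =="
        curl -sS --max-time 10 "$(status_url "$host" "$key")" || echo "unreachable: $host"
    done < "$SERVERS_FILE"
}
```

Running `poll_servers` prints a header line per server followed by whatever that server returned, or an `unreachable` line if the request failed.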

Thank you for any help or direction you can provide.

npolovenko · Post by **npolovenko** » Wed May 15, 2019 3:33 pm

@DFaught, the Fused Server Status dashlet only shows whether Fusion is able to connect to the XI server. I can see how a more detailed dashlet with the major XI component statuses could be useful, and I will submit this as a feature request for our dev team to consider.
In the meantime, you could add a few important localhost services from the XI servers to a custom dashboard in Fusion using the "Service Status" dashlet. For example, the "Service Status - ndo2db" localhost service on an XI server represents the "Database Backend" entry in the XI dashlet. "Service Status - crond" would represent the "Database Maintenance, Command Subsystem, Event Manager, Feed Processor, Report Engine, Cleaner, Nonstop Operations Manager and System Statistics" crons. Following the same logic, you could create other localhost services on the XI servers that are equivalent to the component statuses on the XI status dashlet, and build a centralized dashboard in Fusion that displays these services.
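
If you would rather read those same localhost services from a script, a rough sketch follows. The state codes are the standard Nagios service states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Everything else is an assumption to verify on your version: the `objects/servicestatus` endpoint, its `host_name` and `name` filters, and the function names, which are hypothetical.

```shell
# Map a Nagios service state code to its conventional label.
state_label() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNRECOGNIZED($1)" ;;
    esac
}

# Fetch the raw status record for one localhost service on one XI server.
# $1 = XI hostname, $2 = API key, $3 = service description
# Assumption: objects/servicestatus accepts host_name and name filters.
fetch_service_status() {
    curl -sS -G --max-time 10 \
        --data-urlencode "apikey=$2" \
        --data-urlencode "host_name=localhost" \
        --data-urlencode "name=$3" \
        "https://$1/nagiosxi/api/v1/objects/servicestatus"
}
```

For example, `fetch_service_status xi01.example.com "$KEY" "Service Status - ndo2db"` would return the record for the Database Backend equivalent; the state value in that record can then be passed to `state_label`. The `-G --data-urlencode` form is used because the service names contain spaces.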
Next, you could create a shell script that SSHes into the Nagios servers and restarts the major services. The following is a good list of commands to run on an XI server when the monitoring engine is down:

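# Stop cron and the Nagios daemons
service crond stop
service npcd stop
service nagios stop
service ndo2db stop
# Kill any leftover processes owned by the nagios user
pkill -9 -u nagios
# Remove stale message queues that belong to nagios
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
# Clear leftover lock files
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
# Restart the database, then bring the daemons back in reverse order
service mysqld restart
service ndo2db start
service nagios start
service npcd start
service crond start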

However, running these commands doesn't always resolve the issue. A red Database Backend can indicate a wide variety of problems with many possible solutions.
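
With that caveat in mind, the SSH wrapper mentioned above could be sketched roughly as follows. The names here are hypothetical: `restart_xi.sh` is assumed to be a local file holding the command list, `recover_host` is an illustrative function, and key-based SSH access as root is an assumption about your environment. It defaults to a dry run so that nothing is restarted by accident.

```shell
# Hypothetical local file holding the restart command list.
RECOVERY_SCRIPT="${RECOVERY_SCRIPT:-restart_xi.sh}"
# Dry run by default; set DRY_RUN=0 to actually execute on the remote host.
DRY_RUN="${DRY_RUN:-1}"

# Run the recovery script on one XI server.
# $1 = XI hostname. Assumes key-based SSH as root.
recover_host() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run $RECOVERY_SCRIPT on $1"
    else
        # Feed the local script to a remote bash over SSH.
        ssh "root@$1" 'bash -s' < "$RECOVERY_SCRIPT"
    fi
}
```

A cautious way to use it is to call `recover_host` only for a server that a status check has already flagged, rather than looping over every server unconditionally.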
