Hi,
On Nagios XI 5.5.5, I am using the RESTful API to get the System Status Detail, which provides data which I am assuming is similar to
what goes into the Core Component Status/XI System Component Status dashlet. Any chance you can tell me what the criteria are for
the component statuses turning red? Or where I might find this in the PHP files?
Thanks for any help you can provide.
Nagios XI component status
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Nagios XI component status
Hello, @DFaught. Monitoring engine is the nagios process itself.
Monitoring Engine -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status nagios
Performance grapher is the npcd daemon.
Performance Grapher -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status npcd
Database backend is NDO2DB. It moves data from Nagios to the database for processing.
Database Backend -> sudo /usr/local/nagiosxi/scripts/manage_services.sh status ndo2db
Other components are related to various crons used by Nagios. They are listed here:
https://support.nagios.com/kb/article.php?id=60
Which components are turning red in your dashboard?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
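The three status commands above could be rolled into one small check, run locally on an XI server. This is only a sketch, not something shipped with Nagios XI: the MANAGE and SUDO variables and the check_components function are made-up names, and it assumes that "manage_services.sh status" exits with 0 when the service is running and non-zero otherwise. That is the usual convention for status commands, but it is worth confirming on your XI version.

```shell
#!/usr/bin/env bash
# Script path taken from the commands above; both variables are
# hypothetical hooks so the path and sudo can be overridden.
MANAGE="${MANAGE:-/usr/local/nagiosxi/scripts/manage_services.sh}"
SUDO="${SUDO-sudo}"

# Checks the three components named in the post and prints one line each.
# Returns 1 if any of them does not report as running.
check_components() {
    local failed=0 label service
    while IFS='|' read -r label service; do
        # Assumption: exit status 0 means "running".
        # stdin is redirected so the command cannot eat the list below.
        if $SUDO "$MANAGE" status "$service" </dev/null >/dev/null 2>&1; then
            echo "OK       $label ($service)"
        else
            echo "PROBLEM  $label ($service)"
            failed=1
        fi
    done <<'EOF'
Monitoring Engine|nagios
Performance Grapher|npcd
Database Backend|ndo2db
EOF
    return "$failed"
}

# Example call:
#   check_components || echo "at least one component is not running"
```

The label-to-service mapping is exactly the one given in the post; the cron-based components are not covered here because the post only points to the KB article for those.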
Re: Nagios XI component status
We currently have 18 Nagios XI servers split across 2 Fusion servers. Lately we have had a rash of Database Backend going Red issues.
It is a pain to go log into each XI server to see if this condition is occurring. Ideally there would be a Fusion dashlet that would show
for all of the "fused" servers something like the nice single little System Status icon that is at the top right of the XI screens next to my login. I have suggested this in the Fusion forum. The Fused Server Status dashlet that is available does NOT do this, although the name suggests that it does.
My next best option is to do some external scripting that will step through all of the "fused" servers, probably using the RESTful API.
I already have this script, except that I don't know how to interpret the query results to say yes there is a problem or no there is not.
To pursue this option, I really prefer using the RESTful API as opposed to doing command line or other internal things on each server that
are more difficult to script.
Thank you for any help or direction you can provide.
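For anyone following along, the external scripting described here might look roughly like the sketch below. It only steps through the servers and fetches the raw System Status Detail; it does not decide whether a component is "red", since the criteria for that are the open question in this thread. Several things are assumptions: the API path /nagiosxi/api/v1/system/statusdetail, the file name xi_servers.txt (one "host apikey" pair per line), and the function names status_url and poll_servers.

```shell
#!/usr/bin/env bash
# Hypothetical input file: one "host apikey" pair per line.
SERVERS_FILE="${SERVERS_FILE:-xi_servers.txt}"

# Assumption: the System Status Detail is served at this path.
# Swap in whatever URL your existing query already uses.
status_url() {
    printf 'https://%s/nagiosxi/api/v1/system/statusdetail?apikey=%s' "$1" "$2"
}

# Fetches the status detail from every server in SERVERS_FILE and
# reports only whether the request itself succeeded.
poll_servers() {
    local host key body
    while read -r host key; do
        [ -z "$host" ] && continue
        # -f: treat HTTP errors as failure, -sS: quiet but show errors.
        if body=$(curl -fsS --max-time 10 "$(status_url "$host" "$key")" </dev/null); then
            echo "$host: fetched $(printf '%s' "$body" | wc -c) bytes of status detail"
        else
            echo "$host: API request failed"
        fi
    done < "$SERVERS_FILE"
}
```

Saving the raw response from a healthy server and from one showing a red Database Backend, then comparing the two, would be one way to find out which fields actually change.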
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Nagios XI component status
@DFaught, the Fused Server Status dashlet only shows whether Fusion is able to connect to the XI server. I can see how a more detailed dashlet with the major XI component statuses would be useful. I will submit this Feature Request for consideration by our dev team.
In the meantime, you could add a few important localhost services from XI servers to a custom dashboard in Fusion using the "Service Status" dashlet. For example, "Service Status - ndo2db" localhost service on XI servers represents the "Database backend" on the XI dashlet. "Service Status - crond" would represent "Database Maintenance, Command Subsystem, Event Manager, Feed Processor, Report Engine, Cleaner, Nonstop Operations Manager and System Statistics" crons. Following the same logic, you could create other localhost services on XI servers that would be equivalent to component statuses on the XI status dashlet, and have a centralized dashboard in Fusion that would display these services.
Next, you could create a shell script that would ssh into the Nagios servers and restart the major services. This would be a good list of commands to run on an XI server when the monitoring engine is down:
service crond stop
service npcd stop
service nagios stop
service ndo2db stop
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
service mysqld restart
service ndo2db start
service nagios start
service npcd start
service crond start
However, running these commands doesn't always resolve the issue. In many cases, the Red Database Backend could indicate a large variety of problems with many possible solutions.
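The ssh wrapper mentioned above could be sketched like this. It is a rough outline under stated assumptions, not a tested procedure: the host names, the remote path /root/restart_xi_services.sh (where the command list above would have been saved as a script on each XI server), root ssh access, and the names restart_host and restart_all are all hypothetical. It defaults to a dry run that only prints what it would do, since the restart sequence includes a kill -9.

```shell
#!/usr/bin/env bash
# Hypothetical host list; replace with your XI servers.
XI_HOSTS="${XI_HOSTS:-xi01.example.com xi02.example.com}"
# Hypothetical location of the restart commands saved as a script.
REMOTE_SCRIPT="${REMOTE_SCRIPT:-/root/restart_xi_services.sh}"
# 1 = only print the ssh command, anything else = really run it.
DRY_RUN="${DRY_RUN:-1}"

# Runs (or, in dry-run mode, prints) the restart script on one host.
restart_host() {
    local host=$1
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: ssh root@$host bash $REMOTE_SCRIPT"
    else
        # BatchMode avoids hanging on a password prompt.
        ssh -o BatchMode=yes -o ConnectTimeout=10 "root@$host" \
            "bash $REMOTE_SCRIPT" </dev/null
    fi
}

# Walks the host list and reports any host where the restart failed.
restart_all() {
    local host
    for host in $XI_HOSTS; do
        restart_host "$host" || echo "restart failed on $host" >&2
    done
}

# Example call:
#   DRY_RUN=0 restart_all
```

Keeping the command list in one script on each server, rather than inlining it in the ssh call, avoids quoting problems with the ipcs/awk line.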