Hello ssax,
I did find those endpoints when reading the API documentation earlier. Maybe checking whether 'is_currently_running' from system/status is anything other than 1 is enough. I was hoping for an endpoint that would give me a JSON response with the information you normally get from the system component status table page (https://nagios.example/nagiosxi/admin/sysstat.php), with something like 'status: healthy/unhealthy' for each component.
System Component Status:
- Monitoring Engine
- Performance Grapher
- Database Maintenance
- Command Subsystem
- Event Manager
- Feed Processor
- Report Engine
- Cleaner
- Nonstop Operations Manager
- System Statistics
GET system/status returns an object that looks like this:
{
"instance_id": "1",
"instance_name": "localhost",
"status_update_time": "2015-09-21 01:48:14",
"program_start_time": "2015-09-20 12:21:20",
"program_run_time": "48419",
"program_end_time": "0000-00-00 00:00:00",
"is_currently_running": "1",
"process_id": "105075",
"daemon_mode": "1",
"last_command_check": "1969-12-31 18:00:00",
"last_log_rotation": "2015-09-21 00:00:00",
"notifications_enabled": "1",
"active_service_checks_enabled": "1",
"passive_service_checks_enabled": "1",
"active_host_checks_enabled": "1",
"passive_host_checks_enabled": "1",
"event_handlers_enabled": "1",
"flap_detection_enabled": "1",
"process_performance_data": "1",
"obsess_over_hosts": "0",
"obsess_over_services": "0",
"modified_host_attributes": "0",
"modified_service_attributes": "0",
"global_host_event_handler": "xi_host_event_handler",
"global_service_event_handler": "xi_service_event_handler"
}
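Here is a rough sketch of the 'is_currently_running' check I have in mind. It is untested against a real server: the URL layout and the apikey query parameter in fetch_status are my assumptions, and both function names are just placeholders I made up.

```python
import json
import urllib.parse
import urllib.request


def engine_is_running(status: dict) -> bool:
    """Return True only if the payload reports is_currently_running as 1."""
    # The sample response encodes the flag as the string "1", so compare as
    # text. A missing key is treated as "not running".
    return str(status.get("is_currently_running", "")).strip() == "1"


def fetch_status(base_url: str, api_key: str, timeout: float = 10.0) -> dict:
    """Fetch system/status as a dict.

    Assumption: the endpoint lives at <base_url>/system/status and takes the
    key as an 'apikey' query parameter. Adjust to match your install.
    """
    query = urllib.parse.urlencode({"apikey": api_key})
    url = f"{base_url.rstrip('/')}/system/status?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The idea would be to call engine_is_running(fetch_status(...)) and alert whenever it returns False or the request itself fails.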
GET system/statusdetail might be able to provide some of the information I mentioned above, but most of the objects just return "last_check": "<time stamp>". Do you know what these objects return if a component fails to check in, or what time format they use? I could potentially check whether the last check-in was more than some amount of time ago.
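If the timestamps turn out to be usable, the staleness check could look something like the sketch below. I am assuming the "last_check" values use the same "YYYY-MM-DD HH:MM:SS" format as the system/status fields, which I have not confirmed, and that they are in the server's local time. The is_stale name and the threshold are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: same format as status_update_time in the system/status output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def is_stale(last_check: str, now: datetime, max_age: timedelta) -> bool:
    """Return True if last_check is older than max_age or cannot be parsed."""
    try:
        checked_at = datetime.strptime(last_check, TIME_FORMAT)
    except (TypeError, ValueError):
        # Missing or unparseable values (for example "0000-00-00 00:00:00")
        # are treated as stale so that they raise an alert.
        return True
    # Naive datetimes: 'now' must be in the same timezone as the server.
    return now - checked_at > max_age
```

For example, is_stale(component["last_check"], datetime.now(), timedelta(minutes=5)) would flag any component that has not checked in for five minutes, where component is one object from the statusdetail response.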
Also, as some background: I was looking at doing this by querying the API over HTTP because we have had cases in the past where Apache was up and the host was up, but the web page/app wasn't being served properly. We have also had cases where an application was up but specific components of it were broken. If I have to, I will probably just set up a basic HTTP check that requests the page and verifies that some expected content is there. We already use Nagios to self-monitor things like system metrics for the Nagios host, so the only thing I really need to do is confirm that it is available. As long as it is available, Nagios itself will send us alerts about other issues like high load.
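The fallback content check would be roughly the following. This is only a sketch: the evaluate and check_page names are made up, the expected text is whatever string you know appears on a healthy page, and it does not handle the login that the admin pages require. The return values follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL.

```python
import urllib.error
import urllib.request
from typing import Tuple

OK, CRITICAL = 0, 2  # Nagios plugin exit codes


def evaluate(status_code: int, body: str, expected: str) -> Tuple[int, str]:
    """Decide OK/CRITICAL from an HTTP status code and the page body."""
    if status_code != 200:
        return CRITICAL, f"CRITICAL - HTTP {status_code}"
    if expected not in body:
        return CRITICAL, f"CRITICAL - expected text {expected!r} not found"
    return OK, "OK - page served with expected content"


def check_page(url: str, expected: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch url and check that the expected text is present in the body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate(resp.status, body, expected)
    except urllib.error.HTTPError as exc:
        # Raised for 4xx/5xx responses; must be caught before URLError.
        return CRITICAL, f"CRITICAL - HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, and similar.
        return CRITICAL, f"CRITICAL - request failed: {exc}"
```

A wrapper script would print the message and exit with the returned code. The stock check_http plugin with its -s/--string option may well cover the same ground without any custom code.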
Thanks,
Mark Jackson