We are looking at implementing Nagios XI across all our data centers, but are having some issues with the design. I know there are already several posts about distributed installations, but as always there are a few differences in the way we want to do it, so I wanted to make a new post. This is the setup we have in mind:
- Central Nagios XI (Active / Passive Cluster)
- Nagios XI cluster (Active / Passive) in each datacenter.
- All configuration would be done in the central system and then pushed out to the corresponding datacenter installations.
- Status information should also be synchronized upwards to the central system.
I have been looking, but cannot find anything that is a 100% fit for the type of install we want. I have looked at options such as Gearman, but we would like each datacenter to keep monitoring on its own if the network link between it and the central system goes down, and then have the central system receive the cached results once the link is working again.
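To make the "cached results" requirement concrete, here is a minimal store-and-forward sketch in Python. It is not tied to any Nagios XI or Gearman API: the `ResultSpool` class and the `send` callback are hypothetical names, and `send` stands in for whatever transport would actually carry results to the central system.

```python
from collections import deque
from typing import Callable, Deque, Dict

# A check result is modelled as a plain dict here (assumption for illustration).
CheckResult = Dict[str, str]


class ResultSpool:
    """Hold check results locally and forward them, in order, when the link is up."""

    def __init__(self, send: Callable[[CheckResult], bool], max_size: int = 10000) -> None:
        # `send` must return True on successful delivery and False otherwise.
        self._send = send
        # A bounded deque silently drops the OLDEST result once max_size is reached.
        self._pending: Deque[CheckResult] = deque(maxlen=max_size)

    def submit(self, result: CheckResult) -> int:
        """Queue a new result, then try to deliver everything that is pending."""
        self._pending.append(result)
        return self.flush()

    def flush(self) -> int:
        """Send pending results oldest first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # link is still down, keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

The datacenter keeps running its checks regardless of the link state; only delivery to the central system is deferred. A real setup would probably want the spool on disk rather than in memory so it survives a restart.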
One way I was thinking of doing this would be to pull the hosts directly from the central XI's MySQL database by querying for hostgroup: datacenter_%dc_name%. Then we would format that information, send it to each datacenter's Nagios XI, and either drop it into the import folder or try to add it to the DB directly. However, we would prefer to use existing tools for this.
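A rough sketch of the formatting step in that idea, in Python. It assumes the rows have already been fetched from the central database as (hostgroup, host name, address) tuples; the query itself is left out because it depends on the schema. The function names and the `generic-host` template are placeholders for illustration, not something Nagios XI provides for this purpose.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hostgroup naming convention from the post: datacenter_<dc_name>
PREFIX = "datacenter_"

Host = Tuple[str, str]  # (host_name, address)


def group_by_datacenter(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, List[Host]]:
    """Group (hostgroup, host_name, address) rows by datacenter name.

    Rows whose hostgroup does not start with the prefix are ignored.
    """
    grouped: Dict[str, List[Host]] = defaultdict(list)
    for hostgroup, host_name, address in rows:
        if hostgroup.startswith(PREFIX):
            grouped[hostgroup[len(PREFIX):]].append((host_name, address))
    return dict(grouped)


def render_hosts(hosts: Iterable[Host], template: str = "generic-host") -> str:
    """Render hosts as Nagios object definitions, one `define host` block each."""
    blocks = []
    for host_name, address in sorted(hosts):
        blocks.append(
            "define host {\n"
            f"    use        {template}\n"
            f"    host_name  {host_name}\n"
            f"    address    {address}\n"
            "}\n"
        )
    return "\n".join(blocks)
```

The string returned by `render_hosts` for one datacenter could then be written to a .cfg file and copied to that datacenter's import folder. Services, contacts and templates would need the same treatment, which is a big part of why an existing tool would be preferable.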
In the future, we might also want the system scan functions (auto discovery and service detection) to be able to run on the DC Nagios XI systems and then update the central system with what they find. That is something to implement further down the road, but I believe it's important to keep in mind when designing the system.
We might add a Gearman installation in each DC just to help with the load, and to make it possible to add workers in the future if need be. I have attached an image of the design (just ignore the ServiceNow part, as it's the incident manager). I know it's a bit messy, but it helps a little in understanding the data flow.

Link to image:
https://www.nilsson.so/nagios_design_v1.png
Thanks!!
Jonas