We need a recommended design/solution architecture for the environment below:
i) We need to set up Nagios XI monitoring for a cloud environment that currently has around 80 nodes.
ii) We also need to set up Nagios XI monitoring for an on-premise environment that has 1200+ nodes.
iii) The number of cloud nodes can grow to 700 over the next 2 years.
iv) Each node in the on-premise environment has 40 to 50 services on average (at least 40 * 1200 = 48,000 services) that need to be monitored.
v) Around 150+ Unix nodes route all logs to a centralized syslog server, from which log monitoring should happen via Nagios.
Our initial design consists of 2 Nagios XI instances (1 for on-premise + 1 for cloud) and Fusion for a tactical view.
However, considering the volume of log scraping required for log monitoring from a centralized syslog server and the number of services to monitor (40 services * 1300 nodes), we are unsure whether the proposed design (2 Nagios XI instances plus Fusion) would be sufficient.
Is there a better design/solution architecture that we can propose for the above environment, in terms of:
a) Number of Nagios XI instances required
b) Setting up a distributed environment
c) Offloading the DB to a common server
d) Performance improvements such as using RAM disks
We also need a ticketing solution implemented using ServiceNow as part of the monitoring solution.
Please advise.
Also, the proposed hardware for each instance is 4 cores, 16 GB memory, and a 300 GB HDD. Would this be sufficient?
Your swift response is much appreciated!
Regards,
Nagios XI - Recommended Design/Solution Architecture
Re: Nagios XI - Recommended Design/Solution Architecture
I can provide some general advice, but this level of detail is something that would fall under consulting. If you would like, you can email [email protected] for details.
Generally speaking, I will say this:
- A single XI server should handle 20,000 checks at most before needing to split the load onto another server
- There are many, many factors in play that determine how much a single server can take (check frequency, type, and success rate, to name a few)
- Using a RAM disk and offloading the database can improve performance:
https://assets.nagios.com/downloads/nag ... giosXI.pdf
https://assets.nagios.com/downloads/nag ... Server.pdf
- For anything relating to log management, I would recommend using Nagios Log Server instead of XI: https://www.nagios.com/products/nagios-log-server/
- ServiceNow has some integrations; check here for details: https://exchange.nagios.org/directory/P ... er/details
- HA/Failover/Distributed setups are not covered by Nagios - though there are solutions documented, we do not provide direct support for them
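To put the 20,000-check guideline next to the numbers in the question, here is a rough back-of-the-envelope sketch in Python. It is only arithmetic, not a sizing tool: the function names are made up for illustration, the assumption that cloud nodes carry the same 40 services per node as on-premise nodes comes from the "40 services * 1300 nodes" estimate in the question, and the 5-minute check interval is purely an assumed value.

```python
import math

# Rough per-server ceiling quoted in the reply above; real capacity
# depends on check frequency, check type, and success rate.
CHECKS_PER_SERVER = 20_000


def total_checks(nodes: int, services_per_node: int) -> int:
    """Total service checks for a group of nodes (host checks ignored)."""
    return nodes * services_per_node


def servers_needed(nodes: int, services_per_node: int,
                   limit: int = CHECKS_PER_SERVER) -> int:
    """Minimum XI servers if checks are split evenly and each stays under `limit`."""
    return math.ceil(total_checks(nodes, services_per_node) / limit)


def checks_per_second(checks: int, interval_seconds: int) -> float:
    """Average check rate if every check runs once per interval."""
    return checks / interval_seconds


# Scenarios taken from the question. The services-per-node figure for the
# cloud is an ASSUMPTION (same 40 per node as on-premise).
scenarios = {
    "on-premise, 40 services/node": (1200, 40),
    "on-premise, 50 services/node": (1200, 50),
    "cloud today": (80, 40),
    "cloud in 2 years": (700, 40),
}

for name, (nodes, services) in scenarios.items():
    checks = total_checks(nodes, services)
    # ASSUMPTION: a 5-minute (300 s) check interval, for illustration only.
    rate = checks_per_second(checks, 300)
    print(f"{name}: {checks} checks, "
          f"{servers_needed(nodes, services)} server(s), "
          f"~{rate:.0f} checks/s at a 5-minute interval")
```

Under these assumptions the on-premise side alone comes out at 48,000 to 60,000 checks, which is above what a single XI server is expected to handle, so the load there would likely need to be split across more than one instance.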
Former Nagios employee