This Best Practices guide covers the Nagios XI server and the considerations to take into account in relation to its configuration.
XI Server Considerations
Make sure your XI server has its timezone correctly defined.
Configure Timezone
Admin > Manage System Config
When using NTP, make sure it is synchronized with a trusted time source such as pool.ntp.org.
If this is a virtual machine, don't sync its time with the hypervisor (ESXi, Hyper-V etc.)!
Incorrect time can be the source of confusing problems such as:
Apply configuration throwing errors
Performance data not being processed
Having the right number of CPU cores is important, but so too is the speed of those cores. Not all plugins and processes are multi-threaded, so single-threaded work benefits from a higher clock speed. A 3.4GHz CPU will do a lot more than a 2.2GHz one.
Refer to the XI Hardware Requirements guide.
How much memory do you need on an XI system? When all the hosts and services in XI are healthy, far less memory is used than during a major system outage. Event handlers consume memory when XI fires them off, so during a major outage, when a lot of event handlers are being executed, a lot of memory is consumed. It doesn’t take long for 6GB of memory to be used.
Generally speaking you should have at least 50% more memory than needed.
Refer to the XI Hardware Requirements guide.
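As a rough illustration of the 50% headroom guideline, the arithmetic can be sketched as below. This is only a sketch: it assumes "needed" means the peak memory usage you have observed (for example, during an outage), and the function name and the rounding up to a whole GB are illustrative choices, not part of any Nagios XI tooling.

```python
import math


def recommended_memory_gb(peak_usage_gb: float, headroom: float = 0.5) -> int:
    """Return peak usage plus headroom, rounded up to a whole GB.

    peak_usage_gb is an assumption: the highest usage you have measured.
    headroom=0.5 corresponds to the "at least 50% more" guideline.
    """
    if peak_usage_gb <= 0:
        raise ValueError("peak_usage_gb must be positive")
    return math.ceil(peak_usage_gb * (1 + headroom))


if __name__ == "__main__":
    # A server that peaks at 6GB would be sized at 9GB or more.
    print(recommended_memory_gb(6))
```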
Configuring Nagios XI with a RAM Disk is highly recommended as the number of monitored objects increases. The more things you are monitoring, the more disk I/O occurs. By directing this traffic to a RAM Disk, each I/O operation completes drastically faster.
Reduces disk I/O & load
Speeds up processing of performance data
Speeds up processing of spooled check results
Speeds up Nagios restarts
Refer to the XI RAM Disk procedure.
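Once a RAM Disk is configured, it can be worth confirming that the directory really is memory-backed. The sketch below parses text in the Linux /proc/mounts format and reports the filesystem type of the mount containing a path. The path /var/nagiosramdisk is used here purely as an assumed example location; check the XI RAM Disk procedure for the path used on your system. The function names are hypothetical.

```python
from typing import Optional


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the mount that contains `path`.

    mounts_text is expected in /proc/mounts format:
    "device mount_point fs_type options dump pass" per line.
    The longest matching mount point wins; later entries win ties.
    """
    target = path.rstrip("/") + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_ram_disk(path: str, mounts_text: str) -> bool:
    """True if `path` lives on a tmpfs or ramfs mount."""
    return filesystem_type(path, mounts_text) in ("tmpfs", "ramfs")


if __name__ == "__main__":
    # /var/nagiosramdisk is an assumed example path, not a confirmed default.
    with open("/proc/mounts") as handle:
        print(is_on_ram_disk("/var/nagiosramdisk", handle.read()))
```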
SSD storage greatly improves overall performance
Complements RAM Disk
Helps reads/writes with:
Logs
Database
Performance Graphs
Reports
RAID allows for much larger disk capacities than SSD can provide; however, it would be very hard for a spinning disk RAID set to beat the performance of SSD.
Keep in mind that if you implement SSD, you should implement RAID1 sets for redundancy purposes.
rrdcached is a way of accumulating the received performance data and then processing it in a batch job. It helps with larger installations and can reduce I/O; however, it can also result in performance graphs lagging behind the real-time results.
Refer to the XI rrdcached procedure.
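The trade-off described above (fewer writes, but graphs that lag) can be illustrated with a toy model. This is not how rrdcached is implemented; it is a minimal sketch of the general batching idea, and the class name, the flush interval and the file name are all hypothetical.

```python
from collections import defaultdict


class BatchedWriter:
    """Toy model: buffer samples in memory, write each file once per flush."""

    def __init__(self, flush_interval: float):
        self.flush_interval = flush_interval  # seconds between flushes
        self.pending = defaultdict(list)      # file name -> buffered samples
        self.last_flush = 0.0
        self.write_ops = 0                    # simulated disk writes

    def update(self, rrd_file: str, timestamp: float, value: float) -> None:
        """Buffer a sample; flush everything once the interval has elapsed."""
        self.pending[rrd_file].append((timestamp, value))
        if timestamp - self.last_flush >= self.flush_interval:
            self.flush(timestamp)

    def flush(self, now: float) -> None:
        """Write each file with buffered samples once, then clear the buffer."""
        for samples in self.pending.values():
            if samples:
                self.write_ops += 1
        self.pending.clear()
        self.last_flush = now


if __name__ == "__main__":
    writer = BatchedWriter(flush_interval=300)
    for t in range(0, 600, 60):  # ten samples, one per minute
        writer.update("host.rrd", t, 1.0)
    writer.flush(600)
    print(writer.write_ops, "writes instead of 10")
```

In this model the earliest buffered sample waits up to the full flush interval before it reaches disk, which is the source of the graph lag.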
On larger installations there can be a lot more data being written to the databases, which in turn can result in a lot of CPU usage directed away from actual monitoring.
Offloading to a separate server will remove this CPU usage from your monitoring server.
Of course, make sure you monitor the offloaded server!
Disk / CPU / Memory / Tables / Service
Refer to the guide Monitoring the Nagios XI "localhost" for notes about services.
Refer to the XI Offloaded DB procedure.
Define what is important to you in a disaster. Once you have clearly defined goals and outcomes, you can plan appropriately and test.
These presentations cover DR options:
Have you scheduled your backups in Nagios XI?
Admin > System Backups
Schedule backups of XI
Location can be local, FTP, SSH
Remote location recommended. Keep backups on storage that is not local to the XI file system, so you can still get to them if your XI server dies.
Manual Backups
Local Backup Archives via Admin menu
/usr/local/nagiosxi/scripts/backup_xi.sh
The backup and restore procedure is straightforward and allows for a full recovery of your Nagios XI system.
Another good use of it is to migrate XI from one server to another.
Refer to the XI Backup and Restore procedure.
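Scheduling backups is only useful if they keep arriving, so it can help to check the age of the newest backup archive. The sketch below does that for a directory of archives. The directory shown in the usage example, the *.tar.gz pattern and the 24 hour threshold are assumptions for illustration; substitute whatever location and naming your backups actually use.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(
    backup_dir: str,
    pattern: str = "*.tar.gz",  # assumed archive naming
    now: Optional[float] = None,
) -> Optional[float]:
    """Return the age in hours of the newest matching file, or None if none exist."""
    files = [p for p in Path(backup_dir).glob(pattern) if p.is_file()]
    if not files:
        return None
    newest_mtime = max(p.stat().st_mtime for p in files)
    current = time.time() if now is None else now
    return (current - newest_mtime) / 3600.0


def backup_is_fresh(
    backup_dir: str,
    max_age_hours: float = 24.0,  # assumed threshold
    now: Optional[float] = None,
) -> bool:
    """True if at least one backup exists and the newest is recent enough."""
    age = newest_backup_age_hours(backup_dir, now=now)
    return age is not None and age <= max_age_hours


if __name__ == "__main__":
    # Hypothetical mount point for remote backup storage.
    print(backup_is_fresh("/mnt/remote-backups/nagiosxi"))
```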
Final Thoughts
For any support related questions please visit the Nagios Support Forums at:
http://support.nagios.com/forum/
Article ID: 503
Created On: Mon, May 2, 2016 at 11:49 PM
Last Updated On: Mon, Apr 11, 2022 at 12:47 PM
Authored by: tlea
Online URL: https://support.nagios.com/kb/article/nagios-xi-xi-server-considerations-503.html