Nagios XI stability issues
Posted: Thu Mar 19, 2020 5:39 pm
Hello,
System: CentOS 7.7, 8 CPUs, 16 GB RAM, sufficient disk space
Total checks: 8000
I have been having Nagios XI stability issues and am trying to figure out what needs to be done or changed to get it more stable.
I have set max concurrent jobs to 60 and repaired the databases. I tweaked parameters for a large setup (such as reaper frequency/time), switched to the unified tactical overview, increased the refresh multiplier to 10 times the default, and disabled auto-running reports and metrics on page load. ndo2db has been showing "max retries exceeded", so I set the msgmni value.
It's still not stable. I am also seeing wproc-related messages:
nagios[26891]: wproc: iocache_read() from Core Worker 26901 returned -1: Bad address
I see the monitoring engine stop every midnight. I have seen it stop randomly during the day.
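To check whether the wproc messages really cluster around midnight or are spread through the day, a rough tally by hour could help. This is only a sketch: it assumes the messages land in a standard syslog file with lines shaped like "Mar 19 00:00:01 host nagios[PID]: wproc: ...", and the function name `wproc_errors_by_hour` is made up for illustration.

```python
import re
from collections import Counter

# Assumed syslog layout (not confirmed for this system):
#   "Mar 19 00:00:01 host nagios[26891]: wproc: iocache_read() ..."
# The only capture group is the hour of day.
LINE_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(\d{2}):\d{2}:\d{2}\s+\S+\s+nagios\[\d+\]:\s+wproc:"
)


def wproc_errors_by_hour(lines):
    """Count nagios wproc messages per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Example usage (the log path is an assumption; adjust to where syslog writes):
#   with open("/var/log/messages", errors="replace") as fh:
#       for hour, n in sorted(wproc_errors_by_hour(fh).items()):
#           print(f"{hour:02d}:00  {n}")
```

If nearly all hits fall in hour 00, that would point toward something scheduled at midnight rather than general load; a flat spread would suggest otherwise.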
I am not sure what my best approach is to get this stable.
Thank you,
Vinod