Re: Performance and Crash
Posted: Fri Aug 02, 2019 11:09 am
Well folks... I made a breakthrough last night.
It occurred to me that my vSphere cluster had been getting more loaded in the past few months, so... on a lark...
I updated the CPU "Shares" from Normal to High.
The difference was almost immediate.
I don't exactly understand the relationship, or how this could affect Nagios the way it has... but since I made that change last night I have not received a single false notification from Nagios XI. CPU usage now averages 75%.
This morning, I went a step further and set a vSphere CPU reservation based on the average utilization (in this case 5000 MHz).
Average load on the server dropped even further.
I would really love to understand why this change makes such a dramatic difference to the timeouts Nagios was experiencing.
In any case, thank you for trying to help with this and reviewing my configuration. I think I'm on the road to recovery.
4768 services, 407 hosts. Most service checks run at 10-minute intervals. Host checks run at 5-minute intervals.
4 cores. CPU Share High. CPU reservation 5000MHz.
Total CPU utilization averaging around 50%.
GUI responds well.
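
For a rough sense of scale, the figures above can be turned into an average check rate. This is only a back-of-the-envelope sketch: it assumes every service check runs at the 10-minute interval (the post says "most") and that checks are spread evenly over time. The function name is made up for illustration.

```python
def checks_per_second(services, service_interval_s, hosts, host_interval_s):
    """Average scheduled check rate, assuming each check runs at its nominal interval."""
    return services / service_interval_s + hosts / host_interval_s


# Figures from the post; treating ALL service checks as 10-minute is an assumption.
rate = checks_per_second(
    services=4768, service_interval_s=10 * 60,
    hosts=407, host_interval_s=5 * 60,
)
print(f"{rate:.1f} checks/second")  # prints "9.3 checks/second"
```

So the server is launching roughly nine check processes every second on average, which may be part of why it is sensitive to waiting for CPU time on a busy cluster.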