Performance issue after kernel upgrade
Posted: Fri Jan 05, 2018 4:06 pm
by reincarne
Hi,
With the recent Intel hardware security issues, we applied the required patches. However, as expected, we ran into severe performance issues (load spiked to 30%-40%). Have you already seen such issues in the last two days, and do you have any suggestions?
Thanks in advance.
Re: Performance issue after kernel upgrade
Posted: Mon Jan 08, 2018 10:43 am
by kyang
I have seen the news about the Intel Security updates. I'm not entirely sure, but it's certainly possible Intel will be looking into this if everyone is having performance issues.
In the meantime, are you still seeing these performance spikes now? Are you running a VM from Windows or a standalone server?
Where is the load coming from, a specific process? What's the output of top? How about any notable logs or is this mainly an "Intel update" thing?
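If it helps narrow down where the load is coming from, here is a rough Python sketch that ranks processes by the CPU time recorded in `/proc/<pid>/stat`. It is only an illustration, not a replacement for `top`: the function names are made up for this example, the numbers are cumulative clock ticks since each process started rather than a current rate, and short-lived check processes will mostly have exited before they can be listed.

```python
import os


def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, command, cpu_ticks)."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the first " (" and the last ") ".
    pid_part, rest = line.split(" (", 1)
    comm, rest = rest.rsplit(") ", 1)
    fields = rest.split()
    # After the command name, utime and stime sit at offsets 11 and 12.
    return int(pid_part), comm, int(fields[11]) + int(fields[12])


def top_cpu_processes(limit=10, proc_root="/proc"):
    """Return up to `limit` (pid, command, cpu_ticks) tuples, busiest first."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                results.append(parse_stat(handle.read()))
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime or is unreadable
    results.sort(key=lambda item: item[2], reverse=True)
    return results[:limit]


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm, ticks in top_cpu_processes():
            print(f"{pid:>7} {ticks:>10} {comm}")
```

Running it twice a few seconds apart and comparing the tick counts gives a rough idea of which long-running processes are accumulating CPU time.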
Re: Performance issue after kernel upgrade
Posted: Mon Jan 08, 2018 11:58 am
by SDK
Hi,
From what I know, Linux is implementing KPTI (Kernel Page Table Isolation) to mitigate the Meltdown variant of the exploit.
In general, this doesn't cause significant performance degradation if the processes on the system mostly stay in userspace.
Because Nagios spawns a lot of processes, scripts, and runtimes to execute checks, I suspect there are a lot of syscalls happening.
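To put a rough number on that suspicion, a small Python micro-benchmark along these lines could be run before and after patching on the same machine. It is only a sketch: the function names and iteration counts are arbitrary choices for this example, and the timings include Python interpreter overhead, so only the before/after difference means anything.

```python
import os
import subprocess
import sys
import time


def time_syscalls(iterations=100_000):
    """Return average seconds per os.getppid() call (one cheap syscall each)."""
    start = time.perf_counter()
    for _ in range(iterations):
        os.getppid()
    return (time.perf_counter() - start) / iterations


def time_spawns(iterations=20):
    """Return average seconds to start and reap a trivial child process."""
    start = time.perf_counter()
    for _ in range(iterations):
        # Spawning an interpreter that does nothing mimics a very cheap check.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    print(f"avg syscall time: {time_syscalls() * 1e9:.0f} ns")
    print(f"avg spawn time:   {time_spawns() * 1e3:.2f} ms")
```

The spawn figure is probably the more relevant one for a monitoring server, since every check that forks a plugin pays that cost.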
I will quote wikipedia here:
KPTI fixes these leaks by separating user-space and kernel-space page tables entirely. On processors that support the process-context identifiers (PCID), a translation lookaside buffer (TLB) flush can be avoided,[4] but even then it comes at a significant performance cost, particularly in syscall-heavy and interrupt-heavy workloads
This is the reason I haven't upgraded the OS in our environment. We can't afford to slow down our Nagios system.
Kind regards
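For anyone who wants to check their own box, here is a minimal Python sketch that reports whether the kernel says it is mitigating Meltdown and whether the CPU advertises PCID. It assumes a Linux kernel that exposes `/sys/devices/system/cpu/vulnerabilities/meltdown`; some early patched kernels may not have that file, in which case the script just says so. The helper names are invented for this example.

```python
from pathlib import Path

# Assumed locations on a reasonably recent Linux kernel.
MELTDOWN_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")
CPUINFO = Path("/proc/cpuinfo")


def cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def read_text_or_none(path):
    """Return the file contents, or None if the file cannot be read."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_text_or_none(MELTDOWN_STATUS)
    print("Meltdown status:", status.strip() if status else "not reported by this kernel")
    cpuinfo = read_text_or_none(CPUINFO)
    if cpuinfo is not None:
        flags = cpu_flags(cpuinfo)
        print("PCID supported:", "pcid" in flags)
        print("INVPCID supported:", "invpcid" in flags)
```

On a virtual machine the flags reflect what the hypervisor passes through, so a guest may not see PCID even if the host CPU has it.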
Re: Performance issue after kernel upgrade
Posted: Mon Jan 08, 2018 1:47 pm
by kyang
Thanks for the input, @SDK.
Re: Performance issue after kernel upgrade
Posted: Mon Jan 08, 2018 5:28 pm
by yo_marc
reincarne, can you tell us how many hosts and services you are monitoring? And perhaps some hardware info regarding your Nagios server?
I'm just a fellow admin. I patched a 5.4.11 XI Linux server today running a very light load of about 200 hosts and services combined (CentOS 7 VM, 8 Intel Xeon cores, 8 GB RAM). This particular server hasn't shown any hint of slowness. It would be helpful to hear at what load and on what hardware specs a slowdown shows up.
Thanks!
Re: Performance issue after kernel upgrade
Posted: Mon Jan 08, 2018 5:50 pm
by dwhitfield
Obviously we're duct-taping a hole in the boat, but taking a look at https://assets.nagios.com/downloads/nag ... ios-XI.pdf is better than nothing until Intel gets their chip together.
Re: Performance issue after kernel upgrade
Posted: Sun Jan 14, 2018 9:34 am
by reincarne
Hi,
For those who are using AWS: we solved it a day later by creating an HVM machine, and we have been stable since then.
About number of hosts - 1700
Services - 30000
Re: Performance issue after kernel upgrade
Posted: Mon Jan 15, 2018 12:02 pm
by dwhitfield
@reincarne, as OP, do you think this is ready to lock up? If not, what other questions do you have?