oom kill nagios

Posted: Tue Jan 29, 2019 10:03 am
by mon-team
Hello Support,
Over the last two months we have had several issues with the OOM killer. I read some other topics on this, but I didn't find anything useful for troubleshooting the issue.
Below are some lines from /var/log/messages from the last crash:

kern.warning: Jan 28 19:14:25 tor1bld0421u kernel:httpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
kern.warning: Jan 28 19:14:33 tor1bld0421u kernel:php invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
kern.warning: Jan 28 19:14:34 tor1bld0421u kernel:php invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
kern.warning: Jan 28 19:14:51 tor1bld0421u kernel:php invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
kern.warning: Jan 28 19:15:11 tor1bld0421u kernel:nsca invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
kern.warning: Jan 28 19:15:12 tor1bld0421u kernel:php invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0

mysqld and postgresql also crashed within the same few minutes. We have embedded Perl disabled, NSCA is at version 2.9, and used RAM doesn't exceed the 3GB over 16GB available.
We are running Nagios XI 2014 R.2.7 on CentOS 6.6, with Nagios Core version 4.0.8.
What would you suggest?
Thanks,
Francesco

Re: oom kill nagios

Posted: Tue Jan 29, 2019 3:18 pm
by scottwilkerson
mon-team wrote: used RAM doesn't exceed the 3GB over 16GB available.
This may be what the last reading was, but if the oom-killer was invoked, the system was out of RAM.

As a first step, I would highly recommend rebooting the server if you have not already.

Next comes the trickier part: working out what is using all the RAM in the moments just before the oom-killer is invoked.
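One way to catch that moment is to leave a small logger running that records the biggest memory consumers every few seconds, so the last entries before a crash show who was holding the RAM. The following is only a rough sketch, not an official Nagios tool: it assumes a Python interpreter is available on the host, it reads the standard /proc/meminfo and /proc/<pid>/status files, and the log path, interval and function names are made up for illustration.

```python
"""Periodically log the largest memory consumers (illustrative sketch)."""
import os
import time


def parse_kb_fields(text):
    """Parse 'Key:   123 kB' lines, the layout used by /proc/meminfo
    and by the Vm* fields of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def summarize_status(pid, text):
    """Return (rss_kb, pid, name) from the text of a status file,
    or None when there is no VmRSS field (e.g. kernel threads)."""
    name = "?"
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
            break
    rss = parse_kb_fields(text).get("VmRSS")
    if rss is None:
        return None
    return (rss, pid, name)


def read_process(pid):
    """Read one process; returns None if it exited in the meantime."""
    try:
        with open("/proc/%d/status" % pid) as fh:
            return summarize_status(pid, fh.read())
    except (IOError, OSError):
        return None


def top_consumers(n):
    """Return the n processes with the largest resident set size."""
    procs = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            info = read_process(int(entry))
            if info is not None:
                procs.append(info)
    procs.sort(reverse=True)
    return procs[:n]


def format_snapshot(meminfo, procs):
    """Render one timestamped snapshot as plain text."""
    keys = ("MemTotal", "MemFree", "Buffers", "Cached", "SwapTotal", "SwapFree")
    lines = [time.strftime("%Y-%m-%d %H:%M:%S")]
    lines.append(" ".join("%s=%skB" % (k, meminfo.get(k, "?")) for k in keys))
    for rss, pid, name in procs:
        lines.append("  %10d kB  pid=%-7d %s" % (rss, pid, name))
    return "\n".join(lines) + "\n"


def run(log_path="/var/tmp/mem-snapshots.log", interval=10, top_n=15,
        iterations=None):
    """Append a snapshot every `interval` seconds.
    The log path is a hypothetical example; iterations=None loops forever."""
    count = 0
    while iterations is None or count < iterations:
        with open("/proc/meminfo") as fh:
            meminfo = parse_kb_fields(fh.read())
        with open(log_path, "a") as out:
            out.write(format_snapshot(meminfo, top_consumers(top_n)))
        count += 1
        time.sleep(interval)
```

Calling run() from a small wrapper script (started in the background or under screen) keeps appending snapshots; after the next oom-killer event, the entries just before the timestamps in /var/log/messages should show which processes had grown. The log is opened and closed on every write so the latest entries are more likely to survive a crash.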