
WARNING: RLIMIT_NPROC

Posted: Wed Mar 10, 2021 4:20 am
by andyb4u
Hi,

I've been asked to find out what these messages in the event log mean and whether they are anything to be concerned about.

Code:

WARNING: RLIMIT_NPROC is 63448, total max estimated processes is 147241! You should increase your limits (ulimit -u, or limits.conf)

Re: WARNING: RLIMIT_NPROC

Posted: Wed Mar 10, 2021 5:48 pm
by dchurch
This warning is meant to alert you when you're running so many checks that they may overload the system. Here's how it calculates this number.

Linux limits how many processes (threads) can run at once: RLIMIT_NPROC caps them per user (63448 in your case), and kernel.pid_max caps PIDs system-wide. New processes may fail to start once a limit is reached.
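
To see the limit the warning compares against, here is a minimal sketch using Python's standard `resource` module (Linux/Unix only). The helper names `nproc_limits` and `describe` are made up for illustration. The limit is per user, so run it as the nagios user; the soft value is what `ulimit -u` prints.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print(f"RLIMIT_NPROC soft={describe(soft)} hard={describe(hard)}")
```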

I must ask: How many hosts and services are you checking? How often are those checks going out? Every minute? How's the CPU and network usage on the Nagios XI machine (run top)?

You could greatly lessen the load on the system by increasing the delay between checks. It might even be a good idea for checks like disk usage and SSL cert expiration to have a much longer delay between checks -- say 1 hour to 1 day.

Re: WARNING: RLIMIT_NPROC

Posted: Thu Mar 11, 2021 5:09 am
by andyb4u
We have 1508 hosts and 13566 service checks. The majority of the checks run every 5 minutes. Some checks run once a day.

We have an offloaded database and the checks are offloaded to 3 Mod-Gearman worker servers.
Attachments: top.jpg, Server Statistics.jpg, Monitoring Performance.jpg

Re: WARNING: RLIMIT_NPROC

Posted: Thu Mar 11, 2021 3:55 pm
by dchurch
Since you're using Mod Gearman, the warning doesn't describe a problem that can happen on your system. The warning assumes all checks are executed on the same server, whereas Mod Gearman splits the load across multiple worker servers.

Warnings are just warnings after all. Your system configuration is fine - it doesn't seem like it'll come anywhere near the process limit.
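
As a rough sanity check on that conclusion, the figures from this thread can be put side by side. This sketch assumes the worst case: every host and service check running at the same moment on one server, with one process per check. That is a simplification (a check may spawn more than one process, and only most checks use the 5-minute interval), and it is not the formula Nagios Core uses for its estimate.

```python
HOSTS = 1508          # from the post above
SERVICES = 13566      # from the post above
RLIMIT_NPROC = 63448  # value reported in the warning
INTERVAL_MIN = 5      # assumption: treat every check as a 5-minute check


def worst_case_processes(hosts, services):
    """Upper bound if every check ran at once, one process each."""
    return hosts + services


def average_checks_per_second(checks, interval_minutes):
    """Average start rate if checks are spread evenly over the interval."""
    return checks / (interval_minutes * 60)


total = worst_case_processes(HOSTS, SERVICES)
print(total, total < RLIMIT_NPROC)                            # 15074 True
print(round(average_checks_per_second(total, INTERVAL_MIN), 1))  # 50.2
```

Even this pessimistic count stays well below the reported limit, and in this setup the checks are executed by the Mod-Gearman workers rather than on the Nagios XI server itself.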

Re: WARNING: RLIMIT_NPROC

Posted: Tue Mar 16, 2021 10:09 am
by andyb4u
That's good news.

Is there a way of increasing the limits, like the warning message says?

My line manager would like to know if there is some way to make the message disappear from our system.

Re: WARNING: RLIMIT_NPROC

Posted: Wed Mar 17, 2021 4:20 pm
by ssax
The calculation is invalid because Nagios doesn't run all the checks at the same time, so that message should be ignored (this was told to me by the dev who wrote that RLIMIT_NPROC check into Core when I asked him about it).

There isn't currently a way to remove it from your system other than raising your nproc limits above the value listed in the warning. For example, you can add these lines to your /etc/security/limits.conf:

Code:

root hard nproc 250000
root soft nproc 250000
nagios hard nproc 250000
nagios soft nproc 250000
Then reboot the system to pick up the limit changes and see if that warning is gone.
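
Before rebooting, it may be worth confirming that the new entries actually exceed the estimate from the warning (147241). The sketch below parses limits.conf-style lines of the form `<domain> <type> <item> <value>`. `parse_nproc` is a hypothetical helper, not part of any Nagios or PAM tooling, and it ignores details such as `-` types, wildcard domains, and files under /etc/security/limits.d.

```python
ESTIMATED_MAX = 147241  # "total max estimated processes" from the warning

# The entries suggested above; read the real file instead if preferred.
SAMPLE = """\
root hard nproc 250000
root soft nproc 250000
nagios hard nproc 250000
nagios soft nproc 250000
"""


def parse_nproc(text):
    """Map (domain, type) -> nproc value for plain numeric entries."""
    limits = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) == 4 and fields[2] == "nproc" and fields[3].isdigit():
            limits[(fields[0], fields[1])] = int(fields[3])
    return limits


limits = parse_nproc(SAMPLE)
# Any entry at or below the estimate would leave the warning in place.
too_low = {key: value for key, value in limits.items() if value <= ESTIMATED_MAX}
print(too_low or "all nproc entries exceed the estimate")
```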