CPU_load

kabamaru
Posts: 11
Joined: Thu Sep 14, 2023 4:45 am

CPU_load

Post by kabamaru »

Hello

I'm puzzled by the following.

I'm trying to monitor the CPU load of a server and receive a warning when the average CPU load is over 80% and a critical alert when it's over 90%. To do this I added this line to the host's nrpe.cfg:

command[check_load]=/usr/local/nagios/libexec/check_load -r -w 38.4,38.4,38.4 -c 43.2,43.2,43.2

I got these values using the formula y = c * p / 100, where c is the number of CPUs and p is the desired threshold percentage.
The server has 48 CPUs, so for the 80% warning value I used 48 * 80 / 100 = 38.4.
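The critical value follows the same way:

48 * 90 / 100 = 43.2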

Nagios is reporting this:

CPU OK 01-15-2024 16:19:27 0d 1h 7m 11s 1/3 OK - load average per CPU: 1.02, 1.02, 1.02

But when I run uptime on the server, I see: load average: 49.65, 49.12, 48.91

I'm confused; there's something I don't fully understand here. Shouldn't the load average match the one displayed by uptime?

Many thanks for your help.
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: CPU_load

Post by jsimon »

Hi @kabamaru,

Looking at the documentation for the check_load plugin, the "-r" parameter you are using tells the command to "Divide the load averages by the number of CPUs (when possible)". Given that you have 48 CPUs, could that account for the discrepancy?
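As a quick sanity check with the numbers from your post:

49.65 / 48 ≈ 1.03

which is right in line with the 1.02 per-CPU values Nagios is reporting.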
kabamaru
Posts: 11
Joined: Thu Sep 14, 2023 4:45 am

Re: CPU_load

Post by kabamaru »

Hi jsimon

Thank you for your time. You are right.
I removed the -r flag, and the values Nagios reports are now very close to the ones uptime outputs.
They can't be exactly the same because the load averages are constantly changing.
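For anyone who finds this thread later, this is the line I ended up with:

command[check_load]=/usr/local/nagios/libexec/check_load -w 38.4,38.4,38.4 -c 43.2,43.2,43.2

If I've understood the -r flag correctly, keeping it and switching to per-CPU thresholds should be equivalent:

command[check_load]=/usr/local/nagios/libexec/check_load -r -w 0.8,0.8,0.8 -c 0.9,0.9,0.9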

Thank you so much for your help.
All the best
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: CPU_load

Post by jsimon »

I'm glad we were able to figure that out for you! I'll go ahead and lock the thread then.
Locked