Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
If you think there is a bug in NCPA, please post it on GitHub so that I can take a look at it. However, I believe both of these values are correct. Keep in mind that NCPA reports the total percent used (via Python's psutil module), which is most likely calculated as amount used minus cache minus buffers, or something along those lines. Calculating the percent used (or free) on Linux is frustrating (see this website), since different tools use different calculations that include or exclude buffered and cached memory. Since NRPE relies on plugins, which plugin is it using to give you the free amount?
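To illustrate how much the choice of formula matters, here is a minimal Python sketch that derives "percent used" three different ways from /proc/meminfo-style text. The function names and the SAMPLE values are made up for illustration, and it assumes a Linux kernel recent enough to expose MemAvailable; it is not the calculation any particular plugin is confirmed to use.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent_variants(fields):
    """Return 'percent used' under three common definitions."""
    total = fields["MemTotal"]
    free = fields["MemFree"]
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    variants = {
        # Buffers and cache count as used memory.
        "total - free": total - free,
        # Buffers and cache are excluded from used memory.
        "total - free - buffers - cached": total - free - cache,
    }
    if "MemAvailable" in fields:
        # Kernel's own estimate of what can be handed to applications.
        variants["total - available"] = total - fields["MemAvailable"]
    return {k: round(100.0 * v / total, 1) for k, v in variants.items()}


# Made-up sample values, used only when /proc/meminfo is not readable.
SAMPLE = """\
MemTotal:        1000000 kB
MemFree:           50000 kB
MemAvailable:     500000 kB
Buffers:          100000 kB
Cached:           380000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE
    print(used_percent_variants(parse_meminfo(text)))
```

With the made-up sample, the same machine is 95%, 47%, or 50% "used" depending on which definition is applied.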
I guess we need to audit all the plugins plus NRPE to see that they're using the same methodologies. If NRPE's generic install says X%, but NCPA's generic install says Y% on the same system, then the two are not interchangeable and will affect the thresholds and templates that we deploy for customers. This is all based on our testing of converting NRPE-based checks to NCPA-based checks in advance of doing this for a few tens of thousands of service checks for customers.
Here are my updates. Yes, I know about the "what is free memory" problem in Linux, so I calculated my own numbers. Using three commands run in quick succession on localhost, I checked memory via NCPA (curl), NRPE (ran the custom_check_mem module) and "free -m" (which is actually what custom_check_mem is doing). Here are the results:
First, "free -m":
I can calculate that the total memory is 1006MB, there is 512MB available (but partially allocated to cache and buffers) and that there is 961MB total in use, of which 494MB is in use by buffers and cache. This gives me 4.5% completely free memory that is doing nothing, but 50.9% available that can be put to use by draining cache and/or buffers. Only 46.4% of memory is allocated to OS and applications.
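The arithmetic behind those three percentages can be checked with a few lines of Python. The variable names are my own; the megabyte figures are the ones quoted above from the "free -m" sample.

```python
# Figures quoted from the "free -m" sample, in MB.
total_mb = 1006
available_mb = 512
in_use_mb = 961
buffers_cache_mb = 494


def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Memory that is not in use by anything at all.
completely_free_pct = pct(total_mb - in_use_mb, total_mb)
# Memory that could be put to use, including drainable cache and buffers.
available_pct = pct(available_mb, total_mb)
# Memory held by the OS and applications, excluding buffers and cache.
os_and_apps_pct = pct(in_use_mb - buffers_cache_mb, total_mb)

print(completely_free_pct, available_pct, os_and_apps_pct)
```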
NCPA:
It comes back with /api/memory/virtual/percent as 49.2%. This is the percentage used, and it is correct if "free" means available memory. Generally speaking, that is what people think of as "memory that can be used for something", and it counts memory currently held by caches and buffers. While /api/memory/virtual/free and /api/memory/virtual/total can be used to calculate a percentage (in this case, 4.4%) of absolutely free memory that is not in use by anything, this is not what "percent" is based on. It is based on .../available divided by .../total instead. This makes intuitive sense, and works out to 50.8% in this case: exactly what it should be if /api/memory/virtual/percent is 49.2%, and in line with my "free -m" sample from above.
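Since NCPA gets these values from psutil, the two percentages can be compared side by side with a short Python sketch. psutil.virtual_memory() and its total, available, free and percent fields are real psutil API; the helper function and its dictionary keys are hypothetical names chosen for this example, and the psutil call is skipped if the module is not installed.

```python
def memory_percentages(vm):
    """Derive both 'percent' flavours from an object exposing
    total, available and free attributes (as psutil's result does)."""
    return {
        # Matches psutil's documented 'percent': (total - available) / total.
        "used_vs_available": round(100.0 * (vm.total - vm.available) / vm.total, 1),
        # Share of memory that could be handed to applications.
        "percent_available": round(100.0 * vm.available / vm.total, 1),
        # Share of memory not allocated to anything, caches included as used.
        "percent_free": round(100.0 * vm.free / vm.total, 1),
    }


if __name__ == "__main__":
    try:
        import psutil
    except ImportError:
        print("psutil is not installed; skipping the live reading")
    else:
        vm = psutil.virtual_memory()
        print(memory_percentages(vm), "psutil percent:", vm.percent)
```

On a system with a lot of cached data, percent_free will be far smaller than percent_available, which is the gap described in this post.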
NRPE (aka "custom_check_mem"):
This comes back with 4% free memory. This is correct only if you count memory that is not allocated for anything at all, meaning it considers buffers and caches to be in-use memory. This is the same as NCPA's .../free divided by .../total.
Conclusion:
The problem is that both custom_check_mem and NCPA are returning correct values for what they're measuring, but they're measuring different things. NCPA is checking available memory while custom_check_mem is checking free memory. They are not the same thing, and there is an order of magnitude difference between them. I like having options, but if I change from one standard memory check (say, NRPE's custom_check_mem) to another (like NCPA's check), I would expect the checks to measure the same thing and report the same numbers. Otherwise, my thresholds, capacity planning, and perf data are going to show weird results.
Is it possible to get percent_free added to NCPA so that it can return the same results as the NRPE check? I'll reference this forum post in my GitHub ticket.