
100%+ used memory

Posted: Tue Jun 05, 2018 11:51 am
by altsysrq
I am using custom_check_mem to get memory usage of our systems. We use the "-n" switch, which is supposed to not include cached memory in the total used. This is documented here:
https://support.nagios.com/kb/article/m ... s-774.html
"If you do not want the cached memory to be part of the thresholds calculations the -n argument is used:"
Unfortunately, this does not seem to be the case. Even the example given in that document seems to include cached memory. In our situation, the cached value ends up added to the free memory figure, so the plugin often reports more than 100% of total memory. For example:

Code: Select all

root@pxe1:~# /usr/lib/nagios/plugins/custom_check_mem -w 20 -c 10 -n
 OK - 6875 / 3951 MB (174%) Free Memory, Used: 335 MB, Shared: 40 MB, Buffers: 3027 MB, Cached: 3259 MB | total=3951MB free=6875MB used=335MB shared=40 buffers=3027MB cached=3259MB

root@pxe1:~# /usr/lib/nagios/plugins/custom_check_mem -w 20 -c 10
 OK - 3617 / 3951 MB (91%) Free Memory, Used: 334 MB, Shared: 40 MB, Buffers: 3027 MB, Cached: 3259 MB | total=3951MB free=3617MB used=334MB shared=40 buffers=3027MB cached=3259MB
This seems to be a recent issue; we originally added -n to keep cached memory from triggering low-memory alerts. Am I using this setting incorrectly? Could a Linux update have changed how the memory is calculated? Any other explanations are appreciated.

[edit] - not a php file

Re: 100%+ used memory

Posted: Tue Jun 05, 2018 1:00 pm
by scottwilkerson
Could you attach your copy of /usr/lib/nagios/plugins/custom_check_mem?

Also, what OS is this?

Re: 100%+ used memory

Posted: Tue Jun 05, 2018 1:59 pm
by altsysrq
Apologies, here is the file (extension not allowed):

Code: Select all

#!/bin/bash
# Script to check real memory usage
# L.Gill 02/05/06 - V.1.0
# ------------------------------------------
# ########  Script Modifications  ##########
# ------------------------------------------
# Who	 When	   What
# ---    ----      ----
# LGill	 17/05/06  "$percent" lt 1% fix - sed edits dc result beginning with "."
#
#
#!/bin/bash
USAGE="`basename $0` [-w|--warning]<percent free> [-c|--critical]<percent free> [-n|--nocache]"
THRESHOLD_USAGE="WARNING threshold must be greater than CRITICAL: `basename $0` $*"
calc=/tmp/memcalc
percent_free=/tmp/mempercent
critical=""
warning=""
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
nocache=0
# print usage
if [[ $# -lt 4 ]]
then
	echo ""
	echo "Wrong Syntax: `basename $0` $*"
	echo ""
	echo "Usage: $USAGE"
	echo ""
	exit $STATE_UNKNOWN
fi
# read input
while [[ $# -gt 0 ]]
  do
        case "$1" in
               -w|--warning)
               shift
               warning=$1
        ;;
               -c|--critical)
               shift
               critical=$1
        ;;

               -n|--nocache)
               nocache=1
        ;;
        esac
        shift
  done
# verify input
if [[ $warning -eq $critical || $warning -lt $critical ]]
then
	echo ""
	echo "$THRESHOLD_USAGE"
	echo ""
        echo "Usage: $USAGE"
	echo ""
        exit $STATE_UNKNOWN
fi

memoutput=`free -m| head -2 | tail -1`

# Total memory available
#total=`free -m | head -2 |tail -1 |gawk '{print $2}'`
total=`echo $memoutput | gawk '{print $2}'`
# Total memory used
#used=`free -m | head -2 |tail -1 |gawk '{print $3}'`
used=`echo $memoutput | gawk '{print $3}'`
# Calc total minus used
#free=`free -m | head -2 |tail -1 |gawk '{print $2-$3}'`
if [ "$nocache" -eq "1" ]; then
    free=`echo $memoutput | gawk '{print $2-$3+$7}'`
else
    free=`echo $memoutput | gawk '{print $2-$3}'`
fi

shared=`echo $memoutput | gawk '{print $5}'`
buffers=`echo $memoutput | gawk '{print $6}'`
cached=`echo $memoutput | gawk '{print $7}'`
# free=$free-$cached
# normal values
#echo "$total"MB total
#echo "$used"MB used
#echo "$free"MB free

# make it into % percent free = ((free mem / total mem) * 100)
echo "5" > $calc # decimal accuracy
echo "k" >> $calc # commit
echo "100" >> $calc # multiply
echo "$free" >> $calc # division integer
echo "$total" >> $calc # division integer
echo "/" >> $calc # division sign
echo "*" >> $calc # multiplication sign
echo "p" >> $calc # print
percent=`/usr/bin/dc $calc|/bin/sed 's/^\./0./'|/usr/bin/tr "." " "|/usr/bin/gawk {'print $1'}`
#percent1=`/usr/bin/dc $calc`
#echo "$percent1"
if [[ "$percent" -le  $warning ]]
        then
    string="WARNING"
    result=1
fi
if [[ "$percent" -le  $critical ]]
    then
    string="CRITICAL"
    result=2
fi
if [[ "$percent" -gt  $warning ]]
    then
    string="OK"
    result=0
fi

echo "$string - $free / $total MB ($percent%) Free Memory, Used: $used MB, Shared: $shared MB, Buffers: $buffers MB, Cached: $cached MB | total="$total"MB free="$free"MB used="$used"MB shared="$shared"$MB buffers="$buffers"MB cached="$cached"MB"
exit $result
This is on an Ubuntu 16.04 system.
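
For comparison, the percentage step in the script above (a dc program written to /tmp/memcalc, then truncated with sed, tr, and gawk) could probably be done with shell integer arithmetic instead. This is only a sketch: percent_free is a hypothetical helper, not part of the plugin, and it assumes whole-megabyte inputs as printed by free -m.

```shell
# Hypothetical helper, not part of the posted plugin.
# Prints the integer percentage of free memory, truncated toward zero,
# which is what the dc/sed/tr/gawk pipeline ends up producing.
percent_free() {
    free_mb=$1
    total_mb=$2
    echo $(( free_mb * 100 / total_mb ))
}

percent_free 3617 3951   # values from the run without -n; prints 91
```

Results below 1% come out as 0, so the special case for a dc result starting with "." would not be needed in this form.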

Re: 100%+ used memory

Posted: Tue Jun 05, 2018 3:01 pm
by scottwilkerson
I just re-checked how this plugin works. On Ubuntu, it looks like the 7th column of the 'free -m' output that the plugin parses holds the total available memory instead of the cached memory.

I'll submit a bug report, but I don't have a fix for that at this time.
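
To make the numbers concrete, here is a rough reconstruction of the arithmetic behind the 174% reading, using the values from the -n run in the first post. It assumes, per the explanation above, that column 7 on this system is "available" rather than "cached"; the variable names are only illustrative.

```shell
# Values taken from the -n run shown in the first post.
total=3951   # column 2 of the "Mem:" line
used=335     # column 3
col7=3259    # column 7: reported by the plugin as "Cached"

# What the plugin computes with -n: total - used + column 7
free_mb=$(( total - used + col7 ))

# Integer percentage, truncated the same way the plugin truncates it
percent=$(( free_mb * 100 / total ))

echo "free=${free_mb}MB percent=${percent}%"   # prints free=6875MB percent=174%
```

Because total - used already includes the cache on this version of 'free', adding column 7 on top pushes the result past the total.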

Re: 100%+ used memory

Posted: Tue Jun 05, 2018 3:02 pm
by npolovenko
@altsysrq, please open the plugin with a text editor and change:

Code: Select all

if [ "$nocache" -eq "1" ]; then
    free=`echo $memoutput | gawk '{print $2-$3+$7}'`
to

Code: Select all

if [ "$nocache" -eq "1" ]; then
    free=`echo $memoutput | gawk '{print $2-$3-$7}'`
So instead of +$7 there should be -$7.
Let me know if this fixes the problem.

Re: 100%+ used memory

Posted: Tue Jun 05, 2018 4:54 pm
by altsysrq
That won't work. The 'free' command in Ubuntu 16.04 comes from procps-ng 3.3.10 and doesn't count cache as part of used memory. The 16.04 man page defines used memory as:
used Used memory (calculated as total - free - buffers - cache)
Additionally, custom_check_mem expects cache to be in column 7. In 16.04 that information is in column 6, combined with buffers as buff/cache (in 14.04, cached is in column 7).

Ubuntu 16.04 (free from procps-ng 3.3.10):

Code: Select all

root@pxe1:~# free -m
              total        used        free      shared  buff/cache   available
Mem:           3951         334         529          40        3087        3257
Swap:          4091           0        4091
Ubuntu 14.04 (free from procps-ng 3.3.9):

Code: Select all

root@cacher:~# free -m
             total       used       free     shared    buffers     cached
Mem:          2000       1654        346          0        213       1068
-/+ buffers/cache:        372       1628
Swap:         1019          2       1017
The difference seems to come from the two versions of 'free' in the procps-ng project/package. (Red Hat 7 also uses procps-ng 3.3.10, so the problem exists there too.)

If you can think of a quick solution to this, let me know.
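
One possible direction, offered only as a sketch: look columns up by their header name instead of by position, so the same code reads both layouts shown above. mem_field is a hypothetical helper, not something in the plugin, and it uses plain awk rather than gawk on the assumption that either is available.

```shell
# Hypothetical helper: print the value under a named column of the "Mem:" row.
# Reads `free -m` style output on stdin; prints nothing if the column is absent.
mem_field() {
    awk -v name="$1" '
        # The header row has no "Mem:" label, so data columns are shifted by one.
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i + 1 }
        $1 == "Mem:" && (name in col) { print $(col[name]) }
    '
}

# Sample outputs copied from the two systems above.
new_output='              total        used        free      shared  buff/cache   available
Mem:           3951         334         529          40        3087        3257
Swap:          4091           0        4091'

old_output='             total       used       free     shared    buffers     cached
Mem:          2000       1654        346          0        213       1068
-/+ buffers/cache:        372       1628
Swap:         1019          2       1017'

echo "$new_output" | mem_field buff/cache   # prints 3087
echo "$old_output" | mem_field cached       # prints 1068
```

An empty result for "cached" would then indicate the newer layout, without needing to know the version number.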

Re: 100%+ used memory

Posted: Wed Jun 06, 2018 11:01 am
by npolovenko
@altsysrq, Oh, I see. In that case, we'll probably have to stick with the bug report.

Re: 100%+ used memory

Posted: Wed Jun 06, 2018 1:34 pm
by altsysrq
For reference, on an Ubuntu 18.04 test system:

Code: Select all

root@test:~# free -V
free from procps-ng 3.3.12
root@test:~# free -m
              total        used        free      shared  buff/cache   available
Mem:            962         179         465           2         317         634
Swap:          1923           0        1923

Re: 100%+ used memory

Posted: Wed Jun 06, 2018 2:14 pm
by scottwilkerson
Thanks

Re: 100%+ used memory

Posted: Thu Jun 14, 2018 4:26 pm
by altsysrq
I think we are settling on this code as a simple workaround. We are using Puppet to deploy the update, so it should go pretty smoothly. I'm posting it here to help others and to get feedback.

Replace lines 74-78 of custom_check_mem (the if/else block that sets free) with:

Code: Select all

free_version=`free -V | gawk '{print $4}'`
free_requiredver=3.3.10

if [ "$(printf '%s\n' "$free_requiredver" "$free_version" | sort -V | head -n1)" = "$free_requiredver" ]; then
       #echo "Greater than or equal to $free_requiredver"
       if [ "$nocache" -eq "1" ]; then
         #echo "nocache"
         free=`echo $memoutput | gawk '{print $2-$3}'`
       else
         #echo "cache"
         free=`echo $memoutput | gawk '{print $2-$3-$6}'`
       fi
else
       #echo "Less than $free_requiredver"
       if [ "$nocache" -eq "1" ]; then
         #echo "nocache"
         free=`echo $memoutput | gawk '{print $2-$3+$7}'`
       else
         #echo "cache"
         free=`echo $memoutput | gawk '{print $2-$3}'`
       fi
fi
Let me know what you think or if you have any questions.

[edit] source is from here: https://unix.stackexchange.com/a/285928
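
For anyone adapting the version check, the sort -V comparison could be wrapped in a small function so it is easier to test on its own. This is a sketch under assumptions: version_ge is a hypothetical name, and it depends on GNU sort's -V option just as the workaround does.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# sort -V orders version strings numerically by component, so the smaller
# version comes out first; if that is $2, then $1 >= $2.
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

version_ge 3.3.12 3.3.10 && echo "3.3.12: new column layout"
version_ge 3.3.9  3.3.10 || echo "3.3.9: old column layout"
```

A plain string comparison would get this wrong, since "3.3.9" sorts after "3.3.10" lexically.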