custom_check_mem reports incorrect data & status to NagiosXI
-
abhijitderle
- Posts: 6
- Joined: Fri May 31, 2019 1:31 am
custom_check_mem reports incorrect data & status to NagiosXI
Hello there,
NagiosXI - v5.6.2
Nagios Core - v4.4.3
I noticed a weird behavior when monitoring Memory for an RHEL 7.x host from NagiosXI server through NRPE and default check_mem command (available from /usr/local/nagios/etc/nrpe/common.cfg).
When I run the script locally on the host it reports data correctly as shown below:
[root@xxxxx ~]# grep check_mem /usr/local/nagios/etc/nrpe/common.cfg
command[check_linux_mem]=/usr/local/nagios/libexec/custom_check_mem $ARG1$
[root@xxxxx ~]#
[root@xxxxx ~]# /usr/local/nagios/libexec/custom_check_mem -w 20 -c 10 -n
OK - 11432 / 15836 MB (72%) Free Memory, Used: 3873 MB, Shared: 88 MB, Buffers + Cached: 11678 MB | total=15836MB free=11432MB used=3873MB shared=88MB buffers_and_cached=11678MB
[root@xxxxx ~]#
But when I call this command from NagiosXI server, it reports critical status for memory and does not include percentage value, as shown below:
[root@yyyyy ~]#
[root@yyyyy ~]# /usr/local/nagios/libexec/check_nrpe -H pnzul010.ad.infosys.com
NRPE v3.2.1
[root@yyyyy ~]#
[root@yyyyy ~]# /usr/local/nagios/libexec/check_nrpe -H xxxxx -c check_linux_mem -a '-w 20 -c 10 -n'
CRITICAL - 11432 / 15836 MB (%) Free Memory, Used: 3874 MB, Shared: 88 MB, Buffers + Cached: 11678 MB | total=15836MB free=11432MB used=3874MB shared=88MB buffers_and_cached=11678MB
[root@yyyyy ~]#
This leads to an incorrect memory utilization status and generates false alerts for the admins.
Can anyone please help here? Has anyone faced a similar issue before?
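For readers following along: as far as I understand NRPE, the string passed with -a is substituted for the $ARG1$ placeholder in the command definition, so the remote call should run the same command line as the local test. That suggests the difference comes from the environment the plugin runs in rather than from the arguments. A minimal sketch of the substitution follows; expand_nrpe_command is a hypothetical helper written only for illustration and is not part of NRPE.

```shell
#!/bin/sh
# Hypothetical helper (not part of NRPE): show the command line that results
# when the -a string is substituted for the $ARG1$ placeholder.
expand_nrpe_command() {
    template=$1   # command definition from common.cfg
    args=$2       # string passed to check_nrpe with -a
    case $template in
        *'$ARG1$'*)
            prefix=${template%%'$ARG1$'*}   # text before the placeholder
            suffix=${template#*'$ARG1$'}    # text after the placeholder
            printf '%s\n' "${prefix}${args}${suffix}"
            ;;
        *)
            # No placeholder: the definition is used unchanged.
            printf '%s\n' "$template"
            ;;
    esac
}

expand_nrpe_command '/usr/local/nagios/libexec/custom_check_mem $ARG1$' '-w 20 -c 10 -n'
```

The printed line matches the command that was run locally as root, which is why the later replies focus on paths, file ownership and the account the plugin runs under.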
Re: custom_check_mem reports incorrect data & status to Nagi
Are you sure that you are running the check against the same machine?
On the remote machine, the percentage is shown correctly:
OK - 11432 / 15836 MB (72%) Free Memory
but in the second check, it's not. Usually, when you see a missing percentage:
CRITICAL - 11432 / 15836 MB (%) Free Memory
this is caused by a missing package, e.g. bc or dc. The issue can be solved by installing the package on the client machine:
yum install bc -y
-
abhijitderle
- Posts: 6
- Joined: Fri May 31, 2019 1:31 am
Re: custom_check_mem reports incorrect data & status to Nagi
Before creating this post, I had read in some old threads about this bc package dependency, and I made sure that it is already installed on the client system.
[root@xxxxx ~]#
[root@xxxxx ~]# yum list bc
Loaded plugins: enabled_repos_upload, langpacks, package_upload, product-id, search-disabled-repos, subscription-manager
Installed Packages
bc.x86_64 1.06.95-13.el7 @anaconda/7.5
Uploading Enabled Repositories Report
Loaded plugins: langpacks, product-id, subscription-manager
[root@xxxxx ~]#
Additionally, to rule out a client-specific issue, I tested this scenario on another RHEL client and saw the same issue there as well.
Regards,
Abhijit
Re: custom_check_mem reports incorrect data & status to Nagi
This is really strange... It just doesn't make any sense. 
Can you post the nrpe.cfg and common.cfg files from the client on the forum? Please obfuscate sensitive data.
Also run the following commands on the client (remote machine), and show the output in code wraps:
uname -a
cat /etc/*release
head -4 /usr/local/nagios/libexec/custom_check_mem
which dc
which sed
which tr
which gawk
dc --version
sed --version
tr --version
gawk --version
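The lookups above can also be scripted. The sketch below is only an illustration: missing_tools is a hypothetical helper that prints every tool that cannot be resolved through the current PATH. It says nothing about hard-coded absolute paths inside the plugin, and the PATH seen by the NRPE daemon may differ from the PATH of an interactive root shell.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is not found via PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The tools asked about in this thread; no output means all were found.
missing_tools dc sed tr gawk
```

Running it both as root and as the nagios account would show whether the two environments resolve the same set of tools.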
-
abhijitderle
- Posts: 6
- Joined: Fri May 31, 2019 1:31 am
Re: custom_check_mem reports incorrect data & status to Nagi
Thanks for your reply. Please find below the output from the client system. I have attached the cfg files to this thread.
Regards,
Abhijit
[root@xxxxx ~]#
[root@xxxxx ~]# uname -a
Linux xxxxx.test.com 3.10.0-957.21.3.el7.x86_64 #1 SMP Fri Jun 14 02:54:29 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@xxxxx ~]# cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME=RHEL
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Red Hat Enterprise Linux Server release 7.6 (Maipo)
[root@xxxxx ~]# head -4 /usr/local/nagios/libexec/custom_check_mem
#!/bin/bash
# Script to check real memory usage
# L.Gill 02/05/06 - V.1.0
# ------------------------------------------
[root@xxxxx ~]# which dc
/bin/dc
[root@xxxxx ~]# which sed
/bin/sed
[root@xxxxx ~]# which tr
/bin/tr
[root@xxxxx ~]# which gawk
/bin/gawk
[root@xxxxx ~]# dc --version
dc (GNU bc 1.06.95) 1.3.95
Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
[root@xxxxx ~]# sed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
E-mail bug reports to: <[email protected]>.
Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
[root@xxxxx ~]# tr --version
tr (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jim Meyering.
[root@xxxxx ~]# gawk --version
GNU Awk 4.0.2
Copyright (C) 1989, 1991-2012 Free Software Foundation.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
[root@xxxxx ~]#
Re: custom_check_mem reports incorrect data & status to Nagi
There are a few things that you can try.
1. Make sure you have the correct paths in the custom_check_mem plugin on the client. For example, I have this line:
percent=`/usr/bin/dc $calc|/usr/bin/sed 's/^\./0./'|/usr/bin/tr "." " "|/usr/bin/gawk {'print $1'}`
but you may need to change it to this:
percent=`/bin/dc $calc|/bin/sed 's/^\./0./'|/bin/tr "." " "|/bin/gawk {'print $1'}`
When you run the plugin locally, the percent is shown because the paths can be found. Most probably, when you run the check remotely, a path cannot be found, and the value cannot be calculated.
You can probably recreate the issue locally by changing one of the paths in the above definition to a non-existent path:
percent=`xxx/usr/bin/dc $calc|/usr/bin/sed 's/^\./0./'|/usr/bin/tr "." " "|/usr/bin/gawk {'print $1'}`
and running the plugin locally:
/usr/local/nagios/libexec/custom_check_mem: line 100: xxx/usr/bin/dc: No such file or directory
CRITICAL - 1679 / 1838 MB (%) Free Memory, Used: 155 MB, Shared: 8 MB, Buffers + Cached: 950 MB | total=1838MB free=1679MB used=155MB shared=8MB buffers_and_cached=950MB
2. The plugin creates a temp file in the /tmp directory: /tmp/memcalc. When you run the plugin locally as root, the temp file is owned by root. Later on, nagios may not have permissions to access the file. Just to be safe, remove the file (it may have some garbage data in it anyway):
rm -f /tmp/memcalc
then test your check from the command line (after fixing the paths as stated in step 1):
/usr/local/nagios/libexec/check_nrpe -H xxxxx -c check_linux_mem -a '-w 20 -c 10 -n'
3. Make sure you have this line in /etc/sudoers on the client:
Defaults:nagios !requiretty
4. I think that if you have two "-n" flags in the command, the plugin will just ignore one of them, so this *shouldn't* be a problem. I just wanted to point out that you are passing "-n" in the arguments:
-a '-w 20 -c 10 -n'
and you are also passing it before the arguments:
command[check_linux_mem]=/usr/local/nagios/libexec/custom_check_mem -n $ARG1$
I am not sure if you changed your command, but this is what you have in the common.cfg file, which is different from what you showed us in your first post:
[root@xxxxx ~]# grep check_mem /usr/local/nagios/etc/nrpe/common.cfg
command[check_linux_mem]=/usr/local/nagios/libexec/custom_check_mem $ARG1$
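To make the "(%)" symptom easier to see: when any command in the percent pipeline cannot be run, the variable ends up empty and the status text is built around an empty value. The sketch below is not the plugin's code. It swaps the dc/sed/tr/gawk pipeline for plain shell integer arithmetic and adds a guard that reports UNKNOWN for an empty or non-numeric percentage. The names percent_free and status_for are hypothetical, and the "less than or equal" threshold comparison is an assumption about how -w and -c are meant to work on free memory.

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of free memory, truncated.
percent_free() {
    free_mb=$1
    total_mb=$2
    if [ "$total_mb" -le 0 ]; then
        echo "total must be positive" >&2
        return 1
    fi
    echo $(( $free_mb * 100 / $total_mb ))
}

# Hypothetical helper: map a free-memory percentage to a Nagios status.
# An empty or non-numeric value is reported as UNKNOWN (exit code 3)
# instead of being compared against the thresholds.
status_for() {
    percent=$1
    warn=$2
    crit=$3
    case $percent in
        ''|*[!0-9]*) echo UNKNOWN; return 3 ;;
    esac
    if [ "$percent" -le "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$percent" -le "$warn" ]; then
        echo WARNING; return 1
    else
        echo OK; return 0
    fi
}

# Values from the first post: 11432 MB free out of 15836 MB.
status_for "$(percent_free 11432 15836)" 20 10
```

With the figures from the first post this gives 72 percent and OK, which matches the local run. An empty percentage yields UNKNOWN here rather than a misleading CRITICAL.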
-
abhijitderle
- Posts: 6
- Joined: Fri May 31, 2019 1:31 am
Re: custom_check_mem reports incorrect data & status to Nagi
Thanks for your help. I picked option 2 and it helped to capture correct metrics on the NagiosXI server.
A good practice when testing is to run client-side commands from the nagios account only, not as root.
[root@yyyyy ~]#
[root@yyyyy ~]# /usr/local/nagios/libexec/check_nrpe -H xxxx.test.com -c check_linux_mem -a '-w 20 -c 10'
OK - 11464 / 15836 MB (72%) Free Memory, Used: 3885 MB, Shared: 112 MB, Buffers + Cached: 11757 MB | total=15836MB free=11464MB used=3885MB shared=112MB buffers_and_cached=11757MB
[root@yyyyy ~]#
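For anyone hitting the same problem, a quick way to spot a leftover temp file created by a root test run is to compare its owner with the account NRPE runs as. This is a minimal sketch, assuming GNU stat (coreutils, as on the RHEL client in this thread); owned_by_other is a hypothetical helper, and /tmp/memcalc and the nagios account name are taken from the posts above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE exists and is owned by someone other
# than USER. Assumes GNU stat, which supports -c '%U' (owner user name).
owned_by_other() {
    file=$1
    user=$2
    [ -e "$file" ] || return 1
    owner=$(stat -c '%U' "$file") || return 1
    [ "$owner" != "$user" ]
}

if owned_by_other /tmp/memcalc nagios; then
    echo "/tmp/memcalc is not owned by nagios; consider removing it"
fi
```

Running the plugin itself under the nagios account (for example with sudo -u nagios, if sudo allows it on your system) avoids creating a root-owned temp file in the first place.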
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: custom_check_mem reports incorrect data & status to Nagi
abhijitderle wrote: A good method or practice to test things, is to run client side commands from nagios account only and not root.
This is definitely the way to go when debugging.