Help to make check_snmp_temperature work

Post by kCyborg

Hi, I recently installed Nagios Core 4.4.6 to monitor a lot of virtual servers on my network. Everything works flawlessly, and I'm able to check nearly all of the stats/services I want to check. But I would also like to check the temperature of a physical server (which is a Dell server running Proxmox on top of Debian 10) and, of course, show the temperature on Nagios's local web page.

I found this amazing plugin, check_snmp_temperature https://exchange.nagios.org/directory/P ... re/details but, although it seems very easy to use, I can't make it work. I keep getting this error message:

Code:

server:/usr/local/nagios/etc# /usr/lib/nagios/plugins/./check_snmp_temperature.pl -H 192.168.50.230 -C public -T dell -d .1.3.6.1.4.1.28402.3.3.3.1.5.1 -a'.' -o C -w 30 -c 35

Please either specify specify system type (-T) OR base SNMP OIDs for name (-N) and data (-D) tables OR exact list of sensor names (-n) and data OIDs (-d) !
Usage: /usr/lib/nagios/plugins/./check_snmp_temperature.pl [-v] -H <host> -C <snmp_community> [-2] | (-l login -x passwd [-X pass -L <authp>,<privp>])  [-p <port>] [-t <timeout>] -T dell|hp|cisco1|juniper|alteon|lmsensors | [-N <oid_attribnames> -D <oid_attribdata>] | [-n <list of sensor names> -d <list of sensor oids>] [-a <attributes to check> -w <warn levels> -c <crit levels> [-f]] [-A <attributes for perfdata>] [-o <out_temp_unit: C|F|K>] [-i <in_temp_unit>] [-u <unknown_default>] [-V]
(I also tried changing the -H option to the local IP 127.0.0.1, but I get the same result.)

I know I need to check this part:
Please either specify specify system type (-T) OR base SNMP OIDs for name (-N) and data (-D) tables OR exact list of sensor names (-n) and data OIDs (-d) !
But I dug into the Git page https://github.com/willixix/WL-NagiosPl ... erature.pl and apparently I'm way too dumb to find what I'm doing wrong.

Could you help me?

Post by mcapra

I read this as an exclusive or:

Code:

Please either specify specify system type (-T) OR base SNMP OIDs for name (-N) and data (-D) tables OR exact list of sensor names (-n) and data OIDs (-d) !
And the code confirms this behavior:
https://github.com/willixix/WL-NagiosPl ... #L580-L582

The error is raised when you define $o_type (the -T argument) in addition to any of:
  • $oid_names (-N)
  • $oid_data (-D)
  • $o_sensornames (-n)
  • $o_sensorids (-d)
In your case, you're defining -T in addition to -d, which violates the exclusive or mentioned in the error message.

As to why that exclusive or exists, I imagine it's because when the plugin is passed the -T argument, it maps the provided string to a set of pre-defined values:
https://github.com/willixix/WL-NagiosPl ... #L243-L252

So if you pass -T dell, it's going to plug these in automatically based on the map I linked above and some default values:

Code:

-N 1.3.6.1.4.1.674.10892.1.700.20.1.8
-D 1.3.6.1.4.1.674.10892.1.700.20.1.6
-i 10C
According to this plugin, the $o_sensoroids (-d) value is useless without the $o_sensornames (-n) value, presumably because it's trying to match the OIDs given with -d to the names given with -n.

I dunno anything about this plugin for the record. I'm just interpreting the Perl. Try removing the -d argument and see what you get.
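
Following that suggestion, here is a minimal sketch of the original command with -d dropped, so that -T dell alone picks the OID tables. The plugin path, host, community and thresholds are copied from the first post and are assumptions about that install. The script prints the command and only runs it if the plugin is actually present:

```shell
#!/bin/sh
# Plugin path, host, community and thresholds are copied from the first post;
# adjust them for your own install.
PLUGIN=/usr/lib/nagios/plugins/check_snmp_temperature.pl
HOST=192.168.50.230

# -T dell selects the pre-defined Dell OID tables, so -N, -D, -n and -d are left out.
set -- -H "$HOST" -C public -T dell -a '.' -o C -w 30 -c 35

# Keep the full command line in a variable so it can be shown before running.
CMD_LINE="$PLUGIN $*"
echo "$CMD_LINE"

# Run the check only if the plugin is installed and executable on this machine.
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" "$@"
fi
```

If the same usage error comes back, something other than the -T/-d combination is involved.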
Former Nagios employee
https://www.mcapra.com/

Post by kCyborg

mcapra wrote: I dunno anything about this plugin for the record. I'm just interpreting the Perl. Try removing the -d argument and see what you get.
Hi mate, thanks very much for your answer. I tried removing the -d option, and I get a different error:

Code:

ERROR: Alarm signal (Nagios time-out)
For what it's worth, this is how I defined the service in localhost.cfg (a similar definition exists for the physical server I'm trying to monitor):

Code:

define service {
    use                     local-service
    host_name               localhost
    service_description     Temperature
    check_command           check_temp!CPU,Ambient,Bottom!110,90,0!135,110,0
    notifications_enabled   1
}
And the command definition in commands.cfg:

Code:

define command {
    command_name    check_temp
    command_line    $USER1$/check_snmp_temperature.pl -H $HOSTADDRESS$ -C public -N .1.3.6.1.4.1.674.10892.1.700.20.1.8 -i 10C -o F -u 0 -a $ARG1$ -w $ARG2$ -c $ARG3$ -f
}
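
To see what a definition like this hands to the plugin, the macros can be expanded by hand and the result run from a shell. The sketch below is only an illustration: the shell variables stand in for the Nagios macros, their values are assumptions taken from the service definition and the plugin path used earlier in the thread, and the -D option is not part of the posted definition. It is added here because the usage text pairs -N with -D, using the Dell data-table OID quoted earlier.

```shell
#!/bin/sh
# Hand-expanded stand-ins for the Nagios macros (all values are assumptions).
USER1=/usr/lib/nagios/plugins     # $USER1$: plugin directory
HOSTADDRESS=127.0.0.1             # $HOSTADDRESS$: address of localhost
ARG1='CPU,Ambient,Bottom'         # $ARG1$: attributes to check
ARG2='110,90,0'                   # $ARG2$: warning levels
ARG3='135,110,0'                  # $ARG3$: critical levels

PLUGIN="$USER1/check_snmp_temperature.pl"

# -D is NOT in the posted definition; it is an assumption based on the usage
# text, which lists -N and -D together. The OID is the Dell data table
# mentioned earlier in the thread.
set -- -H "$HOSTADDRESS" -C public \
    -N .1.3.6.1.4.1.674.10892.1.700.20.1.8 \
    -D .1.3.6.1.4.1.674.10892.1.700.20.1.6 \
    -i 10C -o F -u 0 -a "$ARG1" -w "$ARG2" -c "$ARG3" -f

# Show the expanded command, then run it only if the plugin is installed here.
CMD_LINE="$PLUGIN $*"
echo "$CMD_LINE"
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" "$@"
fi
```

Running the expanded command outside Nagios makes it easier to tell a plugin usage error from an SNMP or timeout problem.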

Post by benjaminsmith

Hi @kCyborg

kCyborg wrote: I tried removing the -d option, and I get a different error:
ERROR: Alarm signal (Nagios time-out)
I would recommend trying that again, but increase the timeout this time; the default is only 5 seconds. Also, add the --verbose option for extra debugging output.
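
As a rough sketch of that retry, the command below adds -t and -v, both of which appear in the plugin's help text. The 30-second value is an arbitrary assumption, and the plugin path, host, community and thresholds are carried over from the first post:

```shell
#!/bin/sh
# Path, host, community and thresholds come from the first post (assumptions).
PLUGIN=/usr/lib/nagios/plugins/check_snmp_temperature.pl
HOST=192.168.50.230
TIMEOUT=30   # assumed value; the plugin's default is 5 seconds

# -v prints extra debugging information, -t raises the SNMP timeout.
set -- -v -H "$HOST" -C public -T dell -t "$TIMEOUT" -a '.' -o C -w 30 -c 35

# Show the command, then run it only if the plugin is installed here.
CMD_LINE="$PLUGIN $*"
echo "$CMD_LINE"
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" "$@"
fi
```

If the check still times out with a longer limit, the verbose output should show how far the SNMP query got.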

Plugin Options:
./check_snmp_temperature.pl -h

SNMP Temperature Monitor for Nagios version 0.34
by William Leibzon - william(at)leibzon.org

Usage: ./check_snmp_temperature.pl [-v] -H <host> -C <snmp_community> [-2] | (-l login -x passwd [-X pass -L <authp>,<privp>]) [-p <port>] [-t <timeout>] -T dell|hp|cisco1|juniper|alteon | [-N <oid_attribnames> -D <oid_attribdata>] | [-n <list of sensor names> -d <list of sensor oids>] [-a <attributes to check> -w <warn levels> -c <crit levels> [-f]] [-A <attributes for perfdata>] [-o <out_temp_unit: C|F|K>] [-i <in_temp_unit>] [-u <unknown_default>] [-V]
-v, --verbose
    print extra debugging information
-h, --help
    print this help message
-H, --hostname=HOST
    name or IP address of host to check
-C, --community=COMMUNITY NAME
    community name for the host's SNMP agent (implies v 1 protocol)
-2, --v2c
    Use snmp v2c
-l, --login=LOGIN ; -x, --passwd=PASSWD
    Login and auth password for snmpv3 authentication
    If no priv password exists, implies AuthNoPriv
-X, --privpass=PASSWD
    Priv password for snmpv3 (AuthPriv protocol)
-L, --protocols=<authproto>,<privproto>
    <authproto> : Authentication protocol (md5|sha : default md5)
    <privproto> : Priv protocole (des|aes : default des)
-P, --port=PORT
    SNMP port (Default 161)
-w, --warn=INT[,INT[,INT[..]]]
    warning temperature level(s) (if more then one attribute is checked, must have multiple values)
-c, --crit=INT[,INT[,INT[..]]]
    critical temperature level(s) (if more then one attribute is checked, must have multiple values)
-f, --perfdata
    Perfparse compatible output
-t, --timeout=INTEGER
    timeout for SNMP in seconds (Default: 5)
-V, --version
    prints version number
-N, --oidtable_attribnames=OID_STRING
    Base table OID to walk through to find names of those attributes supported and from that corresponding data OIDs
-D, --oidtable_attribdata=OID_STRING
    Base table OID for sensor attribute data, one number is added to that to make up full attribute OID
-n, --sensor_names=STRING[,STRING[..]]
    List of sensor names when -N is not used and sensors are specified with exeact oids
-d, --sensor_oids=OID_STRING[,OID_STRING[..]]
    List of exact data OIDs for sensors specified with -n (specify this when -N and -D are not used)
-a, --attributes=STRING[,STRING[..]]
    Which attribute(s) to check. This is used as regex to check if attribute is found in sensor names.
    As an example for Dell the attribute names to use are: PROC_1, PROC_2, Ambient, Planar, Riser
-A, --perf_attributes=STRING[,STRING[..]]
    Which attribute(s) to add to as part of performance data output. These names can be different then the
    ones listed in '-a' to only output attributes in perf data but not check. Special value of '*' gets them all.
-f, --perfparse
    Used only with '-a'. Causes to output data not only in main status line but also as perfparse output
-o --out_temp_unit=C|F|K
    What temperature measurement units are used for output and warning/critical - 'C', 'F' or 'K' - default is 'C'
-i --in_temp_unit=[num]C|F|K
    What temperature measurement reported by data OID - format is <num>C|F|K (default is 'C')
    where num is used if data is num*realdata, i.e. if reported data of 330 means 33C, then it is: -i 10C
-u, --unknown_default=INT
    If attribute is not found then report the output as this number (i.e. -u 0)
-T, --type=dell|hp|cisco1|juniper|alteon
    This allows to use pre-defined system type to set Base, Data OIDs and incoming temperature measurement type
    Currently support systems types are: dell, hp, cisco1 (7500, 5500, 2948, etc), juniper, alteon

Let us know what you find out.