UPS Alarm State Monitoring
Posted: Mon Jan 31, 2022 2:12 pm
by jameyw
I'm trying to monitor the alarm state of all of my UPS systems. Doing an SNMP walk, I found an OID that shows 0 (zero) for no alarms or 1 (one) if there are active alarms. I can't figure out how to get NagiosXI to alert when alarms are present.
Here is what I have:
Check Command: check_xi_service_snmp
Command View: $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
$ARG1$
Code:
-o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1
I have one UPS with an active alarm, and NagiosXI is reading a 1 for alarms present, but it still shows "OK". I tried check_xi_service_snmp_negate and I get an alert, but I also get an alert when the alarm status is 0.
I feel like this should be simple.
Re: UPS Alarm State Monitoring
Posted: Mon Jan 31, 2022 6:11 pm
by pbroste
Hello @jameyw,
Thanks for reaching out. Please run the following command and let us know the results:
Code:
/usr/local/nagios/libexec/check_snmp -H yourhostaddresshere -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1-4 -s --verbose
This adds:
-s, --string=STRING: return OK state (for that OID) if STRING is an exact match
--verbose: show more details
Please collect the following logs and PM them over:
Code:
tar -czvf /tmp/logresults.txt /usr/local/nagiosxi/var/* /usr/local/nagios/var/nagios.log
Thanks,
Perry
Re: UPS Alarm State Monitoring
Posted: Tue Feb 01, 2022 11:01 am
by jameyw
After looking at the commands you suggested and executing them, I didn't quite get the desired results but with a little tweaking I think I did.
I executed this on a UPS with no alarms:
Code:
/usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0 -s --verbose
and it returned: SNMP OK - Alarms 0 Active | Alarms=0Active;;0;
I then tried it on the UPS with alarms active:
Code:
/usr/local/nagios/libexec/check_snmp -H 10.x.x.41 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0 -s --verbose
and it returned: SNMP CRITICAL - Alarms *1* Active | Alarms=1Active;;0;
I went into CCM and changed $ARG1$ to:
Code:
-o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0
and I now see the alarm status of the UPS units as expected.
Attachments: UPS alarm1.JPG, UPS alarm2.JPG
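The results above are consistent with the Nagios plugin threshold range format, where a bare number N is read as the range 0 to N and the check alerts only when the value falls outside that range. Under that reading, -c 1 treats both 0 and 1 as in range, while -c 0 alerts on 1. Here is a minimal sketch of that rule in plain shell, integers only. The function name range_alerts is made up for illustration and is not part of check_snmp.

```shell
# Sketch of Nagios-style threshold range evaluation (integers only).
# Usage: range_alerts VALUE RANGE
# Returns 0 when VALUE would raise an alert for RANGE, 1 otherwise.
# Note: plain POSIX sh has no local variables, so these names are global.
range_alerts() {
    value=$1
    range=$2
    inside_alerts=0

    # A leading "@" inverts the rule: alert when the value is inside the range.
    case $range in
        @*) inside_alerts=1; range=${range#@} ;;
    esac

    # "LOW:HIGH" gives both bounds; a bare "N" means 0 to N.
    case $range in
        *:*) low=${range%%:*}; high=${range#*:} ;;
        *)   low=0; high=$range ;;
    esac

    in_range=1
    # "~" as the lower bound means negative infinity.
    if [ "$low" != "~" ] && [ "$value" -lt "$low" ]; then in_range=0; fi
    # An empty upper bound means positive infinity.
    if [ -n "$high" ] && [ "$value" -gt "$high" ]; then in_range=0; fi

    if [ "$inside_alerts" -eq 1 ]; then
        [ "$in_range" -eq 1 ]
    else
        [ "$in_range" -eq 0 ]
    fi
}
```

With this sketch, `range_alerts 1 1` reports no alert and `range_alerts 1 0` reports an alert, which matches the OK and CRITICAL results seen on the two UPS units.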
Re: UPS Alarm State Monitoring
Posted: Tue Feb 01, 2022 3:56 pm
by pbroste
Hello @jameyw,
Thanks for the info. Please run 'snmpwalk' against the two hosts; we should see the same results.
Code:
snmpwalk -v1 -c public <yourhostaddresshere> .1.3.6.1.4.1.2947.1.8.1.0
Please let us know how that looks,
Perry
Re: UPS Alarm State Monitoring
Posted: Tue Feb 01, 2022 4:21 pm
by jameyw
Here is the output:
UPS with alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.40 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 1
UPS without alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.5 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 0
Another UPS without alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.2 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 0
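The snmpwalk output above can be reduced to the bare alarm flag with a small filter. This is only a sketch: the helper names snmp_integer_value and alarm_status are hypothetical, and the mapping of 0 to OK and 1 to CRITICAL is taken from the description in the first post.

```shell
# Extract the integer from a line read on stdin, such as:
#   SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 1
snmp_integer_value() {
    sed -n 's/^.*= INTEGER: \(-\{0,1\}[0-9][0-9]*\).*$/\1/p'
}

# Map the alarm flag to a Nagios-style status line and return code
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
# Assumption: 0 means no alarms and 1 means active alarms, as in the first post.
alarm_status() {
    case $1 in
        0) echo "OK - no active alarms"; return 0 ;;
        1) echo "CRITICAL - active alarms present"; return 2 ;;
        *) echo "UNKNOWN - unexpected value '$1'"; return 3 ;;
    esac
}
```

Piping the snmpwalk command from this thread into snmp_integer_value should print just the number, which can then be passed to alarm_status. net-snmp's -Oqv output option is another way to get the bare value without a filter.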
Re: UPS Alarm State Monitoring
Posted: Wed Feb 02, 2022 4:31 pm
by pbroste
Hello @jameyw,
Thanks for following up. Please download and test a different plugin to verify:
https://exchange.nagios.org/directory/P ... ck/details
Thanks,
Perry
Re: UPS Alarm State Monitoring
Posted: Wed Feb 02, 2022 5:35 pm
by jameyw
I installed and ran the check but got the following return:
/usr/local/nagios/libexec/check_ups_snmp -H 10.x.x.5 -t charge -C public -w 50 -c 20
Plugin check_ups_snmp failure - snmpget command error.
I don't know if this is related, but after making changes yesterday on other UPS checks to fix the original issue of this thread, I now get the following:
/usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o '.1.3.6.1.4.1.2947.1.8.1.0' -C 'public' -P 1 -l 'Alarms' -u 'Active' -w 1 -c 0
CRITICAL - Plugin timed out while executing system call
/usr/local/nagios/libexec/negate -s /usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o .1.3.6.1.4.1.2947.1.2.3.0 -C public -P 1 -l "Runtime Remaining" -u "Minutes" -w 45 -c 30
CRITICAL - Plugin timed out
I have an open thread in the forum here:
https://support.nagios.com/forum/viewto ... 16&t=64458 but I don't have a resolution yet.
Re: UPS Alarm State Monitoring
Posted: Wed Feb 02, 2022 6:15 pm
by pbroste
Hello @jameyw,
It appears that you were using -c 1 in previous posts for 'check_snmp'; it appears that the -c 0 value is not going to work.
Code:
-o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1
Regards,
Perry
Re: UPS Alarm State Monitoring
Posted: Mon Feb 07, 2022 10:41 am
by jameyw
You can close the thread. I can correctly monitor the UPS now. The timeout was caused by a typo in the device's allowed SNMP manager settings.
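Since the timeout came down to the device's allowed SNMP manager settings, a quick query from the Nagios server can help tell "the agent is not answering this host" apart from a threshold problem. A rough sketch, assuming net-snmp's snmpget is installed: the function name snmp_responds and the SNMPGET override variable are made up for this example, and sysUpTime.0 is used only because most agents answer it.

```shell
# Return 0 if the SNMP agent on HOST answers a simple GET, non-zero otherwise.
# Usage: snmp_responds HOST [COMMUNITY]
# SNMPGET is a hypothetical override for the command to run; it defaults to
# net-snmp's snmpget.
snmp_responds() {
    host=$1
    community=${2:-public}
    # -t 2 -r 1: wait 2 seconds per attempt and retry once, so a device that
    # silently drops the request fails quickly instead of hanging the check.
    "${SNMPGET:-snmpget}" -v1 -c "$community" -t 2 -r 1 \
        "$host" .1.3.6.1.2.1.1.3.0 >/dev/null 2>&1
}

# Example (not run here):
#   if snmp_responds 10.x.x.5 public; then
#       echo "agent answered"
#   else
#       echo "no answer: check the community string and the allowed SNMP manager list"
#   fi
```

If this fails from the Nagios server but the same query works from another machine, the device's access list is a reasonable place to look first.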
Re: UPS Alarm State Monitoring
Posted: Mon Feb 07, 2022 4:56 pm
by pbroste
Most excellent, @jameyw. Thanks for providing the details; I will go ahead and lock the thread.
Thanks,
Perry