UPS Alarm State Monitoring

Posted: Mon Jan 31, 2022 2:12 pm
by jameyw
I'm trying to monitor the alarm state of all of my UPS systems. Doing an SNMP walk, I found an OID that shows 0 (zero) for no alarms or 1 (one) if there are active alarms. I can't seem to figure out how to get NagiosXI to alert when alarms are present.

Here is what I have:

Check Command: check_xi_service_snmp

Command View: $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$

$ARG1$

Code:

 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1
I have one UPS with an active alarm, and NagiosXI reads a 1 for alarms present, but the service still shows "OK". I tried check_xi_service_snmp_negate and I get an alert, but I also get an alert when the alarm status is 0.

I feel like this should be simple.
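
A possible explanation for the negate result, sketched in Python. By default the negate plugin swaps OK and CRITICAL and leaves WARNING and UNKNOWN unchanged. If the unwrapped check returns OK for both readings (it does for the UPS reading 1 above; the OK result for a reading of 0 is an assumption), the wrapped check would alert in both cases. This also assumes check_xi_service_snmp_negate wraps check_snmp in negate. `negate_default` is a hypothetical helper for illustration, not part of any plugin.

```python
# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def negate_default(state):
    """Hypothetical helper mirroring negate's default mapping: swap OK and CRITICAL."""
    if state == OK:
        return CRITICAL
    if state == CRITICAL:
        return OK
    return state  # WARNING and UNKNOWN pass through unchanged


# Assumption: with "-c 1" the plain check returns OK for readings of both 0 and 1.
plain_results = {0: OK, 1: OK}
for reading, state in plain_results.items():
    negated = negate_default(state)
    print(f"reading {reading}: plain={NAMES[state]} negated={NAMES[negated]}")
```

Under these assumptions, negating the check cannot separate the two readings, because the plain check gives the same state for both.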

Re: UPS Alarm State Monitoring

Posted: Mon Jan 31, 2022 6:11 pm
by pbroste
Hello @jameyw

Thanks for reaching out. Please run the following command and let us know the results:

Code:

/usr/local/nagios/libexec/check_snmp -H yourhostaddresshere -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1-4 -s --verbose
This adds two options:
-s, --string=STRING: return OK state (for that OID) if STRING is an exact match
--verbose: show more details
Please collect the following logs and PM them over:

Code:

tar -czvf /tmp/logresults.tar.gz /usr/local/nagiosxi/var/* /usr/local/nagios/var/nagios.log
Thanks,
Perry

Re: UPS Alarm State Monitoring

Posted: Tue Feb 01, 2022 11:01 am
by jameyw
After looking at the commands you suggested and executing them, I didn't quite get the desired results but with a little tweaking I think I did.

I executed on a UPS with no alarms:

Code:

/usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0 -s --verbose
and it returned: SNMP OK - Alarms 0 Active | Alarms=0Active;;0;

I then tried it on the UPS with alarms active:

Code:

/usr/local/nagios/libexec/check_snmp -H 10.x.x.41 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0 -s --verbose
and it returned: SNMP CRITICAL - Alarms *1* Active | Alarms=1Active;;0;
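
For reference, the part of each result after the `|` is plugin performance data, which normally follows the pattern label=value[UOM];warn;crit;min;max. A small Python sketch that splits the two lines quoted above, so the critical threshold in effect (0) is visible alongside the reading. `parse_plugin_output` is a hypothetical helper and only handles a single, unquoted metric like the one shown here.

```python
import re

# label=value[UOM];[warn];[crit];[min];[max]  (single metric, unquoted label only)
PERF_RE = re.compile(
    r"^(?P<label>[^=]+)=(?P<value>-?[0-9.]+)(?P<uom>[^;]*)(?:;(?P<rest>.*))?$"
)


def parse_plugin_output(line):
    """Hypothetical helper: split one plugin output line into state and perfdata fields."""
    text, _, perf = line.partition("|")
    match = PERF_RE.match(perf.strip())
    if match is None:
        raise ValueError(f"unrecognised performance data: {perf!r}")
    # Pad the threshold fields so missing trailing ones become empty strings.
    fields = (match.group("rest") or "").split(";")
    fields += [""] * (4 - len(fields))
    warn, crit, minimum, maximum = fields[:4]
    return {
        "state": text.split()[1],  # second word, e.g. "OK" or "CRITICAL"
        "label": match.group("label"),
        "value": float(match.group("value")),
        "uom": match.group("uom"),
        "warn": warn,
        "crit": crit,
        "min": minimum,
        "max": maximum,
    }


for sample in (
    "SNMP OK - Alarms 0 Active | Alarms=0Active;;0;",
    "SNMP CRITICAL - Alarms *1* Active | Alarms=1Active;;0;",
):
    print(parse_plugin_output(sample))
```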

I went into CCM and changed $ARG1$ to:

Code:

 -o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 0
and I now see the UPS alarm status of the units as expected.
UPS alarm1.JPG
UPS alarm2.JPG
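
The results above are consistent with the standard Nagios plugin threshold format, in which a bare number N means "alert when the value is below 0 or above N". Under that reading, -c 1 never fires for a reading of 1, while -c 0 fires for any reading above 0. A minimal Python sketch of that rule follows, assuming check_snmp applies the standard range format to this OID. `parse_range` and `alerts` are hypothetical names used only for illustration.

```python
import math


def parse_range(spec):
    """Parse a range such as '10', '10:', '~:10', '10:20' or '@10:20'."""
    inside = spec.startswith("@")  # '@' inverts the test: alert when inside the range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        start_text, end_text = "0", spec  # a bare number N means the range 0..N
    if start_text == "~":
        start = -math.inf
    elif start_text == "":
        start = 0.0
    else:
        start = float(start_text)
    end = math.inf if end_text == "" else float(end_text)
    return start, end, inside


def alerts(value, spec):
    """Return True when the value should raise an alert for this range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


for spec in ("1", "0"):
    for reading in (0, 1):
        print(f"-c {spec}, reading {reading}: alert={alerts(reading, spec)}")
```

With this rule, a threshold of 0 is the simplest way to treat any non-zero alarm count as critical.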

Re: UPS Alarm State Monitoring

Posted: Tue Feb 01, 2022 3:56 pm
by pbroste
Hello @jameyw

Thanks for the info. Please run 'snmpwalk' against the two devices; we should see the same results.

Code:

snmpwalk -v1 -c public <yourhostaddresshere> .1.3.6.1.4.1.2947.1.8.1.0
Please let us know how that looks,
Perry

Re: UPS Alarm State Monitoring

Posted: Tue Feb 01, 2022 4:21 pm
by jameyw
Here is the output:

UPS with alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.40 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 1

UPS without alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.5 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 0

Another UPS without alarm:
[root@localhost ~]# snmpwalk -v1 -c public 10.x.x.2 .1.3.6.1.4.1.2947.1.8.1.0
SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 0
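
For anyone scripting against this output, here is a short Python sketch that reads lines in the shape shown above and reports the alarm state, using the 0 = no alarms / 1 = active alarms meaning described in the first post. `parse_snmp_integer` is a hypothetical helper; it assumes plain numeric INTEGER values with no MIB enumeration labels.

```python
def parse_snmp_integer(line):
    """Hypothetical helper: split 'OID = INTEGER: n' into (oid, n)."""
    oid, sep, rhs = line.partition(" = ")
    type_name, colon, raw = rhs.partition(": ")
    if not sep or not colon or type_name != "INTEGER":
        raise ValueError(f"not a plain INTEGER result: {line!r}")
    return oid, int(raw)


samples = {
    "UPS with alarm": "SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 1",
    "UPS without alarm": "SNMPv2-SMI::enterprises.2947.1.8.1.0 = INTEGER: 0",
}
for name, line in samples.items():
    oid, value = parse_snmp_integer(line)
    state = "alarms active" if value != 0 else "no alarms"
    print(f"{name}: {oid} -> {value} ({state})")
```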

Re: UPS Alarm State Monitoring

Posted: Wed Feb 02, 2022 4:31 pm
by pbroste
Hello @jameyw

Thanks for following up. Please test with a different plugin to verify. Download and test:

https://exchange.nagios.org/directory/P ... ck/details

Thanks,
Perry

Re: UPS Alarm State Monitoring

Posted: Wed Feb 02, 2022 5:35 pm
by jameyw
I installed and ran the check but got the following return:

/usr/local/nagios/libexec/check_ups_snmp -H 10.x.x.5 -t charge -C public -w 50 -c 20
Plugin check_ups_snmp failure - snmpget command error.

I don't know if this is related, but after making changes yesterday to fix the original issue of this thread on other UPS checks, I now get the following:
/usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o '.1.3.6.1.4.1.2947.1.8.1.0' -C 'public' -P 1 -l 'Alarms' -u 'Active' -w 1 -c 0
CRITICAL - Plugin timed out while executing system call

/usr/local/nagios/libexec/negate -s /usr/local/nagios/libexec/check_snmp -H 10.x.x.5 -o .1.3.6.1.4.1.2947.1.2.3.0 -C public -P 1 -l "Runtime Remaining" -u "Minutes" -w 45 -c 30
CRITICAL - Plugin timed out

I have an open thread in the forum here: https://support.nagios.com/forum/viewto ... 16&t=64458 but I don't have a resolution yet.

Re: UPS Alarm State Monitoring

Posted: Wed Feb 02, 2022 6:15 pm
by pbroste
Hello @jameyw

It appears that you were using -c 1 in previous posts for 'check_snmp', and that the -c 0 value is not going to work.

Code:

-o .1.3.6.1.4.1.2947.1.8.1.0 -C public -P 1 -l "Alarms" -u "Active" -c 1
Regards,
Perry

Re: UPS Alarm State Monitoring

Posted: Mon Feb 07, 2022 10:41 am
by jameyw
You can close the thread. I can correctly monitor the UPS now. The timeout was caused by a typo in the device's allowed SNMP manager settings.

Re: UPS Alarm State Monitoring

Posted: Mon Feb 07, 2022 4:56 pm
by pbroste
Most excellent, @jameyw. Thanks for providing the details; we will go ahead and lock the thread.

Thanks,
Perry