ERROR: Alarm signal (Nagios time-out)
Posted: Wed Mar 08, 2017 10:04 am
Yesterday morning I migrated my Nagios installation from a VM on VMware Workstation to a physical machine. Since then, several of the remote servers I monitor have been intermittently giving the error "ERROR: Alarm signal (Nagios time-out)" when using check_snmp_win.pl to check the status of a Windows service. I am not sure whether moving from VM to physical is the cause, but the timing seems like too much of a coincidence not to mention. Nagios is running on CentOS on the same hardware that previously hosted the VM (albeit with a different hard drive).
If I run the plugin directly from the command line, I have no issue and it produces a result fairly quickly (3 or 4 seconds). This is what I am running:
./check_snmp_win.pl -H <<IP Address>> -n "IIS Admin Service" -C public -r -t 60
I added the -t 60 timeout because, from Googling the error, it seems this can help. I also increased the Nagios service timeout to see if that helped.
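For anyone checking the same thing: the plugin's -t value and the Nagios service timeout are separate settings, and the latter is the service_check_timeout directive in nagios.cfg. Below is a minimal sketch for reading the value currently in effect. The helper name get_service_check_timeout is made up for illustration, and the config path in the example comment is only an assumption, since it varies by install.

```shell
#!/bin/sh
# Print the value of service_check_timeout from a Nagios main config file.
# The config file path is passed as an argument because it varies by install.
# get_service_check_timeout is a hypothetical helper name, not part of Nagios.
get_service_check_timeout() {
    cfg="$1"
    # Match "service_check_timeout=<seconds>" and skip commented-out lines;
    # if the directive appears more than once, print the last occurrence.
    sed -n 's/^[[:space:]]*service_check_timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | tail -n 1
}

# Example (the path is an assumption; adjust to your install):
# get_service_check_timeout /usr/local/nagios/etc/nagios.cfg
```

If the number printed is lower than the plugin's -t value, the two timeouts are worth comparing before looking elsewhere.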
The server is using the same NIC and connection as previously, so I am not sure if it is due to moving from virtual to physical, or something else. I updated my check_snmp_win.pl script as well.
One other thing I have tried is the --v2c flag, to use SNMP v2c. The odd thing is that from the command line this sometimes returns a correct result more quickly, and sometimes fails. It is the only way I can get "ERROR: Alarm signal (Nagios time-out)" to show up from the command line. If I remove the flag, the check always works.
Here is an output of commands run one after the other:
[root@nagios plugins]# ./check_snmp_win.pl -H <<removed>> -n "IIS Admin Service" -C public --v2c -r -t 60
ERROR: Alarm signal (Nagios time-out)
[root@nagios plugins]# ./check_snmp_win.pl -H <<removed>> -n "IIS Admin Service" -C public -r -t 60
1 services active (named "IIS Admin Service") : OK
[root@nagios plugins]# ./check_snmp_win.pl -H <<removed>> -n "IIS Admin Service" -C public --v2c -r -t 60
ERROR: Alarm signal (Nagios time-out)
[root@nagios plugins]# ./check_snmp_win.pl -H <<removed>> -n "IIS Admin Service" -C public -r -t 60
1 services active (named "IIS Admin Service") : OK
So SNMP v1 always works on the command line (whenever I have tested it), v2c either works more quickly or gives the error, and both can give the error when run through Nagios.
I am running out of ideas as to what the cause could be and what the solution may be. Any help would be much appreciated. I have only been using Nagios for a week or so, so I am pretty new to it.
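To put a number on how often each SNMP version times out from the command line, a small loop like the one below could help. This is only a sketch: count_timeouts is a made-up helper name, HOST is a placeholder for the monitored server's address, and the plugin flags in the example comments are the same ones used in the commands above.

```shell
#!/bin/sh
# Run a command several times and print how many runs produced the
# "Alarm signal" time-out message.
# Usage: count_timeouts <runs> <command> [args...]
# count_timeouts is a hypothetical helper name used for illustration.
count_timeouts() {
    runs="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr; the command's exit status is ignored.
        output=$("$@" 2>&1)
        case "$output" in
            *"Alarm signal"*) failures=$((failures + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "$failures"
}

# Example (HOST is a placeholder for the monitored server's address):
# count_timeouts 20 ./check_snmp_win.pl -H "$HOST" -n "IIS Admin Service" -C public -r -t 60
# count_timeouts 20 ./check_snmp_win.pl -H "$HOST" -n "IIS Admin Service" -C public --v2c -r -t 60
```

Comparing the two counts over the same number of runs would show whether the failures really are limited to v2c on the command line.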