ERROR: General time-out (Alarm signal) ?

Posted: Thu Dec 13, 2018 10:08 am
by xpertech
We have a lot of monitors on our Nagios XI server, covering both local and overseas sites. Sometimes the "status" shows "unknown" and the "status information" shows "General time-out (Alarm signal)". What could be the problem, and how can we fix it?

Re: ERROR: General time-out (Alarm signal) ?

Posted: Thu Dec 13, 2018 2:35 pm
by benjaminsmith
Hi @xpertech,

Since this is an intermittent issue, it's likely that the problem is related to network connectivity. The Nagios XI server is probably losing its connection to the remote host, and those checks then generate a timeout error.

When this is happening, can you ping one of those hosts from the Nagios XI server to verify the connection?

Re: ERROR: General time-out (Alarm signal) ?

Posted: Thu Dec 13, 2018 6:36 pm
by xpertech
But when the error happens, ping works normally?!

Re: ERROR: General time-out (Alarm signal) ?

Posted: Fri Dec 14, 2018 10:55 am
by benjaminsmith
Hi @xpertech,

OK. What check command are you using for those services? The reason I ask is that if you are using SNMP to check those services, we've seen this type of behavior when there is UDP packet loss on the network.

Can you post the check command or PM your system profile?

You can try increasing the timeout value for the plugin. However, if you're experiencing UDP packet loss due to network issues, you'll most likely still get the same error message.

Edit the SNMP commands and add the following to the command line:

Code:

-t 60

Nagios XI - How To Test Check Commands From The Command-line
https://support.nagios.com/kb/article/n ... e-167.html
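
To illustrate why a longer timeout alone may not help when UDP datagrams are being dropped, here is a rough Python sketch. It assumes every packet is lost independently with the same probability and that an attempt succeeds only when both the request and the reply arrive. The names check_failure_probability, packet_loss, and attempts are made up for this example, and real loss is often bursty, so treat the numbers as an illustration only.

```python
def check_failure_probability(packet_loss: float, attempts: int) -> float:
    """Probability that every attempt of one SNMP request fails.

    Assumption: each datagram is lost independently with probability
    `packet_loss`, and an attempt succeeds only if both the request
    and the response arrive.
    """
    if not 0.0 <= packet_loss <= 1.0:
        raise ValueError("packet_loss must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # One attempt fails unless both directions get through.
    attempt_fails = 1.0 - (1.0 - packet_loss) ** 2
    # The check fails only if all attempts fail.
    return attempt_fails ** attempts


if __name__ == "__main__":
    for loss in (0.01, 0.05, 0.20):
        for attempts in (1, 3, 6):
            p = check_failure_probability(loss, attempts)
            print(f"loss={loss:.0%} attempts={attempts} -> check fails {p:.4%}")
```

Note that the timeout does not appear in the formula. Under these assumptions, waiting longer for a datagram that was dropped changes nothing, while each additional attempt lowers the chance that the whole check fails. A longer timeout still helps when the reply is merely slow rather than lost.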

Re: ERROR: General time-out (Alarm signal) ?

Posted: Wed Dec 19, 2018 4:47 am
by xpertech
Increasing the timeout value doesn't seem to change much, but when we changed the SNMP version from v2 to v1, the number of "unknown" statuses dropped, although some still remain. Why would the SNMP version affect the number?!

Re: ERROR: General time-out (Alarm signal) ?

Posted: Wed Dec 19, 2018 10:59 am
by benjaminsmith
Hi @xpertech,

Try running the checks for those hosts from the command line while increasing the timeout settings to see how long they take. It looks like increasing the timeouts may resolve the issue.

Once you determine how long they take, you can increase the default timeout settings (below) in /usr/local/nagios/etc/nagios.cfg:

Code:

host_check_timeout=30
service_check_timeout=60

What's with the different SNMP versions? v1, v2c, v3
https://www.logicmonitor.com/blog/whats ... s1-v2c-v3/
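
One way to measure how long a check takes is a small wrapper like the Python sketch below. It is only an illustration: time_check is a hypothetical helper, not part of Nagios XI, and the command you pass in is whatever check command you already run by hand. An exit code of None means the command did not finish within the limit you gave.

```python
import subprocess
import sys
import time


def time_check(argv, timeout_s):
    """Run one check command and return (elapsed_seconds, exit_code).

    exit_code is None when the command did not finish within timeout_s.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
        code = result.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


# Example call (the snmpwalk arguments are placeholders, adjust to your host):
# elapsed, code = time_check(
#     ["snmpwalk", "-v", "2c", "-c", "public", "X.X.X.X:161"], 120)
# print(f"{elapsed:.2f}s, exit code {code}")
```

Running this several times during a period when the GUI shows Unknown gives a rough idea of how far above the current timeout the slow runs are, which is the number you would need before raising service_check_timeout.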

Re: ERROR: General time-out (Alarm signal) ?

Posted: Thu Dec 27, 2018 11:26 am
by xpertech
There are still some Unknown statuses after increasing the timeout value.
Is there a test we can run to verify whether the Unknown statuses are caused by a network issue?

Re: ERROR: General time-out (Alarm signal) ?

Posted: Thu Dec 27, 2018 12:36 pm
by ssax
One of the things that I've noticed is that the SNMP daemon on devices seems to get the lowest process priority, so if your system is overloaded it may take a lot longer to poll the SNMP data (or it might not respond at all because it's too overloaded).

Increasing the timeout fixed some of them, which suggests that increasing it further may fix even more.

This could be because of the load on the remote device OR it could be a network issue. More than likely the remote machine is simply not responding before the check timeout expires, and that is what produces the timeout error.

Are you monitoring the other metrics on those systems such as CPU/memory? Is the resource usage high on those machines when this occurs?

Generally, when I'm troubleshooting this, I manually run the check commands from the command line to see how long they take, or run an snmpwalk against the device while the problem is occurring. Most of the time the remote device turns out to be performing poorly when the commands are run during that window. Do you see anything in the logs on the remote devices that indicates an issue?

Code:

snmpwalk -v 2c -c public X.X.X.X:161
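
To compare network reachability and SNMP responsiveness at the same moment, a paired probe along the lines of the Python sketch below might help. It is a heuristic under stated assumptions, not a diagnosis: command_ok and classify are hypothetical helpers, the ping and snmpwalk arguments are placeholders, and a successful ping (ICMP) does not rule out loss of SNMP traffic (UDP).

```python
import subprocess


def command_ok(argv, timeout_s):
    """Return True if the command exits with status 0 within timeout_s."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or the command could not be started at all.
        return False
    return result.returncode == 0


def classify(ping_ok, snmp_ok):
    """Rough reading of one paired sample; a heuristic only."""
    if ping_ok and snmp_ok:
        return "both fine"
    if ping_ok:
        return "SNMP only: slow agent or UDP-specific loss"
    if snmp_ok:
        return "ping only: ICMP may be filtered or deprioritized"
    return "both failed: network path or host down"


# Example pairing (placeholders, adjust host, community, and timeouts):
# ping_ok = command_ok(["ping", "-c", "3", "X.X.X.X"], 10)
# snmp_ok = command_ok(
#     ["snmpwalk", "-v", "2c", "-c", "public", "X.X.X.X:161"], 60)
# print(classify(ping_ok, snmp_ok))
```

Collecting these pairs every minute or so and lining them up with the times the GUI reports Unknown should show which of the four cases dominates.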

Re: ERROR: General time-out (Alarm signal) ?

Posted: Fri Dec 28, 2018 6:24 pm
by xpertech
None of the local hosts return Unknown status; only the overseas hosts (America and Europe) do, so it doesn't look like a load or performance issue on the Nagios XI server.

When the GUI showed Unknown status, we ran the snmpwalk command and everything looked normal. At the same time, the GUI disk and RAM checks showed host-A as normal but host-B as Unknown, while the CPU check showed host-A as Unknown and host-B as normal, so it doesn't look like a network issue either?! We also checked the load on host-A and host-B, and neither seems to be heavily loaded.

Re: ERROR: General time-out (Alarm signal) ?

Posted: Fri Dec 28, 2018 6:25 pm
by xpertech
attachment