
Unable to determine CPU load on host

Posted: Fri Jul 20, 2012 1:44 pm
by ivankgb
Running Nagios 3.2.0 and sporadically receiving "Unable to determine CPU load on host" messages for different hosts. I have verified that SNMP traffic flows correctly between Nagios and the hosts.
Could this be related to the load on the Nagios server itself?
localhost;Current Load;CRITICAL;SOFT;1;CRITICAL - load average: 14.27, 12.30, 6.14

Although, based on the logs, it also happens when the load on the Nagios server is normal.
Any ideas/hints would be much appreciated!

Thank you!

ivankgb

Re: Unable to determine CPU load on host

Posted: Mon Jul 23, 2012 8:14 pm
by jsmurphy
I think what you are experiencing is probably a small amount of packet loss. SNMP runs over UDP, which is connectionless. In simple terms that means "I'm going to send this data and don't particularly care if it gets there or not".
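
To illustrate why a lost UDP datagram shows up as a failed check, and why retrying hides occasional loss, here is a minimal Python sketch. `send_query` is a hypothetical stand-in for a single SNMP GET (it is not a real library call), and the loss arithmetic assumes each attempt is lost independently, which real networks only approximate because loss tends to come in bursts.

```python
import time


def query_with_retries(send_query, retries=3, delay=0.0):
    """Try a query up to retries + 1 times.

    send_query is a hypothetical callable representing one SNMP GET.
    It should return None or raise TimeoutError when no reply arrives.
    Returns (result, attempts_used); result is None if every attempt failed.
    """
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        try:
            result = send_query()
        except TimeoutError:
            result = None  # treat a timeout like a lost datagram
        if result is not None:
            return result, attempt
        if attempt < attempts and delay > 0:
            time.sleep(delay)  # optional pause before the next attempt
    return None, attempts


def chance_all_attempts_lost(loss_rate, attempts):
    # Assumption: each attempt is lost independently with the same probability.
    return loss_rate ** attempts
```

Under that independence assumption, a 10% loss rate with three attempts leaves roughly a 0.1% chance that a check fails outright, so it may be worth checking whether your SNMP plugin has retry and timeout settings.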

This could be caused by network congestion, low QoS priority, hitting an errant ACL on a multi-pathed network (highly unlikely), exceeding the packet TTL, or the receiving device being under high load and delaying or failing to process the query. What I am getting at is that the Nagios server itself is unlikely to be the cause in this particular case, unless there's a bug in the script that causes some weird behaviour or the server's network card is congested and dropping packets.

If you are seeing it in short outages of less than 5 or 10 minutes once a week on a couple of checks, I probably wouldn't worry about it too much. If it's more frequent, I would speak to your network team.
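
To judge how frequent the failures really are, one option is to count them per host from the Nagios log. The sketch below assumes the usual `[epoch] SERVICE ALERT: host;service;state;type;attempt;output` line layout in nagios.log. The function name and the `since` parameter are made up for illustration.

```python
import re
from collections import Counter

# [epoch] SERVICE ALERT: host;service;state;type;attempt;plugin output
ALERT = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);(.*)$"
)


def count_failures(lines, needle="Unable to determine CPU load", since=0):
    """Count matching service alerts per host.

    lines  : iterable of log lines (for example an open nagios.log file)
    needle : text to look for in the plugin output
    since  : only count alerts with an epoch timestamp >= this value
    """
    counts = Counter()
    for line in lines:
        match = ALERT.match(line.strip())
        if not match:
            continue  # not a service alert line
        timestamp, host, output = int(match.group(1)), match.group(2), match.group(7)
        if timestamp >= since and needle in output:
            counts[host] += 1
    return counts
```

Passing an open log file as `lines` and an epoch value from a week ago as `since` gives a rough per-host failure count to compare against the "once a week" rule of thumb.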