
Dell ESXi Host Hardware Status

Posted: Wed Nov 07, 2018 1:13 pm
by parweez
I am seeing this error on 2 out of 6 of my hosts, which are Dell PowerEdge M620s. Does anyone have a clue as to why this is happening?

Re: Dell ESXi Host Hardware Status

Posted: Wed Nov 07, 2018 5:00 pm
by npolovenko
@parweez, Is this an SNMP-based check? These timeouts could indicate either a very slow response from the server or some kind of networking issue. I'd start by increasing the timeout value for this check from 60 to 120 seconds and see if the errors go away. You'd also need to increase the service check timeout in /usr/local/nagios/etc/nagios.cfg, otherwise Nagios may still cut the check off before the plugin's own timeout is reached.
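As a rough way to confirm that the two timeouts agree, here is a minimal Python sketch. It assumes the relevant directive in nagios.cfg is `service_check_timeout` and that 60 seconds applies when the directive is absent. The function names are made up for illustration, and the plugin timeout is whatever value you pass to the plugin yourself.

```python
import re


def read_service_check_timeout(cfg_text, default=60):
    """Return the service_check_timeout value (seconds) found in nagios.cfg text.

    The default of 60 is an assumption used when the directive is missing.
    """
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        match = re.fullmatch(r"service_check_timeout\s*=\s*(\d+)", line)
        if match:
            return int(match.group(1))
    return default


def timeout_is_consistent(cfg_text, plugin_timeout):
    """True if Nagios waits at least as long as the plugin's own timeout."""
    return read_service_check_timeout(cfg_text) >= plugin_timeout
```

To use it, read /usr/local/nagios/etc/nagios.cfg into a string and call `timeout_is_consistent(text, 120)`. A result of `False` would suggest that Nagios still kills the check before the plugin's 120 seconds are up.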

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 9:49 am
by parweez
It is using SNMP. If it's a network issue, how come the plugin works most of the time and only fails for very short intervals?

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 11:28 am
by npolovenko
@parweez, Sometimes a flooded network causes communication between two servers to drop in and out. Or the monitored server might be getting overloaded and stop responding to SNMP queries. I'd try turning on SNMP debugging on your Dell server and looking for any errors.
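If it helps to tell a slow host apart from a flaky network, one option is to run the same query repeatedly from the Nagios server and record how long each attempt takes. The Python sketch below is only an illustration: `probe` and `summarize` are hypothetical helper names, and the command you pass in (for example an `snmpget` call against the host) is up to you.

```python
import subprocess
import time


def probe(command, attempts=5, timeout=10.0, pause=0.0):
    """Run command repeatedly; return a list of (elapsed_seconds, status)."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            completed = subprocess.run(
                command, capture_output=True, timeout=timeout
            )
            status = "ok" if completed.returncode == 0 else "error"
        except subprocess.TimeoutExpired:
            # The command ran longer than the allowed timeout.
            status = "timeout"
        except OSError:
            # The command could not be started at all (e.g. not installed).
            status = "error"
        results.append((time.monotonic() - start, status))
        time.sleep(pause)
    return results


def summarize(results):
    """Return (count per status, slowest elapsed time in seconds)."""
    counts = {}
    for _, status in results:
        counts[status] = counts.get(status, 0) + 1
    slowest = max((elapsed for elapsed, _ in results), default=0.0)
    return counts, slowest
```

For example, `probe(["snmpget", "-v2c", "-c", "public", "esxi-host", "1.3.6.1.2.1.1.3.0"], attempts=100, pause=5)` would poll sysUpTime every few seconds. The community string and host name here are placeholders. A handful of timeouts among otherwise fast replies would point toward intermittent drops, while uniformly slow replies would point toward the host itself.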

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 12:05 pm
by parweez
Do you know how to do SNMP debugging on Dell PowerEdge M620s?

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 12:57 pm
by npolovenko
@parweez, I don't have a Dell PowerEdge server in the lab, but it looks like the Dell Troubleshooting Tool might help with SNMP debugging. I might be wrong. The only other option is to check the owner's manual. Link to the tool:
https://www.dell.com/support/article/us/en/04/sln311059/dell-troubleshooting-tool?lang=en

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 2:43 pm
by parweez
The Dell Troubleshooting Tool just tests SNMP. I used it and it worked on all of the hosts. I also looked at the manual and it didn't have any information on SNMP.

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 4:52 pm
by parweez
Do you have any other ideas?

Re: Dell ESXi Host Hardware Status

Posted: Thu Nov 08, 2018 5:12 pm
by npolovenko
@parweez, Yes, let's try increasing the plugin timeout to 120 seconds. You'd also need to increase the service check timeout in the /usr/local/nagios/etc/nagios.cfg file. Please show me the service definition and the command definition for this check.
PS: I couldn't find any information on how to enable SNMP logging either. If increasing the plugin's timeout doesn't help, I recommend contacting the manufacturer with this question.