How do I diagnose 'socket timeout' issues?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: How do I diagnose 'socket timeout' issues?

Post by scottwilkerson »

globalive.nagios wrote: One other note...I don't see any notifications sent out in the log (maybe just a config issue...?).
Possibly. By default, a potential problem is rechecked every minute, for up to 5 check attempts, before a notification is sent. Of course, your setup may not be using the defaults.
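To put a rough number on that delay, here is a minimal sketch of the arithmetic. It assumes the first failed check counts as attempt 1, so only the remaining attempts are rechecks, and that intervals are in minutes. The function name is made up for illustration; the parameter names mirror the max_check_attempts and retry_interval settings, and the values should be replaced with whatever your own configuration uses.

```python
def minutes_until_notification(max_check_attempts=5, retry_interval=1.0):
    """Estimate the delay between the first failed check and the first notification.

    Assumption: the first failed check is attempt 1, so the problem is
    confirmed after (max_check_attempts - 1) rechecks.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    rechecks = max_check_attempts - 1
    return rechecks * retry_interval


# With the defaults described above (5 attempts, 1 minute apart):
print(minutes_until_notification())  # 4.0
```

If a problem clears itself within that window, no notification would be expected, which may be why none show up in the log.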
Former Nagios employee
globalive.nagios

Re: How do I diagnose 'socket timeout' issues?

Post by globalive.nagios »

Ah, yeah, that would make sense. Thanks for clearing that part up.

Any ideas on the rest of it?
scottwilkerson

Re: How do I diagnose 'socket timeout' issues?

Post by scottwilkerson »

globalive.nagios wrote: Okie dokie, I'll look more closely into DNS issues next time something comes up, and talk to the department about any peculiarities of their network. That should go over well. :D
If this only happens to one department, I would have that talk with them about their network.

It could be a malfunctioning router, switch, hub, etc. The error you describe could be caused by heavy packet loss on a switch serving that department, which would explain why it only happens to them.
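One way to gather evidence before that conversation is to open a TCP connection to the affected service repeatedly from the monitoring server and count how often it times out. The following is only a sketch using Python's standard socket module; the function name, the result keys, and the host and port in the usage comment are placeholders, not anything taken from this thread.

```python
import socket
import time


def probe_tcp(host, port, attempts=5, timeout=10.0, pause=0.0):
    """Try to open a TCP connection several times and tally the outcomes."""
    results = {"ok": 0, "timeout": 0, "error": 0, "latencies": []}
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection raises socket.timeout if the connect stalls
            with socket.create_connection((host, port), timeout=timeout):
                results["ok"] += 1
                results["latencies"].append(time.monotonic() - start)
        except socket.timeout:
            results["timeout"] += 1
        except OSError:
            # refused, unreachable, DNS failure, and so on
            results["error"] += 1
        time.sleep(pause)
    return results


# Example (hypothetical host and port):
# print(probe_tcp("192.0.2.10", 5666, attempts=20, pause=1.0))
```

Running the same probe against a host in another department gives a baseline to compare against. Frequent timeouts on one side only would point toward the network path; slow but successful connections would point more toward the target host itself.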
globalive.nagios

Re: How do I diagnose 'socket timeout' issues?

Post by globalive.nagios »

Unfortunately the two hosts that come up with these errors most often are on different subnets, although both are VMs.

What I find most perplexing is that only one service will report a timeout - the others on the host are reporting fine. What would explain that?
globalive.nagios

Re: How do I diagnose 'socket timeout' issues?

Post by globalive.nagios »

Haha, okay, I think we have a solution. Once I clued in that these were both VMs, I checked out the VM performance, and sure enough, memory balloon was at 5-10%!! (For those not in the know, memory ballooning means the hypervisor is reclaiming RAM from the guest, which can push the guest into paging to disk, and that is REALLY bad for performance.)

At this point I think we can consider this a non-issue until we fix that issue. At least now you have another option for your troubleshooting list!
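For anyone wanting to confirm the same thing from inside the guest, here is a small sketch that assumes a Linux VM. It compares the pswpin and pswpout counters from /proc/vmstat (pages swapped in and out since boot) between two readings. The function names are made up for illustration.

```python
def parse_swap_counters(vmstat_text):
    """Extract the pswpin/pswpout counters from the text of /proc/vmstat."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two readings."""
    return {
        name: after.get(name, 0) - before.get(name, 0)
        for name in ("pswpin", "pswpout")
    }


# Example on a Linux guest: take two readings a minute apart.
# import time
# with open("/proc/vmstat") as f:
#     first = parse_swap_counters(f.read())
# time.sleep(60)
# with open("/proc/vmstat") as f:
#     second = parse_swap_counters(f.read())
# print(swap_delta(first, second))
```

Deltas that stay at zero suggest the guest is not swapping during that window. Steadily climbing values while checks are timing out would fit the ballooning explanation above.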


"Question 5b: Does your VM performance suck?"
Locked