No buffer space available
Posted: Thu May 03, 2012 8:07 am
I just deployed a new server to upgrade a Nagios 2.x system to Nagios 3.x. It's monitoring 1300 hosts and about 3400 services. The old operating system is AIX, but I'm replacing it with Linux. Checks are scheduled every 5 minutes, and once it gets rolling I'm getting random services flapping. Typically about 150 are in a flapping state at the same time, but which ones are affected is completely random.
I was thinking the system is running out of available TCP connections, but looking at netstat and the files in /proc/sys/net/core, the number of connections looks relatively low. I'm not really familiar with this area of Linux, so I certainly could be missing something.
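For anyone who wants to repeat that check, here is a rough sketch of how the two things mentioned above could be dumped in one go. The function names `dump_tunables` and `count_tcp_states` are made up for illustration, and the state parsing assumes the usual `netstat -ant` column layout where the TCP state is the last field.

```shell
#!/bin/sh
# Print every readable tunable under a sysctl directory as "name = value".
# The directory defaults to /proc/sys/net/core; pass another path to inspect it.
# (dump_tunables is a hypothetical helper name, not part of any tool.)
dump_tunables() {
    dir="${1:-/proc/sys/net/core}"
    for f in "$dir"/*; do
        # Skip subdirectories and entries the current user cannot read.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(tr '\t' ' ' < "$f" 2>/dev/null)"
    done
}

# Count sockets per TCP state from "netstat -ant"-style text on stdin.
# Assumes the state is the last column of every line starting with "tcp".
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Possible usage: `dump_tunables` to list the limits, and `netstat -ant | count_tcp_states` to see how many connections sit in each state. This only shows socket-level numbers, so it may well miss whatever buffer is actually filling up.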
Anyone ever seen this?
The ones that are flapping are going to unknown, unable to determine plugin output. I've verified from the command line that I'm seeing the same thing: the same ping command sometimes fails and sometimes works.
Code:
/bin/ping -n -U -w 30 -c 5 xxx.yyy.zzz
connect: No buffer space available
/bin/ping -n -U -w 30 -c 5 xxx.yyy.zzz
PING xxx.yyy.zzz (x.x.x.x) 56(84) bytes of data.
64 bytes from x.x.x.x: icmp_seq=1 ttl=254 time=0.456 ms
64 bytes from x.x.x.x: icmp_seq=2 ttl=254 time=0.398 ms
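To put a number on how often the command fails, a small loop along these lines could rerun it and count the non-zero exits. This is only a sketch; `count_failures` is a name invented here, and it treats any non-zero exit status as a failure, which lumps "No buffer space available" together with ordinary packet loss.

```shell
#!/bin/sh
# Run a command N times and report how many runs exited non-zero.
# Usage: count_failures N command [args...]
# (count_failures is a hypothetical helper, not part of Nagios or ping.)
count_failures() {
    runs="$1"
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed of $runs runs failed"
}
```

For example, `count_failures 20 /bin/ping -n -U -w 30 -c 5 xxx.yyy.zzz` would repeat the check above twenty times.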
My first thought was memory, but it seems fine:
Code:
Mem: 4043832k total, 3048080k used, 995752k free, 161512k buffers
Swap: 6094840k total, 76k used, 6094764k free, 482860k cached