Need to monitor as many Linux kernel variables as possible
Posted: Tue Aug 13, 2013 11:07 am
Hi,
IDK if this is the right place to ask, but the Nagios Community Manager on FB sent me here, so here I go:
I have a complex, high-traffic web environment with reverse proxies, web servers, application servers, and a lot of other components. I use Nagios to monitor everything: databases, web servers, proxy servers, OS, etc. Recently one node started receiving A LOT of traffic (300 GiB/day). Everything is running smoothly, except that from time to time the reverse proxy reports the following error:
Service Warning[08-12-2013 16:05:30] SERVICE ALERT: QROPC2FEDGE06;Web Container;WARNING;SOFT;1;HTTP WARNING: HTTP/1.1 400 Proxy Error: Unable to connect to remote host "2WEB06" or host not responding - URL "http://WEB06/wps/portal/!ut/p/a1/04_Sj9 ... Kd3R09", errno: 111 - 3012 bytes in 3.002 second response time
This error comes directly from iptables. I know this because I replaced reject-with icmp-host-prohibited with reject-with icmp-port-unreachable in the rule. It only happens from time to time, so it is not a misplaced rule. I compared net.netfilter.nf_conntrack_count against net.netfilter.nf_conntrack_max, and the limit is not being exceeded.
So... could anyone here give me some pointers on what to monitor?
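The conntrack comparison mentioned above could be turned into a Nagios-style check along these lines. This is only a sketch under assumptions: the function names and the 80%/90% thresholds are made up for illustration, and it assumes the nf_conntrack module is loaded so that the two files exist under /proc/sys/net/netfilter.

```python
import sys

# Standard Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Where the kernel exposes the net.netfilter.* sysctls
PROC_DIR = "/proc/sys/net/netfilter"


def read_int(path):
    """Read a single integer value from a /proc file."""
    with open(path) as handle:
        return int(handle.read().strip())


def evaluate(count, maximum, warn_pct=80.0, crit_pct=90.0):
    """Map conntrack table usage to a (exit_code, message) pair.

    warn_pct and crit_pct are illustrative thresholds, not recommendations.
    """
    if maximum <= 0:
        return UNKNOWN, "CONNTRACK UNKNOWN - nf_conntrack_max is %d" % maximum
    used_pct = 100.0 * count / maximum
    if used_pct >= crit_pct:
        code, label = CRITICAL, "CRITICAL"
    elif used_pct >= warn_pct:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    message = "CONNTRACK %s - %d/%d entries (%.1f%%)" % (
        label, count, maximum, used_pct)
    return code, message


def main():
    """Read the live counters, print one status line, return the exit code."""
    try:
        count = read_int(PROC_DIR + "/nf_conntrack_count")
        maximum = read_int(PROC_DIR + "/nf_conntrack_max")
    except (OSError, ValueError) as exc:
        # Files are missing when the nf_conntrack module is not loaded
        print("CONNTRACK UNKNOWN - %s" % exc)
        return UNKNOWN
    code, message = evaluate(count, maximum)
    print(message)
    return code

# To run this as a plugin, end the file with: sys.exit(main())
```

Sampling the ratio on every check interval, rather than looking at it once by hand, would show whether the table gets close to the limit during the short bursts when the proxy error appears.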
Thanks in advance!