Need to monitor as many Linux kernel variables as possible

patricio.dorantes · Post by **patricio.dorantes** » Tue Aug 13, 2013 11:07 am

Hi,

I don't know if this is the right place to ask, but the Nagios Community Manager on Facebook sent me here, so here I go:

I have a complex web environment with a reverse proxy, web servers, application servers, and many other components that handle a lot of traffic. I use Nagios to monitor everything: databases, web servers, proxy servers, OS, etc. Recently one node started receiving A LOT of traffic (300 GiB/day). Everything runs smoothly except that, from time to time, the reverse proxy reports the following error:

Service Warning[08-12-2013 16:05:30] SERVICE ALERT: QROPC2FEDGE06;Web Container;WARNING;SOFT;1;HTTP WARNING: HTTP/1.1 400 Proxy Error: Unable to connect to remote host "2WEB06" or host not responding - URL "http://WEB06/wps/portal/!ut/p/a1/04_Sj9 ... Kd3R09", errno: 111 - 3012 bytes in 3.002 second response time

This error message comes directly from iptables. I know this because I replaced reject-with icmp-host-prohibited with reject-with icmp-port-unreachable. It only happens from time to time, so it is not a misplaced rule. I compared net.netfilter.nf_conntrack_count against net.netfilter.nf_conntrack_max, and the limit is not exceeded.
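The count-versus-max comparison above can be turned into a recurring check so that short spikes between manual looks are less likely to be missed. The following is a minimal sketch, assuming the conntrack sysctls are exposed under /proc/sys/net/netfilter; the function names and the 80%/90% thresholds are illustrative assumptions, not values from this thread.

```python
from pathlib import Path

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed location of the conntrack sysctls on a Linux host.
CONNTRACK_DIR = Path("/proc/sys/net/netfilter")


def conntrack_status(count, maximum, warn=0.80, crit=0.90):
    """Map conntrack table usage to a Nagios (exit_code, message) pair.

    The warn/crit ratios are hypothetical defaults chosen for illustration.
    """
    if maximum <= 0:
        return UNKNOWN, "CONNTRACK UNKNOWN - nf_conntrack_max is not positive"
    ratio = count / maximum
    if ratio >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif ratio >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Everything after "|" is Nagios performance data.
    message = (
        f"CONNTRACK {label} - {count}/{maximum} entries ({ratio:.1%})"
        f" | conntrack={count};;;0;{maximum}"
    )
    return code, message


def check_conntrack(directory=CONNTRACK_DIR, warn=0.80, crit=0.90):
    """Read both sysctl files and evaluate them; UNKNOWN if unreadable."""
    directory = Path(directory)
    try:
        count = int((directory / "nf_conntrack_count").read_text())
        maximum = int((directory / "nf_conntrack_max").read_text())
    except (OSError, ValueError) as exc:
        return UNKNOWN, f"CONNTRACK UNKNOWN - {exc}"
    return conntrack_status(count, maximum, warn, crit)
```

A wrapper script would print the message and exit with the returned code so that NRPE can report it. Because the value is only sampled at each check interval, a brief burst could still go unseen.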

So... could anyone here give me some direction on what to monitor?

Thanks in advance!

abrist · Post by **abrist** » Tue Aug 13, 2013 11:48 am

Error 400 usually indicates a bad or malformed request. Do you actually want to monitor kernel variables, HTTP requests and errors, or something else?
Kernel variables would be monitored by checking against /proc and /sys, while HTTP requests would require using tcpflow/tcpdump and/or additional GETs.
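As a rough illustration of checking against /proc, a dotted sysctl name can be mapped to a file under /proc/sys and read directly. This is only a sketch: the helper names are hypothetical, and the name-to-path mapping assumes that no component of the name contains a dot.

```python
from pathlib import Path

# Root of the sysctl tree on Linux.
PROC_SYS = Path("/proc/sys")


def sysctl_path(name, root=PROC_SYS):
    """Translate a dotted sysctl name into its file path under root."""
    # Simplification: assumes no component itself contains a dot
    # (an interface name such as eth0.100 would break this mapping).
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root=PROC_SYS):
    """Return the sysctl's value as a list of whitespace-separated fields."""
    return sysctl_path(name, root).read_text().split()


def check_threshold(value, warn, crit):
    """Return a Nagios exit code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if value >= crit:
        return 2
    if value >= warn:
        return 1
    return 0
```

For example, `read_sysctl("fs.file-nr")` returns three fields (allocated file handles, allocated but unused handles, and the system-wide maximum), and the first and last can be fed into `check_threshold` as a ratio.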

patricio.dorantes · Post by **patricio.dorantes** » Tue Aug 13, 2013 1:53 pm

abrist wrote:Error 400 usually indicates a bad or malformed request. Do you actually want to monitor kernel variables, HTTP requests and errors, or something else?
Kernel variables would be monitored by checking against /proc and /sys, while HTTP requests would require using tcpflow/tcpdump and/or additional GETs.

Thank you for your reply. The 400 is what the proxy returns when the backend reports an error. I would like to monitor kernel variables. I found that /proc/net/sockstat gives some information about sockets in use and their memory usage. I got something like: inuse 25 orphan 8 tw 548 alloc 99 mem 31, so it seems that socket memory usage is not the problem.

=) Thank you
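The /proc/net/sockstat output mentioned above is easy to turn into numbers that a check can compare against thresholds. Below is a minimal parsing sketch; the function names are hypothetical, the TCP line in the sample is the one quoted in this post, and the other sample lines are made up for illustration.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat content into {protocol: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        proto, sep, rest = line.partition(":")
        if not sep:
            continue  # skip blank or unexpected lines
        fields = rest.split()
        # Fields alternate between a counter name and its integer value.
        stats[proto.strip()] = {
            key: int(value) for key, value in zip(fields[::2], fields[1::2])
        }
    return stats


def read_sockstat(path="/proc/net/sockstat"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_sockstat(handle.read())


# TCP line taken from the post; the other lines are illustrative only.
SAMPLE = """sockets: used 120
TCP: inuse 25 orphan 8 tw 548 alloc 99 mem 31
UDP: inuse 4 mem 2
"""

tcp = parse_sockstat(SAMPLE)["TCP"]
print(f"time-wait={tcp['tw']} orphans={tcp['orphan']} mem_pages={tcp['mem']}")
```

The `tw` and `orphan` counters are the ones that may be worth graphing over time, for instance against net.ipv4.tcp_max_tw_buckets and net.ipv4.tcp_max_orphans, although nothing in this thread confirms that either limit is involved.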

abrist · Post by **abrist** » Tue Aug 13, 2013 4:04 pm

No problem. What type of usage does the server you are checking see on average? Error 400s can be hard to hunt down. Do any other checks against this server, or any of your other checks in general, fail with a 400 as well? Do the Nagios server and the checked host reside on the same network?

patricio.dorantes · Post by **patricio.dorantes** » Wed Aug 14, 2013 12:58 am

abrist wrote:No problem. What type of usage does the server you are checking see on average? Error 400s can be hard to hunt down. Do any other checks against this server, or any of your other checks in general, fail with a 400 as well? Do the Nagios server and the checked host reside on the same network?

Hi!
Yes, both servers reside on the same network. As a matter of fact, I have two different networks: one for monitoring (192.168.0.x) and one for data exchange between the proxy server and the HTTP server. Sorry if I didn't quite catch your first question. If I understood correctly: the server that reports the 400 has about 2.5k concurrent users, and the HTTP server behind it has about 150 worker threads busy and 2350 idle, so it is not a matter of available worker threads. The other NRPE checks don't fail. I measure I/O <1%, CPU <39%, mem <60%, HDD <65%, bandwidth between 1 MiBps and 4.5 MiBps, and NETSTATS with 200 < established < 1500.
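For reference, an established-connection count like the NETSTATS figure above can also be derived straight from /proc/net/tcp, where the fourth column holds the socket state as a hex code. This is a sketch only: the function name is hypothetical, the state table is deliberately partial, and the sample rows are invented for the example.

```python
from collections import Counter

# Partial map of kernel TCP state codes as shown in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def count_tcp_states(text):
    """Count sockets per state from /proc/net/tcp-style content."""
    counts = Counter()
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        state = fields[3].upper()
        # Unknown codes are kept as their raw hex value.
        counts[TCP_STATES.get(state, state)] += 1
    return counts


# Invented rows in the /proc/net/tcp layout, for illustration only.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001 1
   1: 0100A8C0:0050 0200A8C0:C350 01 00000000:00000000 00:00000000 00000000     0        0 1002 1
   2: 0100A8C0:0050 0300A8C0:C351 01 00000000:00000000 00:00000000 00000000     0        0 1003 1
   3: 0100A8C0:0050 0400A8C0:C352 06 00000000:00000000 00:00000000 00000000     0        0 0 3
"""

counts = count_tcp_states(SAMPLE)
print(dict(counts))
```

On a live host the same function can be fed the contents of /proc/net/tcp (IPv4) and /proc/net/tcp6 (IPv6), and the ESTABLISHED count compared with the 200 to 1500 range quoted above.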

It seems like everything is ok :S

abrist · Post by **abrist** » Wed Aug 14, 2013 10:23 am

Fair enough. You may want to increase the timeout on the check if it is relatively short. Sounds like things are behaving normally for now. We are here to help when you have future problems. Have a good week.

Nagios Support Forum
