Nagios Support Forum

Posted: **Thu May 21, 2015 12:40 pm**

Ok, now that this has been out longer and has more users I'd like to get a consensus. I talked about this before, but I want to approach the topic again.

How are people handling load balancing between all their log server nodes? We can do it via an F5 (which we are doing), so users browse to that IP and our systems send logs to that IP. We then of course get the benefits of the F5 load balancer, but the big drawback is that the F5 rewrites the source IP in the packet, so everything seems to be coming from it instead of the real sources.

We could use DNS round robin, but then what happens if a host goes down?

I've been researching clustering options for Linux, and most seem to be for application clusters, which we don't need for NLS since the nodes all talk to each other already. If you know a good option, point it out.

The other option would be to use the F5 just for browsing and set up the systems to send logs to either of the node IPs. The issue with that is that it's not true load balancing, and if a node goes down, we lose the logs from systems sending to it during the outage.

Ideas? Inputs?

Posted: **Thu May 21, 2015 1:18 pm**

I'd like to add HAProxy into the mix - though the configuration might be difficult. Not sure about the benefits or downfalls of HAProxy versus the F5 load balancer.

One thing to keep in mind is that even though elasticsearch will load balance the data on the backend, the inbound logs are not load balanced. If you were to send a billion logs to a single instance of logstash in a short period of time, you'd certainly knock it over or encounter some issues - and also face the risk of having a single point of failure.

I am not familiar with F5 balancers - do they have a transparent load balancing option? I understand that this is trickier to implement, but it could be worth looking into if you want the whole shebang. This way, you'd retain the source IP addresses of your end hosts as well as having logstash redundancy/load balancing.

That's my two cents.

Posted: **Thu May 21, 2015 1:40 pm**

jolson wrote:

> I'd like to add HAProxy into the mix - though the configuration might be difficult. Not sure about the benefits or downfalls of HAProxy versus the F5 load balancer.
>
> One thing to keep in mind is that even though elasticsearch will load balance the data on the backend, the inbound logs are not load balanced. If you were to send a billion logs to a single instance of logstash in a short period of time, you'd certainly knock it over or encounter some issues - and also face the risk of having a single point of failure.
>
> I am not familiar with F5 balancers - do they have a transparent load balancing option? I understand that this is trickier to implement, but it could be worth looking into if you want the whole shebang. This way, you'd retain the source IP addresses of your end hosts as well as having logstash redundancy/load balancing.
>
> That's my two cents.

Yes, there is a transparent option, but if we enabled that feature we'd have to change a million other things we've done for customers with the F5.

Posted: **Thu May 21, 2015 4:24 pm**

I assume it's not possible to disable source-IP re-writing based on destination addresses? For instance, if destination address were to equal one of the NLS nodes, don't re-write the source IP. Of course this is assuming that all traffic will flow through the F5 anyway.

I've been looking through options, and can't find one that absolutely shines above the rest. I typically like to recommend round-robin DNS, as it's easy to set up and requires little maintenance - but you do risk losing log information/web GUI connectivity if a node goes down. You could reduce the TTL to 60 seconds, but there's no guarantee that the DNS provider would respect that - plus clients would likely cache for longer than 60 seconds.
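To illustrate the node-down risk with round-robin DNS, here is a minimal Python sketch of how a sender could work around it on its own side: resolve every address published under the round-robin name, then try each in turn until one accepts the connection. This is only an assumption about what a custom sender could do, not something NLS or Logstash ships. The function names and the hostname and port in the usage note are hypothetical.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct (family, sockaddr) that DNS publishes for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    candidates = []
    for family, _type, _proto, _canon, sockaddr in infos:
        if (family, sockaddr) not in candidates:
            candidates.append((family, sockaddr))
    return candidates


def send_with_failover(candidates, payload, timeout=3.0):
    """Try each candidate in order and return the address that accepted payload."""
    last_error = None
    for family, sockaddr in candidates:
        try:
            with socket.socket(family, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                sock.sendall(payload)
                return sockaddr
        except OSError as exc:
            # Node is down or unreachable: remember why and try the next one.
            last_error = exc
    raise ConnectionError(f"no log node reachable: {last_error}")
```

Usage would look something like `send_with_failover(resolve_all("nls.example.com", 5544), b"test message\n")`, where both the name and the port are placeholders. Most syslog agents will not do this for you, which is why a node outage still tends to mean lost logs with plain round-robin DNS.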

I also understand that Logstash is working on a high availability mechanism, so there is hope for future improvement in this area: https://github.com/elastic/logstash/issues/2633

Anyone else have thoughts here?

Posted: **Tue May 26, 2015 7:59 am**

Some thoughts:

Do you have to use the same proxy that you're using for customers?  
Different IP for browsing than for sending logs (virtual interface if needed)
Light(er) weight Squid proxy just for NLS

Posted: **Tue May 26, 2015 8:07 am**

I was thinking about using HAProxy or something that I can install on the nodes and use a virtual IP between them. Do you have something that's simple to install? I have never installed any.

Posted: **Tue May 26, 2015 8:09 am**

I'd look into Squid.

Posted: **Tue May 26, 2015 1:42 pm**

We are testing out HA. We have set up a VIP on the NetScaler, and I am setting up my devices to send to the VIP. I am rebuilding one of my NLS boxes. By browsing to the VIP, I can get to my 2nd node.
I haven't configured any devices to send to the VIP yet.

Posted: **Wed May 27, 2015 8:40 am**

Dangit, most of this topic is useless now, as I found out it's not the F5 changing the source IP but the router that does the NAT between our data centers. So my question now: is there any harm in the nodes being far away from each other? Right now both are in Chicago. Can I move one to San Francisco so items there can just send directly to that node, and items in Chicago just send directly to the Chicago node? I was planning on sending logs from devices in SF to Chicago anyway, so syncing the two servers shouldn't be any more load, right? Anyone see an issue with that?

Posted: **Wed May 27, 2015 9:50 am**

This is something that you shouldn't have problems doing, but keep in mind that some of your settings will need to be more fine-tuned than your standard deployment since your nodes are so far apart.

Take a look at the following:
https://www.elastic.co/guide/en/elastic ... ster_nodes
Since your nodes will be further apart, it's more likely that there will be a connectivity issue. The minimum_master_nodes setting is very important here - it should be set to a quorum of your total number of nodes.
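As a quick illustration of what "a quorum" works out to, here is a small Python sketch using the usual majority formula, floor(N / 2) + 1, applied to the number of master-eligible nodes. The helper name is hypothetical; it only does the arithmetic and does not touch the cluster.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Return the majority quorum, floor(N / 2) + 1, for N master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for count in (2, 3, 5):
    print(count, "nodes -> quorum of", minimum_master_nodes(count))
# 2 nodes -> quorum of 2
# 3 nodes -> quorum of 2
# 5 nodes -> quorum of 3
```

Note that with only two nodes the quorum is 2, so if the link between the two sites drops, neither side can elect a master on its own. That protects against split brain, but it is worth weighing before splitting a two-node cluster across data centers.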

You might also take a look at the fault detection settings and set them appropriately: https://www.elastic.co/guide/en/elastic ... -detection

Since shards are migrated dynamically between nodes, your cluster will take up more bandwidth as your amount of log data grows - but that's a given.

Besides the above, you'll need to do some port forwarding: ports 9300-9400 (selected dynamically) are important for node-to-node communication.
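Once the forwarding is in place, one rough way to confirm that a node can reach its peer on the transport port is a plain TCP connect test. The sketch below is a generic check in Python, not an NLS tool. It assumes the peer is listening on 9300, the first port in the range above, and the hostname in the usage note is a placeholder.

```python
import socket


def transport_port_open(host, port=9300, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unroutable: treat all of these as "not reachable".
        return False
```

For example, running `transport_port_open("nls-sf.example.com")` from the Chicago node (hostname hypothetical) should return `True` if the NAT and firewall rules let the transport traffic through. A successful connect only shows the port is reachable, not that the nodes have actually joined the same cluster.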

Cluster IP or Load balance or ?