On 1/30/2013 6:45 AM, Lee, Peter wrote:
Upayavira,
Thank you for your response. I'm sorry my post is perhaps not clear... I am
relatively new to Solr and I'm not sure I'm using the correct nomenclature.
We did encounter the issue of one shard in the stripe going down while all of the
other servers continued to receive requests... and returned errors because of the
missing shard. We did in fact correct this problem by making our health check smart
enough to test all of the other servers in the stripe. That works very well and was
not hard at all to implement.
My intended question was entirely about performance. Perhaps if I am more
specific it will help.
We have 6 servers per "stripe" (meaning that a search request going to any one
of these servers also generates traffic on the other 5 servers in the stripe to fulfill
the request) and multiple stripes (for load and for redundancy). For this discussion,
though, let's assume we have only ONE stripe.
We currently have a load balancer that points to all 6 of the servers in our stripe. That
is, requests from "outside" can be directed to any server in the stripe.
The question is: Has anyone performed empirical testing to see if perhaps
having 2 or 3 servers (instead of all 6) on the load balancer improves
performance?
In this configuration, sure, not all servers can field requests from the "outside."
However, the total amount of "conversation" going on between the different servers will
also be lower, as distributed searches can now only originate from 2 or 3 servers in the stripe
(however many we attached to the load balancer).
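(For illustration: with Solr distributed search, each request carries a shards
parameter listing all six servers, so a query that lands on one server fans out
to the whole stripe. Hostnames, port, and core name below are made up.)

  # Hypothetical example of one outside request hitting server1; the shards
  # parameter is what generates the traffic to the other five servers.
  curl 'http://server1:8983/solr/core1/select' \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'shards=server1:8983/solr/core1,server2:8983/solr/core1,server3:8983/solr/core1,server4:8983/solr/core1,server5:8983/solr/core1,server6:8983/solr/core1'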
We can perform this testing, but it will take time, so I thought I'd ask if anyone has
done this already. I was hoping to find a mention of a "best practice"
somewhere regarding this type of question, but I have not found one yet.
I have a multi-server distributed Solr 3.5 installation behind a load
balancer (haproxy). The application and the load balancer are
completely unaware of the shards parameter; that's handled in Solr.
Here's how I've made that work:
The core with the shards parameter (we refer to it as a broker core)
exists on all servers. There are two servers for chain A and two
servers for chain B. Three of the seven shards live on idxa1/idxb1 and
four of the shards live on idxa2/idxb2. The "shards" parameter on both
chain A servers points only to chain A shards. The same goes for chain B.
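A rough sketch of the broker core's request handler on a chain A server, in
solrconfig.xml terms (the port and the shard core names s1 through s7 are
placeholders; the hostnames match the ones above):

  <!-- Broker core on the chain A servers: the handler carries the full
       chain A shard list, so the application never sends a shards
       parameter of its own. -->
  <requestHandler name="search" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="shards">idxa1:8983/solr/s1,idxa1:8983/solr/s2,idxa1:8983/solr/s3,idxa2:8983/solr/s4,idxa2:8983/solr/s5,idxa2:8983/solr/s6,idxa2:8983/solr/s7</str>
    </lst>
  </requestHandler>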
The ping handler's health check query contains shards and shards.qt
parameters, so the health check will fail if any of the shards for that
chain are down.
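The ping handler ends up looking something like this (again, the port, core
names, the query, and the shards.qt target are placeholders):

  <!-- Because the health check query is itself distributed across the
       chain A shards, the ping fails if any one of them is down. -->
  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="q">*:*</str>
      <str name="shards">idxa1:8983/solr/s1,idxa1:8983/solr/s2,idxa1:8983/solr/s3,idxa2:8983/solr/s4,idxa2:8983/solr/s5,idxa2:8983/solr/s6,idxa2:8983/solr/s7</str>
      <str name="shards.qt">standard</str>
    </lst>
  </requestHandler>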
The load balancer has idxa1 and idxb1 as primary equal cost entries. It
has idxa2 and idxb2 as backup entries, with idxa2 having the higher
weight. In normal operation, queries only go to idxa1 and idxb1.
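In haproxy terms the backend looks roughly like this (backend name, ports, and
the check URI are placeholders; the server roles and weights mirror the
description above):

  backend solr
    balance roundrobin
    # Health check hits the broker core's ping handler on each server.
    option httpchk GET /solr/broker/admin/ping
    # Primary entries, equal cost: normal traffic goes only to these two.
    server idxa1 idxa1:8983 check
    server idxb1 idxb1:8983 check
    # Backup entries, only used when both primaries are down; idxa2 gets
    # the higher weight.
    server idxa2 idxa2:8983 check backup weight 200
    server idxb2 idxb2:8983 check backup weight 100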
If any shard failure happens on either chain A server, both the idxa1
and idxa2 entries will be marked down by the health check and queries
will only go to chain B.
I can also disable these servers from the load balancer's perspective
using the admin UI. If idxb1 is disabled, all queries will go to idxa1
(which utilizes both idxa1 and idxa2). In that situation, if any chain
A failure were to happen but the chain B shards were all still fine,
idxb2 would still be marked up and the load balancer would send the
queries there.
The two index chains are independently updated - no replication. This
allows me to disable either idxa1 or idxb1 and completely rebuild (or
upgrade) the disabled chain while the other chain remains online. I can
then switch and do the same thing to the other chain, and the
application using Solr has no idea anything has happened.
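The same disable/enable steps can also be scripted against haproxy's admin
socket instead of clicking through the stats page (socket path and backend
name are placeholders, and the socket must be configured with "level admin"):

  # Take idxb1 out of rotation before rebuilding or upgrading chain B:
  echo "disable server solr/idxb1" | socat stdio /var/run/haproxy.sock

  # (rebuild or upgrade chain B while chain A serves all of the traffic)

  # Bring it back, then repeat the process against idxa1 for chain A:
  echo "enable server solr/idxb1" | socat stdio /var/run/haproxy.sock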
Thanks,
Shawn