On 1/30/2013 6:45 AM, Lee, Peter wrote:
Upayavira,

Thank you for your response. I'm sorry my post is perhaps not clear...I am 
relatively new to Solr and I'm not sure I'm using the correct nomenclature.

We did encounter the issue where one shard in the stripe goes down while the other 
servers continue to receive requests...and return errors because of the missing 
shard. We did in fact correct this problem by making our health check smart 
enough to test all of the other servers in the stripe. That works very well and 
was not hard at all to implement.

My intended question was one entirely about performance.  Perhaps if I am more 
specific it will help.

We have 6 servers per "stripe" (which means, a search request going to any one 
of these servers also generates traffic on the other 5 servers in the stripe to fulfill 
the request) and multiple stripes (for load and for redundancy). For this discussion 
though, let's assume we have only ONE stripe.

We currently have a load balancer that points to all 6 of the servers in our stripe. That 
is, requests from "outside" can be directed to any server in the stripe.

The question is: Has anyone performed empirical testing to see if perhaps 
having 2 or 3 servers (instead of all 6) on the load balancer improves 
performance?

In this configuration, sure, not all servers can field requests from the "outside." 
However, the total amount of "conversation" going on between the different servers will 
also be lower, as distributed searches can now only originate from 2 or 3 servers in the stripe 
(however many we attached to the load balancer).

We can perform this testing, but it will take time, so I thought I'd ask if anyone has 
done this already. I was hoping to find a mention of a "best practice" 
somewhere regarding this type of question, but I have not found one yet.

I have a multi-server distributed Solr 3.5 installation behind a load balancer (haproxy). The application and the load balancer are completely unaware of the shards parameter; that's handled in Solr. Here's how I've made that work:

The core with the shards parameter (we refer to it as a broker core) exists on all servers. There are two servers for chain A and two servers for chain B. Three of the seven shards live on idxa1/idxb1 and four of the shards live on idxa2/idxb2. The "shards" parameter on both chain A servers points only to chain A shards. The same goes for chain B.
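
To make the broker idea concrete, the handler on a chain A broker core looks roughly like this (hostnames, ports, and core names below are simplified for illustration, not my exact config):

  <requestHandler name="/search" class="solr.SearchHandler">
    <lst name="defaults">
      <!-- the distributed query fans out to every chain A shard core -->
      <str name="shards">idxa1:8983/solr/s1,idxa1:8983/solr/s2,idxa1:8983/solr/s3,idxa2:8983/solr/s4,idxa2:8983/solr/s5,idxa2:8983/solr/s6,idxa2:8983/solr/s7</str>
    </lst>
  </requestHandler>

The chain B broker has the same handler, pointed at the idxb1/idxb2 shard cores.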

The ping handler's health check query contains shards and shards.qt parameters, so the health check will fail if any of the shards for that chain are down.
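
A rough sketch of that ping handler (the query and shard list are illustrative, same caveat as above):

  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="defaults">
      <str name="q">*:*</str>
      <!-- distribute the health check query so any dead shard fails the ping -->
      <str name="shards">idxa1:8983/solr/s1,idxa1:8983/solr/s2,idxa1:8983/solr/s3,idxa2:8983/solr/s4,idxa2:8983/solr/s5,idxa2:8983/solr/s6,idxa2:8983/solr/s7</str>
      <str name="shards.qt">/search</str>
    </lst>
  </requestHandler>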

The load balancer has idxa1 and idxb1 as primary equal cost entries. It has idxa2 and idxb2 as backup entries, with idxa2 having the higher weight. In normal operation, queries only go to idxa1 and idxb1.
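
In haproxy terms, the backend looks something like this (server names, ports, and weights are illustrative):

  backend solr_brokers
    # the health check hits the broker core's ping handler
    option httpchk GET /solr/broker/admin/ping
    # primary entries - normal traffic is split between chain A and chain B
    server idxa1 idxa1:8983 check weight 100
    server idxb1 idxb1:8983 check weight 100
    # backup entries - only used when no primary entry is up
    server idxa2 idxa2:8983 check backup weight 75
    server idxb2 idxb2:8983 check backup weight 50

If the relative weights of the two backups are supposed to matter, 'option allbackups' would also be needed, if I remember the haproxy defaults correctly.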

If any shard failure happens on either chain A server, both the idxa1 and idxa2 entries will be marked down by the health check and queries will only go to chain B.

I can also disable these servers from the load balancer's perspective using the admin UI. If idxb1 is disabled, all queries will go to idxa1 (which utilizes both idxa1 and idxa2). In that situation, if any chain A failure were to happen but the chain B shards were all still fine, idxb2 would still be marked up and the load balancer would send the queries there.

The two index chains are independently updated - no replication. This allows me to disable either idxa1 or idxb1 and completely rebuild (or upgrade) the disabled chain while the other chain remains online. I can then switch and do the same thing to the other chain, and the application using Solr has no idea anything has happened.

Thanks,
Shawn
