On 12/9/2010 12:56 AM, patrick wrote:
I'm considering using more than 3 Solr shards and assigning a separate proxy to do the load balancing when indexing. I use SolrJ to do the indexing. The question is whether I get any information about which shard a document is stored in. This information would be helpful in case a specific shard has to be re-indexed (no indexing downtime, isolated recovery). I assume the HTTP response only contains the IP address of the proxy.

If you end up with a truly randomized shard selection at index time, I don't think there's any way to figure out which shard a given document ended up in. I'd love to be proven wrong.

In my case, I run my numeric document ID through "mod 6" to put it in shards numbered 0 through 5. I use load balancers, but only for searching. I normally do not need to know what shard a document came from, but I can easily calculate it at any time.
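The calculation can be sketched roughly as follows. This is only an illustration of the "mod 6" idea, not the actual indexing code: the class name ShardRouter, its method names, and the shard URLs in the usage note are all made up, and it assumes the document ID is available as a long.

```java
import java.util.List;

// Hypothetical helper illustrating deterministic shard selection by ID.
final class ShardRouter {
    private ShardRouter() {}

    // Maps a numeric document ID to a shard number in [0, shardCount).
    // floorMod keeps the result non-negative even for negative IDs.
    static int shardFor(long docId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        return (int) Math.floorMod(docId, (long) shardCount);
    }

    // Picks the shard URL for a document from an ordered list of shard URLs.
    // The position in the list is the shard number.
    static String shardUrlFor(long docId, List<String> shardUrls) {
        return shardUrls.get(shardFor(docId, shardUrls.size()));
    }
}
```

With six placeholder URLs such as http://shard0.example/solr through http://shard5.example/solr in the list, document 13 maps to shard 1 (13 mod 6 = 1). The indexing client would send the document straight to that URL instead of going through the proxy, and the same function tells you later which shard to re-index. Note that changing the number of shards changes the mapping for most IDs, so the shard count has to stay fixed or the data has to be redistributed.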

Shawn
