Just to be completely clear: the program that splits your index into 20 shards
should use this algorithm as well.
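
For illustration, here is a minimal Python sketch of the hash scheme quoted
below. The names shard_for and NUM_SHARDS are made up for this example and are
not part of Solr, and it assumes the unique keys are strings encoded as UTF-8.

```python
import hashlib

NUM_SHARDS = 20  # from this thread: the index is split into 20 shards


def shard_for(unique_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document's unique key to a shard index in [0, num_shards)."""
    # Hex MD5 digest of the key (assumption: keys are UTF-8 strings).
    digest = hashlib.md5(unique_key.encode("utf-8")).hexdigest()
    # Numeric values (0..15) of the first three hex characters.
    c1, c2, c3 = (int(ch, 16) for ch in digest[:3])
    # 256 * first + 16 * second + third, i.e. int(digest[:3], 16), in 0..4095.
    nr = 256 * c1 + 16 * c2 + c3
    return nr % num_shards


if __name__ == "__main__":
    for key in ("doc-1", "doc-2", "doc-3"):
        print(key, "->", shard_for(key))
```

Both the splitter and whatever code sends updates would call the same function,
so an update goes straight to the one shard that holds the document. Since 4096
is not a multiple of 20, shards 0 to 15 each get 205 of the 4096 possible values
and shards 16 to 19 get 204, which is a negligible imbalance.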


2010/8/9 Geert-Jan Brits <gbr...@gmail.com>

> I'm not sure if Solr has built-in support for sharding functions, but
> you should generally use a hashing algorithm to split the indices and use
> the same hash algorithm to locate which shard contains a document.
> http://en.wikipedia.org/wiki/Hash_function
>
> Without employing any domain knowledge (of documents you might want to
> group together on a single shard for performance), you could build a very
> simple (crude) hash function by MD5-hashing the unique keys of your
> documents, taking the first 3 hex chars of the digest (should be precise
> enough that load is pretty much balanced), calculating a number from the
> chars' hex values (256 * first char + 16 * 2nd char + 3rd char), and taking
> that number modulo 20. That gives you a number in [0,20), which is the
> shard index.
>
> Use the same algorithm to determine which shard contains the document that
> you want to change.
>
> Geert-Jan
>
>
> 2010/8/9 lu.rongbin <lu.rong...@goodhope.net>
>
>
>>    My index has 76 million documents. I split it into 20 indexes because the
>> index is 33 GB in size. I deploy the 20 shards on 20 EC2 instances for
>> search response performance. But when I want to update a document, does
>> that mean I must traverse every index to find which shard holds the
>> document, and then update it? That's crazy! What can I do?
>>    thanks.
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/How-do-i-update-some-document-when-i-use-sharding-indexs-tp1053509p1053509.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>
