I'm not sure whether Solr has built-in support for sharding functions, but
in general you should use a hash function to split the indexes, and then use
the same hash function to locate the shard that contains a given document.
http://en.wikipedia.org/wiki/Hash_function

Without employing any domain knowledge (of documents you may want to group
together on a single shard for performance), you could build a very simple
(crude) hash function: MD5-hash the unique key of each document, take the
first 3 hex characters of the digest (that should be precise enough to keep
the load pretty much balanced), compute a number from their values
(256 * 1st char + 16 * 2nd char + 3rd char), and take that number modulo 20.
That gives you a number in [0,20), which is the shard index.
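
A minimal sketch of the crude hash function described above, in Python. The
names shard_index and NUM_SHARDS are made up for illustration, and it assumes
the unique key is a string that can be encoded as UTF-8:

```python
import hashlib

NUM_SHARDS = 20  # shard count from the original question


def shard_index(unique_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document's unique key to a shard index in [0, num_shards)."""
    # MD5 is used only to spread keys evenly, not for security.
    digest = hashlib.md5(unique_key.encode("utf-8")).hexdigest()
    # Read the first three hex characters as digits with values 0..15.
    c1, c2, c3 = (int(ch, 16) for ch in digest[:3])
    # 256 * 1st + 16 * 2nd + 3rd, i.e. a number in 0..4095.
    number = 256 * c1 + 16 * c2 + c3
    return number % num_shards
```

Three hex characters give 4096 possible values, which does not divide evenly
by 20, so shards 0 through 15 receive slightly more keys than shards 16
through 19. For a crude scheme that imbalance is small.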

Use the same algorithm to determine which shard contains the document that
you want to change.
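
As a rough illustration of that lookup, the sketch below picks the update URL
for a document from its unique key. The host names in SHARD_URLS and the
helper update_url_for are hypothetical, and the "/update" path is assumed to
be the update handler of a stock Solr setup; adjust both to your deployment:

```python
import hashlib

# Hypothetical base URLs, one per shard, listed in shard-index order.
SHARD_URLS = [f"http://shard{i:02d}.example.com:8983/solr" for i in range(20)]


def shard_index(unique_key: str, num_shards: int) -> int:
    """First three hex characters of the MD5 digest, modulo the shard count."""
    prefix = hashlib.md5(unique_key.encode("utf-8")).hexdigest()[:3]
    return int(prefix, 16) % num_shards


def update_url_for(unique_key: str, shard_urls=SHARD_URLS) -> str:
    """Return the update URL of the single shard that should hold this key."""
    base = shard_urls[shard_index(unique_key, len(shard_urls))]
    return base + "/update"
```

This only finds the right shard if every document was placed with the same
function when it was first indexed. An index that was split some other way
would have to be redistributed once before the lookup can be trusted.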

Geert-Jan


2010/8/9 lu.rongbin <lu.rong...@goodhope.net>

>
>    My index has 76 million documents, I split it to 20 indexs because the
> size of index is 33G. I deploy 20 shards for search response performence on
> ec2's 20 instances.But when i wan't to update some doc, it means i must
> traversal each index , and find the document is in which shard index, and
> update the doc? It's crazy! How can i do?
>    thanks.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-do-i-update-some-document-when-i-use-sharding-indexs-tp1053509p1053509.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
