On 2010-09-06 16:41, Yonik Seeley wrote:
On Mon, Sep 6, 2010 at 10:18 AM, MitchK <mitc...@web.de> wrote:
[...consistent hashing...]
But it doesn't solve the problem at all. Correct me if I am wrong, but: if
you add a new server, let's call it IP3-1, and IP3-1 is nearer to the
current resource X, then doc x will be indexed at IP3-1, even if IP2-1
holds the older version.
Am I right?

Right.  You still need code to handle migration.

Consistent hashing is a way for everyone to be able to agree on the
mapping, and for the mapping to change incrementally.  i.e. you add a
node and it only changes the docid->node mapping of a limited percent
of the mappings, rather than changing the mappings of potentially
everything, as a simple MOD would do.
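
To make the difference concrete, here is a rough sketch that measures how many docids change nodes when a fifth node is added, once with a simple MOD and once with a consistent-hash ring. Everything in it is an assumption made for illustration: the class and method names, the MD5-based hash, and the use of 100 virtual points per node. It is not SolrCloud code. With a uniform hash, MOD going from 4 to 5 nodes is expected to remap about four docids out of five, while the ring should remap roughly one in five, and only onto the new node.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; all names here are hypothetical, not SolrCloud code.
final class RemapDemo {

    // Stable 64-bit hash: the first 8 bytes of the MD5 digest of the key.
    static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xffL);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Simple MOD mapping: node = hash mod nodeCount.
    static int modNode(String docId, int nodeCount) {
        return (int) Math.floorMod(hash(docId), (long) nodeCount);
    }

    // Consistent-hash ring with `points` virtual points per node.
    static TreeMap<Long, Integer> buildRing(int nodeCount, int points) {
        TreeMap<Long, Integer> ring = new TreeMap<>();
        for (int n = 0; n < nodeCount; n++) {
            for (int p = 0; p < points; p++) {
                ring.put(hash("node-" + n + "#" + p), n);
            }
        }
        return ring;
    }

    // Owner is the first ring point at or after the doc's hash, wrapping around.
    static int ringNode(TreeMap<Long, Integer> ring, String docId) {
        Map.Entry<Long, Integer> e = ring.ceilingEntry(hash(docId));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Fraction of docs whose node changes under MOD when the node count changes.
    static double movedFractionMod(int docs, int before, int after) {
        int moved = 0;
        for (int i = 0; i < docs; i++) {
            String id = "doc-" + i;
            if (modNode(id, before) != modNode(id, after)) moved++;
        }
        return (double) moved / docs;
    }

    // The same measurement for the consistent-hash ring.
    static double movedFractionRing(int docs, int before, int after, int points) {
        TreeMap<Long, Integer> a = buildRing(before, points);
        TreeMap<Long, Integer> b = buildRing(after, points);
        int moved = 0;
        for (int i = 0; i < docs; i++) {
            String id = "doc-" + i;
            if (ringNode(a, id) != ringNode(b, id)) moved++;
        }
        return (double) moved / docs;
    }

    public static void main(String[] args) {
        System.out.println("MOD  4 -> 5 nodes, moved: " + movedFractionMod(10000, 4, 5));
        System.out.println("ring 4 -> 5 nodes, moved: " + movedFractionRing(10000, 4, 5, 100));
    }
}
```

Note that neither mapping moves any data by itself. As said above, code to migrate the documents whose mapping changed is still needed.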

Another strategy to avoid excessive reindexing is to keep splitting the largest shards, and then your mapping becomes a regular MOD plus a list of these additional splits. Really, there's an infinite number of ways you could implement this...
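
One possible reading of "a regular MOD plus a list of these additional splits" is sketched below. It is only an assumption about how this could look: the class name, the hash mixing step, and the rule that a per-split selector bit decides which half of a shard moves are all made up for illustration. The point it shows is that splitting one shard only affects documents that were already in that shard.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical and this is one possible
// reading of "MOD plus a list of splits", not an existing implementation.
final class SplitShardMap {
    private final int baseShards;                          // N in "hash mod N"
    private final List<int[]> splits = new ArrayList<>();  // {sourceShard, newShard}, in order
    private int shardCount;

    SplitShardMap(int baseShards) {
        if (baseShards <= 0) {
            throw new IllegalArgumentException("baseShards must be positive");
        }
        this.baseShards = baseShards;
        this.shardCount = baseShards;
    }

    // 32-bit finalizer from MurmurHash3, used to spread String.hashCode() bits.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Split an existing shard; roughly half of its docs will map to a new shard.
    int split(int sourceShard) {
        if (sourceShard < 0 || sourceShard >= shardCount) {
            throw new IllegalArgumentException("unknown shard " + sourceShard);
        }
        int newShard = shardCount++;
        splits.add(new int[] {sourceShard, newShard});
        return newShard;
    }

    // Start from the plain MOD mapping, then replay the recorded splits in order.
    int shardFor(String docId) {
        int h = docId.hashCode();
        int shard = Math.floorMod(mix(h), baseShards);
        for (int i = 0; i < splits.size(); i++) {
            int[] s = splits.get(i);
            // A per-split selector bit decides which half of the source shard moves.
            if (shard == s[0] && (mix(h + 0x9e3779b9 * (i + 1)) & 1) == 1) {
                shard = s[1];
            }
        }
        return shard;
    }
}
```

With this scheme, `new SplitShardMap(4)` followed by `split(2)` leaves shards 0, 1 and 3 untouched, so only the documents of the split shard would need to be reindexed or migrated.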


For SolrCloud, I don't think we'll end up using consistent hashing -
we don't need it (although some of the concepts may still be useful).

I imagine there could be situations where a simple MOD won't do ;) so I think it would be good to hide this strategy behind an interface/abstract class. It costs nothing, and gives you flexibility in how you implement this mapping.
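
As a minimal sketch of what hiding the strategy behind an interface could look like: the names `ShardMapper`, `ModShardMapper` and `Indexer` below are hypothetical and do not refer to existing Solr classes. The calling code depends only on the interface, so the MOD strategy can later be replaced without touching it.

```java
// Illustrative sketch only; interface and class names are hypothetical.
interface ShardMapper {
    // Returns the index of the shard responsible for the given document id.
    int shardFor(String docId);
}

// The simple strategy: hash of the id, modulo the number of shards.
final class ModShardMapper implements ShardMapper {
    private final int shardCount;

    ModShardMapper(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    @Override
    public int shardFor(String docId) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(docId.hashCode(), shardCount);
    }
}

// Calling code sees only the interface, never a concrete strategy.
final class Indexer {
    private final ShardMapper mapper;

    Indexer(ShardMapper mapper) {
        this.mapper = mapper;
    }

    String route(String docId) {
        return "shard-" + mapper.shardFor(docId);
    }
}
```

Swapping in consistent hashing or a split-based mapping would then mean passing a different `ShardMapper` to `new Indexer(...)`, for example `new Indexer(new ModShardMapper(3))` today and some other implementation later.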

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
