Thanks Chris...

I changed the test and assigned a unique number to each document as the prefix 
and the documents did index across the two shards. I then increased the data 
set to include documents from all 6 expected shard keys and I do see them being 
indexed across both shards. I was just lucky to have started testing with 3 
different prefixes that happened to index into the same shard. 

-Will

________________________________________
From: Chris Hostetter <hossman_luc...@fucit.org>
Sent: Monday, December 15, 2014 6:45 PM
To: solr-user@lucene.apache.org
Subject: Re: All documents indexed into the same shard despite different prefix 
in id field

: I have a SolrCloud cluster with two servers and I created a collection using 
two shards with this command:
        ...
: There were 230 documents in the set I indexed and there were 3 different 
prefixes (RM!, WW! and BH!) but all were routed into the same shard. Is there 
anything I can do to debug this further?

I'm not really a math expert but...

If you have N (2) shards, and a single prefix ("RM") there is a 100%
chance that that prefix will hash into 1 of those N=2 shards.

For a 2nd prefix ("WW") there is a 1/N (1/2) chance that it will hash into
the same shard as your first prefix ("RM").

Likewise, there is a 1/N (1/2) chance that any other prefix ("BH") will
hash into the same shard as your first prefix ("RM").

Which means there is a 25% (1/2 * 1/2 = 1/4) chance that 3 randomly
selected prefixes will all hash to the same shard.

(In general, if you have N shards, and P # of unique prefixes, then the
odds that they all wind up in the same shard is going to be:
"(1/N)**(P-1)")
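
As a rough sketch, that formula can be checked with a few lines of Python.
This assumes every prefix hashes uniformly and independently across the
shards, which is a simplification; the function name is made up for
illustration and is not part of Solr.

```python
from fractions import Fraction


def same_shard_probability(num_shards: int, num_prefixes: int) -> Fraction:
    """Chance that all prefixes land in one shard.

    Assumes uniform, independent hashing: the first prefix picks a shard,
    and each later prefix matches it with probability 1/num_shards.
    """
    if num_shards < 1 or num_prefixes < 1:
        raise ValueError("num_shards and num_prefixes must be at least 1")
    return Fraction(1, num_shards) ** (num_prefixes - 1)


# 2 shards, 3 prefixes (RM, WW, BH)
print(same_shard_probability(2, 3))  # 1/4
# 2 shards, 6 prefixes
print(same_shard_probability(2, 6))  # 1/32
```

With 2 shards, going from 3 prefixes to 6 drops the chance of everything
landing in one shard from 1/4 to 1/32, so a larger set of prefixes is a
more telling test.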

So I suspect you just got unlucky with the 3 prefixes you happened to try in
your small test.






-Hoss
http://www.lucidworks.com/
