Re: Solr document routing using composite key

2018-04-07 Thread Nawab Zada Asad Iqbal
Thanks Shawn and Erick. This is what I also ended up finding: as the number of buckets increased, I noticed the issue. Zheng: I am using Solr 7. But this was only an experiment on the hash, i.e., what distribution I should expect from it (as the above gist shows). I didn't actually index into sol

Re: Solr document routing using composite key

2018-03-16 Thread Erick Erickson
What Shawn said. 117 shards and 116 docs tells you absolutely nothing useful. I've never seen the number of docs on various shards be off by more than 2-3% when enough docs are indexed to be statistically valid. Best, Erick On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey wrote: > On 3/6/2018 11:53

Re: Solr document routing using composite key

2018-03-16 Thread Shawn Heisey
On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote: I have 117 shards and I tried to use document ids from zero to 116. I find that the distribution is very uneven, e.g., the largest bucket receives a total of 5 documents, and around 38 shards will be empty. Is it expected? With such a small data se
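A rough way to see why 117 ids over 117 shards says little: the sketch below throws documents into shards with an idealized uniform random choice. It is not Solr's hash and not the code from the gist; `simulate` and `summarize` are hypothetical helper names. With as many shards as documents, about 43 of 117 shards (roughly n/e) are expected to stay empty under uniform hashing, so the roughly 38 empty shards reported are in the normal range. With many more documents the per-shard counts tighten up.

```python
import random
from collections import Counter


def simulate(num_docs, num_shards, seed=0):
    """Assign each document to a uniformly random shard (idealized hash, NOT Solr's)."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_shards) for _ in range(num_docs))
    return [counts.get(shard, 0) for shard in range(num_shards)]


def summarize(per_shard):
    """Report empty shards, the largest shard, and the worst relative deviation from the mean."""
    mean = sum(per_shard) / len(per_shard)
    return {
        "empty": per_shard.count(0),
        "largest": max(per_shard),
        "max_rel_dev": max(abs(c - mean) for c in per_shard) / mean,
    }


if __name__ == "__main__":
    # Same shape as the experiment in the thread: 117 docs, 117 shards.
    print("117 docs:    ", summarize(simulate(117, 117)))
    # A larger sample, where the distribution becomes statistically meaningful.
    print("200000 docs: ", summarize(simulate(200_000, 117)))
```

The exact numbers depend on the seed; the point is only the contrast between the two runs.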

Re: Solr document routing using composite key

2018-03-15 Thread Zheng Lin Edwin Yeo
Hi, What version of Solr are you running? How did you configure your shards in Solr? Regards, Edwin On 7 March 2018 at 02:53, Nawab Zada Asad Iqbal wrote: > Hi Solr community: > > > I have been thinking of using a composite key for my next project iteration and > tried it today to see how it distr

Solr document routing using composite key

2018-03-06 Thread Nawab Zada Asad Iqbal
Hi Solr community: I have been thinking of using a composite key for my next project iteration and tried it today to see how it distributes the documents. Here is a gist of my code: https://gist.github.com/niqbal/3e293e2bcb800d6912a250d914c9d478 I have 117 shards and I tried to use document ids fro

Re: Solr Document Routing

2017-06-01 Thread Erick Erickson
Can you check if those IDs are on shard8? You can do this by pointing the URL at the core and specifying &distrib=false... Best, Erick On Thu, Jun 1, 2017 at 1:42 AM, Amrit Sarkar wrote: > Sorry, The confluence link: > https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+
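For reference, a minimal sketch of building such a per-core query URL. `distrib=false` is the parameter named above; the base URL, the core name, and the helper `core_query_url` are assumptions for illustration only, so substitute the real core that hosts shard8.

```python
from urllib.parse import urlencode


def core_query_url(base_url, core, doc_id):
    """Build a select URL that queries one core only, without fanning out to other shards."""
    params = {
        "q": f'id:"{doc_id}"',  # look up a single document by its id
        "distrib": "false",     # restrict the query to the core being addressed
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and core name; check the Solr admin UI for the actual ones.
    print(core_query_url("http://localhost:8983/solr",
                         "testcollection_shard8_replica1",
                         "some-doc-id"))
```

If the document comes back from that core with `distrib=false`, it lives on that shard.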

Re: Solr Document Routing

2017-06-01 Thread Amrit Sarkar
Sorry, The confluence link: https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lucidworks LinkedIn: https://www.linkedin.com/in/sarkaramrit2 On Thu, Jun 1,

Re: Solr Document Routing

2017-06-01 Thread Amrit Sarkar
Sathyam, It seems your interpretation is wrong, as CloudSolrClient calculates which shard the incoming document belongs to (it hashes the document id and determines the range it belongs to). As you have 10 shards, the document will belong to one of them; that is what is being calculated and eventually pus
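The "hash the id, then find the range" idea can be sketched as follows. This is an illustration only: it uses `zlib.crc32` as a stand-in hash and a simple unsigned split of the 32-bit space, whereas Solr's router uses its own hash function (MurmurHash3, as far as I know) and its own range layout, so the shard numbers here will not match a real cluster. `shard_ranges` and `shard_for` are hypothetical names.

```python
import zlib


def shard_ranges(num_shards):
    """Split the unsigned 32-bit hash space into contiguous, near-equal ranges."""
    size = 2 ** 32
    bounds = [size * i // num_shards for i in range(num_shards + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_shards)]


def shard_for(doc_id, ranges):
    """Hash the document id and return the index of the range that contains the hash."""
    h = zlib.crc32(doc_id.encode("utf-8"))  # stand-in hash, NOT Solr's
    for index, (low, high) in enumerate(ranges):
        if low <= h <= high:
            return index
    raise ValueError("hash fell outside every range")  # unreachable if ranges cover 32 bits


if __name__ == "__main__":
    ranges = shard_ranges(10)  # 10 shards, as in the thread
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "-> shard index", shard_for(doc_id, ranges))
```

Every id lands in exactly one range, which is why each document belongs to exactly one of the 10 shards.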

Solr Document Routing

2017-06-01 Thread Sathyam
Hi, I am indexing documents to a 10-shard collection (testcollection, having no replicas) in a Solr 6 cluster using CloudSolrClient. I saw that there is a lot of peer-to-peer document distribution going on when I looked at the Solr logs. An example log statement is as follows: 2017-06-01 06:07:28.37