Sathyam,

It seems your interpretation is off. CloudSolrClient hashes the document id and determines which shard's hash range that hash falls into; that is how it works out which shard an incoming document belongs to. As you have 10 shards, each document belongs to exactly one of them. That shard is what gets calculated, and the document is eventually pushed to the leader of that shard.
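To make the hash-range idea concrete, here is a rough sketch of it, not Solr's actual code. The class and method names are made up for illustration, `String.hashCode()` is only a stand-in for the hash Solr really uses (MurmurHash3), and the even split of the 32-bit space is a simplification of the ranges you would see in state.json, so the shard numbers it prints will not match your cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, simplified ranges, stand-in hash.
public class ShardRoutingSketch {

    /** Inclusive hash range owned by one shard. */
    static final class Range {
        final String shard;
        final int min;
        final int max;

        Range(String shard, int min, int max) {
            this.shard = shard;
            this.min = min;
            this.max = max;
        }

        boolean includes(int hash) {
            return hash >= min && hash <= max;
        }
    }

    /** Split the signed 32-bit hash space into numShards contiguous ranges. */
    static List<Range> partition(int numShards) {
        List<Range> ranges = new ArrayList<>();
        long span = 1L << 32;              // size of the whole hash space
        long start = Integer.MIN_VALUE;
        for (int i = 0; i < numShards; i++) {
            long end = (i == numShards - 1)
                    ? Integer.MAX_VALUE
                    : Integer.MIN_VALUE + span * (i + 1) / numShards - 1;
            ranges.add(new Range("shard" + (i + 1), (int) start, (int) end));
            start = end + 1;
        }
        return ranges;
    }

    /** Stand-in hash (assumption): Solr uses MurmurHash3, not hashCode(). */
    static int hash(String id) {
        return id.hashCode();
    }

    /** Find the shard whose range contains the hash of the document id. */
    static String shardFor(String id, List<Range> ranges) {
        int h = hash(id);
        for (Range r : ranges) {
            if (r.includes(h)) {
                return r.shard;
            }
        }
        throw new IllegalStateException("ranges do not cover hash " + h);
    }

    public static void main(String[] args) {
        List<Range> ranges = partition(10);
        String[] ids = {"BQECDwZGTCEBHZZBBiIP", "BQEBBQZB2il3wGT/0/mB"};
        for (String id : ids) {
            System.out.println(id + " -> " + shardFor(id, ranges));
        }
    }
}
```

The point is only that the target shard is a pure function of the document id and the range table, so every document maps to exactly one shard and the same id always maps to the same shard.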
This post on document routing covers it in much more detail:
https://lucidworks.com/2013/06/13/solr-cloud-document-routing/

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Thu, Jun 1, 2017 at 11:52 AM, Sathyam <sathyam.dorasw...@gmail.com> wrote:

> HI,
>
> I am indexing documents to a 10 shard collection (testcollection, having no
> replicas) in solr6 cluster using CloudSolrClient. I saw that there is a lot
> of peer to peer document distribution going on when I looked at the solr
> logs.
>
> An example log statement is as follows:
> 2017-06-01 06:07:28.378 INFO (qtp1358444045-3673692) [c:testcollection
> s:shard8 r:core_node7 x:testcollection_shard8_replica1]
> o.a.s.u.p.LogUpdateProcessorFactory [testcollection_shard8_replica1]
> webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
> http://10.199.42.29:8983/solr/testcollection_shard7_
> replica1/&wt=javabin&version=2}{add=[BQECDwZGTCEBHZZBBiIP
> (1568981383488995328), BQEBBQZB2il3wGT/0/mB (1568981383490043904),
> BQEBBQZFnhOJRj+m9RJC (1568981383491092480), BQEGBgZIeBE1klHS4fxk
> (1568981383492141056), BQEBBQZFVTmRx2VuCgfV (1568981383493189632)]} 0 25
>
> When I went through the code of CloudSolrClient on grepcode I saw that the
> client itself finds out which server it needs to hit by using the message
> id hash and getting the shard range information from state.json.
> Then it is quite confusing to me why there is a distribution of data
> between peers as there is no replication and each shard is a leader.
>
> I would like to know why this is happening and how to avoid it or if the
> above log statement means something else and I am misinterpreting
> something.
>
> --
> Sathyam Doraswamy
>