I _think_ I have this right... replicationFactor is the maximum number of extra replicas per shard. If you don't specify it, then as you bring up more and more nodes, the new nodes get assigned to shards on a round-robin basis. Specifying it allows you to have heterogeneous collections and not have _all_ of them replicated to _all_ nodes.
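To make that rule concrete, here is a rough Python sketch of the assignment model as I've described it. It is only a toy model of my understanding, not Solr code: the function name assign_nodes and its return shape are made up for illustration, and the cap of replicationFactor + 1 nodes per shard is the assumption stated above.

```python
from typing import List, Optional, Tuple


def assign_nodes(num_nodes: int, num_shards: int,
                 replication_factor: Optional[int] = None) -> Tuple[List[int], int]:
    """Toy model of the rule described above (hypothetical, not Solr's code).

    Returns (nodes_per_shard, spare_nodes).
    """
    # Assumption from the text: each shard holds at most
    # replicationFactor + 1 nodes; with no replicationFactor there is no cap.
    cap = None if replication_factor is None else replication_factor + 1

    nodes_per_shard = [0] * num_shards
    spare = 0
    for node in range(num_nodes):
        shard = node % num_shards  # round-robin over the shards
        if cap is not None and nodes_per_shard[shard] >= cap:
            spare += 1  # shard is full, so this node gets nothing assigned
        else:
            nodes_per_shard[shard] += 1
    return nodes_per_shard, spare


if __name__ == "__main__":
    print(assign_nodes(3, 2))      # no replicationFactor given
    print(assign_nodes(10, 2, 2))  # replicationFactor=2
```

Under this model, a third node added to a two-shard cluster with no replicationFactor lands on the first shard, which matches what you saw with node3.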
So imagine you have 10 nodes, 2 shards. Without specifying a replication factor, you would have 5 nodes/shard. Now suppose you create a new collection with 2 shards and a replicationFactor of 2. The new collection will have 3 nodes per shard (replicationFactor + 1), and 4 of your nodes will get nothing from the new collection assigned to them.

So in your case, you'll keep getting nodes assigned to your two shards (round robin) until you have sixteen machines running. The 17th machine won't get any shards from your collection assigned; it'll be "spare" until you do something explicit with it.

If nobody corrects me, I'll add some detail to the Wiki....

Best
Erick

On Thu, Oct 4, 2012 at 9:18 PM, Sudhakar Maddineni <maddineni...@gmail.com> wrote:
> Hi,
>
> Appreciate if someone could provide some pointers/docx to find info about
> replication factor.
>
> I see that the replication factor was mentioned in the wiki doc:
> http://wiki.apache.org/solr/SolrCloud - Managing collections via the
> Collections API -
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
> .
>
> But, couldn't find much documentation on how it is actually going to work
> in a sharded cluster setup.
>
> I have a cluster with 3 solr nodes and 2 shards [numShards=2] with the
> following setup and I didn't specify any replicationFactor during the setup.
>
> shard1 <--> solr node1, *node3*
>
> shard2 <--> solr node2
>
> So, when I added "*node3*" to the existing cluster, it was auto-assigned to
> "shard1".
>
> Does that mean "*node3*" acting as a replica of "node1"? And, "node2"
> didn't have any replica yet?
>
> If yes,what is the replication factor that i should provide in order to get
> the documents in node2 replicated to other nodes?
>
> What is the default replication factor if i don't specify any?
>
> Thanks,
> Sudhakar.