I think that's correct, but only when creating a new collection. I don't know whether the replication factor is still taken into account after that (when you start more nodes that have a core with the collection name, or manually add nodes to the collection), or when some nodes go down.
Also, someone please correct me if I'm wrong on this; I know there have been lots of changes recently in this area.

Tomás

On Fri, Oct 5, 2012 at 9:18 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> I _think_ I have this right...
>
> ReplicationFactor is the maximum number of extra replicas per shard.
> If you don't specify this, then as you bring up more and more nodes,
> the new nodes get assigned on a round-robin basis to shards. This
> allows you to have heterogeneous collections and not have _all_ of
> them replicated to _all_ nodes.
>
> So imagine you have 10 nodes, 2 shards. Without specifying a
> replication factor, you would have 5 nodes/shard.
>
> Now suppose you create a new collection with 2 shards and a
> replicationFactor of 2. The new collection will have 3 nodes per shard
> (replicationFactor + 1) and no nodes from your new collection will be
> assigned to 4 of your nodes.
>
> So in your case, you'll keep getting nodes assigned to your two shards
> (round robin) until you have sixteen machines running. The 17th machine
> won't get any shards from your collection assigned, it'll be "spare"
> until you do something explicit with it.
>
> If nobody corrects me, I'll add some detail to the Wiki....
>
> Best
> Erick
>
> On Thu, Oct 4, 2012 at 9:18 PM, Sudhakar Maddineni
> <maddineni...@gmail.com> wrote:
> > Hi,
> >
> > Appreciate if someone could provide some pointers/docs to find info
> > about replication factor.
> >
> > I see that the replication factor was mentioned in the wiki doc:
> > http://wiki.apache.org/solr/SolrCloud - Managing collections via the
> > Collections API -
> > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
> >
> > But, couldn't find much documentation on how it is actually going to
> > work in a sharded cluster setup.
> >
> > I have a cluster with 3 solr nodes and 2 shards [numShards=2] with the
> > following setup and I didn't specify any replicationFactor during the
> > setup.
> >
> > shard1 <--> solr node1, *node3*
> >
> > shard2 <--> solr node2
> >
> > So, when I added "*node3*" to the existing cluster, it was
> > auto-assigned to "shard1".
> >
> > Does that mean "*node3*" is acting as a replica of "node1"? And "node2"
> > didn't have any replica yet?
> >
> > If yes, what is the replication factor that I should provide in order
> > to get the documents in node2 replicated to other nodes?
> >
> > What is the default replication factor if I don't specify any?
> >
> > Thanks, Sudhakar.
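
For what it's worth, here is a rough Python sketch of the round-robin assignment as Erick describes it above. It only illustrates the arithmetic discussed in this thread and is not Solr's actual assignment code; the function name assign_nodes and the node/shard naming are made up for the example, and the cap of replicationFactor + 1 nodes per shard is taken from Erick's (hedged) description.

```python
from typing import Dict, List, Optional, Tuple


def assign_nodes(
    num_nodes: int,
    num_shards: int,
    replication_factor: Optional[int] = None,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign nodes to shards round-robin, as described in the thread.

    Hypothetical helper for illustration only. Returns the shard -> nodes
    mapping and the list of "spare" nodes that got nothing assigned.
    """
    shards: Dict[str, List[str]] = {
        f"shard{i + 1}": [] for i in range(num_shards)
    }
    spare: List[str] = []

    # Assumption from the thread: each shard holds at most
    # replicationFactor + 1 nodes; with no factor there is no cap.
    cap = None if replication_factor is None else replication_factor + 1

    for n in range(num_nodes):
        node = f"node{n + 1}"
        shard = f"shard{n % num_shards + 1}"  # round-robin over shards
        if cap is not None and len(shards[shard]) >= cap:
            spare.append(node)  # shard is full, node stays unassigned
        else:
            shards[shard].append(node)

    return shards, spare


if __name__ == "__main__":
    # Sudhakar's setup: 3 nodes, 2 shards, no replicationFactor.
    print(assign_nodes(3, 2))
    # Erick's example: 10 nodes, 2 shards, replicationFactor=2.
    print(assign_nodes(10, 2, 2))
```

Under these assumptions, the 3-node case reproduces the layout Sudhakar reports (node3 lands on shard1, node2 is alone on shard2), and the 10-node case with replicationFactor=2 gives 3 nodes per shard with 4 nodes left unassigned.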