A "factor" is multiplied, so multiplying the leader by a replicationFactor of 1 means you have exactly one copy of that shard.
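To make that arithmetic concrete, here is a minimal sketch in Python. It assumes, as discussed in this thread, that replicationFactor counts the TOTAL number of cores per shard, leader included. The function names are hypothetical illustrations and not part of any Solr API.

```python
def total_cores(num_shards: int, replication_factor: int) -> int:
    """Total physical cores for a collection.

    Assumption (from this thread): replication_factor is the total
    number of cores per shard, with the leader counted as one of them.
    """
    if num_shards < 1 or replication_factor < 1:
        raise ValueError("num_shards and replication_factor must be >= 1")
    return num_shards * replication_factor


def extra_copies_per_shard(replication_factor: int) -> int:
    """Copies of a shard beyond the leader (hypothetical helper)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return replication_factor - 1


# replicationFactor=1: one core per shard, so no redundancy at all.
print(total_cores(num_shards=1, replication_factor=1))   # 1
print(extra_copies_per_shard(1))                         # 0

# Two shards with replicationFactor=3: six cores in total.
print(total_cores(num_shards=2, replication_factor=3))   # 6
```

The second helper shows the reading Per argues for below, where the count would start from zero extra copies.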
I think that recycling the term "replication" within Solr was confusing, but it is a bit late to change that.

wunder

On Jan 3, 2013, at 7:33 AM, Mark Miller wrote:

> This has pretty much become the standard across other distributed systems and
> in the literat…err…books.
>
> I first implemented it as you mention you'd like, but Yonik correctly pointed
> out that we were going against the grain.
>
> - Mark
>
> On Jan 3, 2013, at 10:01 AM, Per Steffensen <[email protected]> wrote:
>
>> For the same reasons that "Replica" shouldn't be called "Replica" (it
>> requires too long an explanation to agree that it is an ok name),
>> "replicationFactor" shouldn't be called "replicationFactor" as long as it
>> refers to the TOTAL number of cores you get for your "Shard".
>> "replicationFactor" would be an ok name if replicationFactor=0 meant one
>> core, replicationFactor=1 meant two cores etc., but as long as
>> replicationFactor=1 means one core, replicationFactor=2 means two cores, it
>> is bad naming (you will not get any replication with replicationFactor=1 -
>> WTF!?!?). If we want to insist that you specify the total number of cores at
>> least use "replicaPerShard" instead of "replicationFactor", or even better
>> rename "Replica" to "Shard-instance" and use "instancesPerShard" instead of
>> "replicationFactor".
>>
>> Regards, Per Steffensen
>>
>> On 1/3/13 3:52 PM, Per Steffensen wrote:
>>> Hi
>>>
>>> Here is my version - do not believe the explanations have been very clear
>>>
>>> We have the following concepts (here I will try to explain what each
>>> concept covers without naming it - it's hard)
>>> 1) Machines (virtual or physical) running Solr server JVMs (one machine can
>>> run several Solr server JVMs if you like)
>>> 2) Solr server JVMs
>>> 3) Logical "stores" where you can add/update/delete data-instances (closest
>>> to "logical" tables in RDBMS)
>>> 4) Logical "slices" of a store (closest to non-overlapping "logical" sets
>>> of rows for the "logical" table in a RDBMS)
>>> 5) Physical instances of "slices" (a physical (disk/memory) instance of a
>>> "logical" slice). This is where data actually goes on disk - the logical
>>> "stores" and "slices" above are just non-physical concepts
>>>
>>> Terminology
>>> 1) Believe we have no name for this (except of course machine :-) ), even
>>> though Jack claims that this is called a "node". Maybe sometimes it is
>>> called a "node", but I believe "node" is more often used to refer to a
>>> "Solr server JVM".
>>> 2) "Node"
>>> 3) "Collection"
>>> 4) "Shard". Used to be called "Slice" but I believe now it is officially
>>> called "Shard". I agree with that change, because I believe most of the
>>> industry also uses the term "Shard" for this logical/non-physical concept
>>> - it just needs to be reflected across documentation and code
>>> 5) "Replica". Used to be called "Shard" but I believe now it is officially
>>> called "Replica". I certainly do not agree with the name "Replica", because
>>> it suggests that it is a copy of an "original", but it isn't. I would prefer
>>> "Shard-instance" here, to avoid the confusion. I understand that you can
>>> argue (if you argue long enough) that "Replica" is a fine name, but you
>>> really need the explanation to understand why "Replica" can be defended as
>>> the name for this. It is not immediately obvious what this is as long as it
>>> is called "Replica". A "Replica" is basically a Solr Cloud managed Core and
>>> behind every Replica/Core lives a physical Lucene index. So a Replica
>>> (=Core) contains/maintains a Lucene index behind the scenes. The term
>>> "Replica" also needs to be reflected across documentation and code.
>>>
>>> Regards, Per Steffensen
>>
>
--
Walter Underwood
[email protected]
