This has pretty much become the standard across other distributed systems and in the literat…err…books.
I first implemented it as you mention you'd like, but Yonik correctly pointed out that we were going against the grain.

- Mark

On Jan 3, 2013, at 10:01 AM, Per Steffensen <st...@designware.dk> wrote:

> For the same reasons that "Replica" shouldn't be called "Replica" (it requires
> too long an explanation to agree that it is an ok name), "replicationFactor"
> shouldn't be called "replicationFactor" as long as it refers to the TOTAL
> number of cores you get for your "Shard". "replicationFactor" would be an ok
> name if replicationFactor=0 meant one core, replicationFactor=1 meant two
> cores etc., but as long as replicationFactor=1 means one core and
> replicationFactor=2 means two cores, it is bad naming (you will not get any
> replication with replicationFactor=1 - WTF!?!?). If we want to insist that
> you specify the total number of cores, at least use "replicaPerShard" instead
> of "replicationFactor", or even better rename "Replica" to "Shard-instance"
> and use "instancesPerShard" instead of "replicationFactor".
>
> Regards, Per Steffensen
>
> On 1/3/13 3:52 PM, Per Steffensen wrote:
>> Hi
>>
>> Here is my version - I do not believe the explanations have been very clear
>>
>> We have the following concepts (here I will try to explain what each
>> concept covers without naming it - it's hard)
>> 1) Machines (virtual or physical) running Solr server JVMs (one machine can
>> run several Solr server JVMs if you like)
>> 2) Solr server JVMs
>> 3) Logical "stores" where you can add/update/delete data-instances (closest
>> to "logical" tables in an RDBMS)
>> 4) Logical "slices" of a store (closest to non-overlapping "logical" sets of
>> rows for the "logical" table in an RDBMS)
>> 5) Physical instances of "slices" (a physical (disk/memory) instance of
>> a "logical" slice). This is where data actually goes on disk - the logical
>> "stores" and "slices" above are just non-physical concepts
>>
>> Terminology
>> 1) I believe we have no name for this (except of course machine :-) ), even
>> though Jack claims that this is called a "node". Maybe sometimes it is
>> called a "node", but I believe "node" is more often used to refer to a "Solr
>> server JVM".
>> 2) "Node"
>> 3) "Collection"
>> 4) "Shard". Used to be called "Slice" but I believe now it is officially
>> called "Shard". I agree with that change, because I believe most of the
>> industry also uses the term "Shard" for this logical/non-physical concept -
>> it just needs to be reflected across documentation and code
>> 5) "Replica". Used to be called "Shard" but I believe now it is officially
>> called "Replica". I certainly do not agree with the name "Replica", because
>> it suggests that it is a copy of an "original", but it isn't. I would prefer
>> "Shard-instance" here, to avoid the confusion. I understand that you can
>> argue (if you argue long enough) that "Replica" is a fine name, but you
>> really need the explanation to understand why "Replica" can be defended as
>> the name for this. It is not immediately obvious what this is as long as it
>> is called "Replica". A "Replica" is basically a Solr Cloud managed Core and
>> behind every Replica/Core lives a physical Lucene index. So a Replica (=Core)
>> contains/maintains a Lucene index behind the scenes. The term "Replica" also
>> needs to be reflected across documentation and code.
>>
>> Regards, Per Steffensen
>
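
To make the two readings of "replicationFactor" discussed in this thread concrete, here is a minimal sketch in Python. It is not Solr code: the class names, the `cores_per_shard` and `build_collection` helpers, and the `counts_total` flag are all hypothetical, invented only for illustration. It models a collection as logical shards, each holding physical replicas (cores), and shows how many cores per shard each interpretation yields.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Physical instance of a shard (hypothetical model of a managed core)."""
    name: str


@dataclass
class Shard:
    """Logical slice of a collection; the data lives in its replicas."""
    name: str
    replicas: list = field(default_factory=list)


def cores_per_shard(replication_factor: int, counts_total: bool = True) -> int:
    """Return the number of physical cores per shard.

    counts_total=True:  the factor is the TOTAL number of cores (1 -> 1 core).
    counts_total=False: the factor counts extra copies only (0 -> 1 core).
    """
    if replication_factor < 0 or (counts_total and replication_factor < 1):
        raise ValueError("replication factor too small for this interpretation")
    return replication_factor if counts_total else replication_factor + 1


def build_collection(num_shards: int, replication_factor: int,
                     counts_total: bool = True) -> list:
    """Build the shard/replica layout (names here are illustrative only)."""
    n = cores_per_shard(replication_factor, counts_total)
    return [
        Shard(f"shard{s + 1}",
              [Replica(f"shard{s + 1}_replica{r + 1}") for r in range(n)])
        for s in range(num_shards)
    ]


if __name__ == "__main__":
    for shard in build_collection(num_shards=2, replication_factor=2):
        print(shard.name, [r.name for r in shard.replicas])
```

Under the "total" reading, a factor of 1 gives a single core per shard and therefore no redundancy, which is the naming objection raised above. Under the "extra copies" reading, the same redundancy-free layout would be written as a factor of 0.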