On Fri, Jan 4, 2013 at 2:26 AM, Per Steffensen <st...@designware.dk> wrote:
> Our biggest problem is that we really haven't decided once and for all and
> made sure to reflect the decision consistently across code and
> documentation. As long as we haven't, I believe it is still ok to change our
> minds.

IMO, I *think* it's settled: a "collection consists of one or more
shards, each of which consists of one or more replicas".

A *long* time ago (3 years actually), I tried to get "slice" used in
place of "shard", just because "shard" was already used ambiguously by
people for both physical and logical shards. It never caught on, and
as I recall no one could really agree on a set of terms that satisfied
everyone.

Attempting to replace "Replica" with something like "Shard Instance"
could actually end up being worse, since it's a mouthful and people
would tend to shorten it to "shard" when talking about it.

From a practical standpoint, I don't think people will be confused by
the current terminology once we document it well (we should probably
start with collection/shard/replica). It's mostly an issue when one
goes looking for inconsistencies or things that might not make sense.
And as has been pointed out, others use the exact same terminology:
http://www.datastax.com/docs/1.0/cluster_architecture/replication

In the *code* I have been migrating away from using "shard" for the
physical kind. I've also used "slice" as a synonym for logical shard
in the code, both because of this mixed history of "shard" and because
removing every remnant of "shard" in the physical sense all at once
would be impractical. Anyone who works on the code should not be
bothered by an extra synonym, and things will continue to be cleaned
up over time.

-Yonik
http://lucidworks.com