Oops... you're right, and before I started writing that response I had the
thought that these should be "shardDir", but even that is confused. I think
"replicaDir" or "collectionReplica" or "shardReplicaDir" or...
"collectionShardReplicaDir" - the latter is wordy, but is explicit. I'd
reserve "coreDir" for "old" Solr.
Maybe "collectionDir" is fine for single-node, single-shard, single-replica
Solr, and would throw an error if the number of shards or replicas were
greater than 1. Otherwise, "replicaDir" would be sufficient and brief.
I don't care so much exactly what the name is, so long as it accurately
conveys its meaning.
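To make the rule suggested above concrete, here is a rough sketch in Python. It is not Solr code: "collectionDir" and "replicaDir" are only the names proposed in this thread, and resolve_dir, num_shards, and replication_factor are hypothetical names made up for the illustration.

```python
def resolve_dir(options, num_shards=1, replication_factor=1):
    """Pick the directory for one replica from a dict of options.

    Hypothetical: "collectionDir" and "replicaDir" are names proposed in
    this thread, not options that Solr actually accepts.
    """
    # "replicaDir" is unambiguous, so it is accepted in every layout.
    if "replicaDir" in options:
        return options["replicaDir"]
    # "collectionDir" only makes sense when the collection has exactly one
    # shard and one replica; anything larger is rejected with an error.
    if "collectionDir" in options:
        if num_shards > 1 or replication_factor > 1:
            raise ValueError(
                "collectionDir is ambiguous with more than one shard or "
                "replica; use replicaDir instead"
            )
        return options["collectionDir"]
    raise KeyError("no directory option given")
```

With a single shard and a single replica, resolve_dir({"collectionDir": "collection1"}) returns "collection1"; passing num_shards=2 with the same options raises ValueError.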
Just to be clear, although the more modern Solr term "collection" came into
use when SolrCloud was introduced, it is not solely a SolrCloud term. Even a
"single core" Solr is using a "collection" (one that happens to be single-core,
single-shard, and single-replica). To wit, the stock Solr example, which
is not SolrCloud, is named "collection1".
-- Jack Krupansky
-----Original Message-----
From: Shawn Heisey
Sent: Monday, May 06, 2013 10:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Duplicated Documents Across shards
On 5/6/2013 7:44 AM, Jack Krupansky wrote:
I think if we had a more comprehensible term for a "collection
configuration directory", a lot of the confusion would go away. I mean,
what the heck is an "instance" anyway? How does "instanceDir" relate to
an "instance" of the Solr "server"? Sure, I know that it is the parent
directory of the collection configuration (conf directory) or a
"collection directory", but how would a mere mortal grok that? I mean,
"instance" sounds like it's at a higher level than the collection itself
- that's why people tend to think it's the same for all cores in a Solr
"instance".
We should reconsider the name of that term. My choice: collectionDir.
I think that might lead to just as much confusion as instanceDir,
because it's for a core, not a collection. A name like coreDir would
avoid that confusion.
If you actually are using collections, then you'll be using SolrCloud.
A SolrCloud installation with maxShardsPerNode>1 will have more than one
core for the same collection on each node, so collectionDir would be
very confusing.
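A small Python sketch of the situation described above, where one node hosts two cores of the same collection. The core naming pattern (collection_shardN_replicaM) and the helper core_names are assumptions made for illustration, not something taken from Solr's code or from this thread.

```python
def core_names(collection, hosted):
    """Build one core name per (shard, replica) pair hosted on a node.

    The collection_shardN_replicaM pattern is an assumed naming scheme,
    used here only to tell the cores apart.
    """
    return [
        f"{collection}_shard{shard}_replica{replica}"
        for shard, replica in hosted
    ]


# One node holding replicas of two different shards of "collection1",
# which is only possible when maxShardsPerNode is greater than 1.
node_cores = core_names("collection1", [(1, 1), (2, 1)])

for name in node_cores:
    print(name)
```

Both cores belong to "collection1", yet each needs a directory of its own, so a single per-collection directory name would not say which core it refers to.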
I was initially thinking a good name would be coreConfDir or confDir,
but that only makes sense in situations where dataDir is also present.
The Collections API creates cores without a dataDir parameter, and many
solr.xml files are created manually without dataDir.
Thanks,
Shawn