[ 
https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230270#comment-17230270
 ] 

Erick Erickson commented on SOLR-14986:
---------------------------------------

It's a sticky wicket.

Short form: I don't see any way to make the code "do the right thing" or to 
document under what conditions specifying various options will succeed. So I'm 
thinking of just changing the ref guide for the collections API CREATE and 
ADDREPLICA commands to warn that using *property.name* is an expert option that 
should only be used with a thorough understanding of Solr.

IOW "Don't call us if you try to set these properties and it doesn't work".

There are tests that create a collection with *property.dataDir=someDir*, for 
example, which works fine in the test since it's creating a single 1x1 
collection.

However, I can specify an absolute path, allow Solr to use it by setting 
*-Dsolr.allowPaths=/tmp/eoe*, and then try to create a collection with these 
parameters:

*numShards=2&property.dataDir=/tmp/somedir*

which times out, and you have to go to the log to find out why, and even then 
it's rather opaque:

Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this 
virtual machine: /private/tmp/eoe/index/write.lock

Unsurprising since both replicas are pointing to the same dataDir. NOTE: if I 
use a relative path, things are fine. E.g.

*numShards=2&property.dataDir=eoe*

In that case, each replica has a dataDir underneath it called "eoe" but they're 
under separate nodes.
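
To illustrate the difference, here is a minimal Python sketch of the path 
resolution described above. It is not Solr code: the function name 
resolve_data_dir and the instance directories under /var/solr/data are 
hypothetical, and it assumes a relative dataDir is resolved under the replica's 
own directory while an absolute dataDir is used as-is.

```python
from pathlib import PurePosixPath


def resolve_data_dir(instance_dir: str, data_dir: str) -> PurePosixPath:
    """Resolve a dataDir value against a replica's instance directory.

    Assumption for illustration: a relative dataDir lands under the
    replica's own directory; an absolute dataDir is used unchanged.
    """
    # Joining with an absolute path discards the left-hand side, so an
    # absolute data_dir ignores instance_dir entirely.
    return PurePosixPath(instance_dir) / data_dir


# Hypothetical instance directories for two replicas in one Solr instance.
replica_dirs = [
    "/var/solr/data/eoe_shard1_replica_n1",
    "/var/solr/data/eoe_shard2_replica_n2",
]

absolute = {resolve_data_dir(d, "/tmp/somedir") for d in replica_dirs}
relative = {resolve_data_dir(d, "eoe") for d in replica_dirs}

print(len(absolute))  # 1: both replicas resolve to the same directory
print(len(relative))  # 2: each replica gets its own "eoe" directory
```

With the absolute path, both replicas end up at a single directory and so 
contend for a single write.lock, which matches the LockObtainFailedException 
above.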

[~romseygeek] Pinging you since you wrote 2 of the 3 tests that use this, 
what's your opinion? BTW, I'm very glad these tests are here because I could 
have introduced a horrible problem if people are relying on this behavior:

CollectionsAPISolrJTest.testCreateCollectionWithPropertyParam and 
CoreAdminCreateDiscoverTest.testInstanceDirAsPropertyParam

[~shalin] wrote the other one a loooong time ago, so do you have an opinion 
too?

CollectionsAPIDistributedZkTest.addReplicaTest

 

So what I'm thinking now is that catching all the possibilities in the code is 
nearly impossible to get right, and it doesn't feel like it's a good use of 
time. Explaining when you can use even one of these "special" properties in the 
ref guide makes my head explode. It starts to look like this:

"When you create a collection, if you specify a *property.dataDir* that is an 
absolute path, the operation will fail if Solr tries to create two replicas in 
the same Solr instance (which may or may not happen, depending on whether you 
have more replicas than Solr instances, or possibly because of any custom core 
placement rules). In that case the collection creation will time out and the 
solr log will contain the reason. BTW, if Solr happens to create only one 
replica per instance, the first time you use this property, the call will 
succeed. But when you try to CREATE another collection or ADDREPLICA and use 
the same dataDir, it will fail if another replica already exists on that Solr 
instance using that dataDir. If you use relative paths, Solr will create a 
dataDir under the replica's directory." Yuuuuuuuccccccckkkkkkk!
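
To make the cases in that draft wording concrete, here is a rough Python sketch 
of the simplest possible check. It is an illustration only, not anything Solr 
does today: find_data_dir_conflicts, the node names, and the instance 
directories are all hypothetical, and it assumes the same resolution rule 
(relative paths land under the replica's directory, absolute paths are used 
as-is).

```python
from collections import Counter
from pathlib import PurePosixPath


def find_data_dir_conflicts(placements, data_dir):
    """Return (node, path) pairs that more than one replica would share.

    placements: list of (node_name, replica_instance_dir) tuples, which
    may include replicas that already exist on a node.
    """
    resolved = Counter(
        (node, PurePosixPath(instance_dir) / data_dir)
        for node, instance_dir in placements
    )
    return sorted(
        (node, str(path)) for (node, path), count in resolved.items() if count > 1
    )


# Two replicas placed in the same Solr instance (hypothetical names).
one_node = [
    ("node1", "/var/solr/data/eoe_shard1_replica_n1"),
    ("node1", "/var/solr/data/eoe_shard2_replica_n2"),
]
# The same two replicas, one per Solr instance.
two_nodes = [
    ("node1", "/var/solr/data/eoe_shard1_replica_n1"),
    ("node2", "/var/solr/data/eoe_shard2_replica_n2"),
]

print(find_data_dir_conflicts(one_node, "/tmp/somedir"))   # [('node1', '/tmp/somedir')]
print(find_data_dir_conflicts(two_nodes, "/tmp/somedir"))  # []
print(find_data_dir_conflicts(one_node, "eoe"))            # []
```

Even this toy version only gives the right answer if the caller knows the final 
placement and every replica already on each node, and it covers a single 
property.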

And that's just one property. I don't even want to think about interactions 
between multiple properties...

So barring objections, I'll just change the ref guide.

> Restrict the properties possible to define with "property.name=value" when 
> creating a collection
> ------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14986
>                 URL: https://issues.apache.org/jira/browse/SOLR-14986
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>
> This came to light when I was looking at two user-list questions where people 
> try to manually define core.properties to define _replicas_ in SolrCloud. 
> There are two related issues:
> 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet" 
> which results in an opaque error about "could not create replica....." I 
> propose we return a better error here like "property.collection should not be 
> specified when creating a collection". What do people think about the rest of 
> the auto-created properties on collection creation? 
> coreNodeName
> collection.configName
> name
> numShards
> shard
> collection
> replicaType
> "name" seems to be OK to change, although I don't see anyplace anyone can 
> actually see it afterwards....
> 2> Change the ref guide to steer people away from attempting to manually 
> create a core.properties file to define cores/replicas in SolrCloud. There's 
> no warning on the "defining-core-properties.adoc" for instance. Additionally 
> there should be some kind of message on the collections API documentation 
> about not trying to set the properties in <1> on the CREATE command.
> <2> used to actually work (apparently) with legacyCloud...



