[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230270#comment-17230270 ]
Erick Erickson commented on SOLR-14986:
---------------------------------------

It's a sticky wicket. Short form: I don't see any way to make the code "do the right thing" or to document under what conditions specifying various options will succeed. So I'm thinking of just changing the ref guide for the collections API CREATE and ADDREPLICA commands to warn that using *property.name* is an expert option that should only be used with a thorough understanding of Solr. IOW "Don't call us if you try to set these properties and it doesn't work".

There are tests that create a collection with *property.dataDir=someDir*, for example, which works fine in the test since it's creating a single 1x1 collection. However, I can specify an absolute path and allow Solr to use it by setting
*-Dsolr.allowPaths=/tmp/eoe*
and try to create a collection with these parameters
*numShards=2&property.dataDir=/tmp/somedir*
which times out, and you have to go to the log to find out why, and even then it's rather opaque:

Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /private/tmp/eoe/index/write.lock

Unsurprising, since both replicas are pointing to the same dataDir.

NOTE: if I use a relative path, things are fine. E.g.
*numShards=2&property.dataDir=eoe*
In that case, each replica has a dataDir underneath it called "eoe" but they're under separate nodes.

[~romseygeek] Pinging you since you wrote 2 of the 3 tests that use this, what's your opinion? BTW, I'm very glad these tests are here because I could have introduced a horrible problem if people are relying on this behavior:
CollectionsAPISolrJTest.testCreateCollectionWithPropertyParam
and
CoreAdminCreateDiscoverTest.testInstanceDirAsPropertyParam

[~shalin] wrote the other one, a loooong time ago, so do you have an opinion too?
CollectionsAPIDistributedZkTest.addReplicaTest

So what I'm thinking now is that catching all the possibilities in the code is nearly impossible to get right, and it doesn't feel like it's a good use of time. Explaining when you can use even one of these "special" properties in the ref guide makes my head explode. It starts to look like this:

"When you create a collection, if you specify a *property.dataDir* that is an absolute path, the operation will fail if Solr tries to create two replicas in the same Solr instance (which may or may not happen, depending on whether you have more replicas than Solr instances, or possibly because of any custom core placement rules). In that case the collection creation will time out and the solr log will contain the reason. BTW, if Solr happens to create only one replica per instance, the first time you use this property, the call will succeed. But when you try to CREATE another collection or ADDREPLICA and use the same dataDir, it will fail if another replica already exists on that Solr instance using that dataDir. If you use relative paths, Solr will create a dataDir under the replica's directory."

Yuuuuuuuccccccckkkkkkk! And that's just one property. I don't even want to think about interactions between multiple properties...

So barring objections, I'll just change the ref guide.

> Restrict the properties possible to define with "property.name=value" when
> creating a collection
> ------------------------------------------------------------------------------------------------
>
> Key: SOLR-14986
> URL: https://issues.apache.org/jira/browse/SOLR-14986
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
>
> This came to light when I was looking at two user-list questions where people
> try to manually define core.properties to define _replicas_ in SolrCloud.
> There are two related issues:
> 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet"
> which results in an opaque error about "could not create replica....." I
> propose we return a better error here like "property.collection should not be
> specified when creating a collection". What do people think about the rest of
> the auto-created properties on collection creation?
> coreNodeName
> collection.configName
> name
> numShards
> shard
> collection
> replicaType
> "name" seems to be OK to change, although I don't see anyplace anyone can
> actually see it afterwards....
> 2> Change the ref guide to steer people away from attempting to manually
> create a core.properties file to define cores/replicas in SolrCloud. There's
> no warning on the "defining-core-properties.adoc" for instance. Additionally
> there should be some kind of message on the collections API documentation
> about not trying to set the properties in <1> on the CREATE command.
> <2> used to actually work (apparently) with legacyCloud...
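
For readers following the dataDir discussion in the comment above, here is a minimal sketch of why an absolute *property.dataDir* collides while a relative one does not. It is only a model of the path arithmetic, not Solr's actual resolution code: the function names and the two core instance directories are hypothetical, and the only assumption taken from the comment is that a relative dataDir ends up under each replica's own directory while an absolute one is used as given.

```python
from pathlib import PurePosixPath


def resolve_data_dir(instance_dir: str, data_dir: str) -> PurePosixPath:
    """Resolve data_dir against a core's instance directory.

    Joining onto an absolute path discards the left-hand side, so an
    absolute data_dir resolves to the same location for every core.
    """
    return PurePosixPath(instance_dir) / data_dir


def colliding_data_dirs(instance_dirs, data_dir):
    """Return each resolved data dir that more than one core would claim."""
    claimed = {}
    for inst in instance_dirs:
        claimed.setdefault(resolve_data_dir(inst, data_dir), []).append(inst)
    return {str(path): owners for path, owners in claimed.items() if len(owners) > 1}


# Hypothetical instance dirs for two replicas hosted by the same Solr instance.
cores = [
    "/var/solr/data/eoe_shard1_replica_n1",
    "/var/solr/data/eoe_shard2_replica_n2",
]

# Absolute path: both replicas resolve to /tmp/eoe, so the second one
# would be unable to obtain write.lock.
print(colliding_data_dirs(cores, "/tmp/eoe"))

# Relative path: each replica gets its own "eoe" directory, no collision.
print(colliding_data_dirs(cores, "eoe"))
```

The first call reports both cores under "/tmp/eoe"; the second returns an empty dict. Whether two replicas actually land on the same instance still depends on replica counts and placement, which is the part the comment argues is impractical to document.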