[ https://issues.apache.org/jira/browse/SOLR-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088685#comment-17088685 ]
Gus Heck commented on SOLR-12182:
---------------------------------

The suggested scheme-less base URL implies additional string manipulation and at least one more garbage string object per shard per request. If we try to cache the manipulated string somewhere other than ZooKeeper, that's bad duplicated state. I think this should be solved with a combination of:
# a script to upgrade all URLs in ZooKeeper
# tolerant code that only runs if a cluster property is set (default: not set)

Something like
{code:java}/bin/solr migrate (http|https){code}

Thus the upgrade procedure is: take the cluster down or set the cluster property, run the upgrade, and then unset the cluster property or reboot it. The bit I don't know without a little research is how quickly the cluster property is likely to get re-read by the nodes. A quick scan of one existing system shows there are URLs in state.json and /leaders. I'm not sure off the top of my head whether anywhere else might contain URLs that need adjusting, but that should be researched.

As I type this I wonder about the wisdom of having the tolerant code at all (including not storing the scheme). It adds a rarely used code path to deal with legacy/incorrect schemes. That path probably needs to be tested by setting the cluster property (or not) randomly on all tests to ensure we catch issues in the cluster-property case, and it may be quite difficult to test comprehensively in the scheme-less case. As we accumulate more such randomized cases, the number of test runs needed to ensure a good build rises, especially if the randomized features might only fail when two specific random states are set. In any case, it's a code path very likely to be forgotten when implementing future features, and it still adds one (tiny) increment of processing. Switching to (or away from) HTTPS is a major change, and maybe it's just better to require downtime to keep the code simple.
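To make the rewrite step concrete, here is a minimal sketch of the string transformation such a migrate script might apply to each stored URL. It is an illustration only, not Solr code: the class {{UrlSchemeRewriter}} and its methods are hypothetical names, the {{base_url}} field name is taken from the issue description, and a real script would more likely parse the JSON and write it back to ZooKeeper with versioned updates rather than use a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Solr): rewrites the scheme of stored base URLs.
final class UrlSchemeRewriter {
    // Leading "http://" or "https://", matched case-insensitively.
    private static final Pattern SCHEME =
        Pattern.compile("^https?://", Pattern.CASE_INSENSITIVE);

    // A "base_url" JSON field followed by the scheme of its string value.
    private static final Pattern BASE_URL_FIELD =
        Pattern.compile("(\"base_url\"\\s*:\\s*\")https?://");

    private UrlSchemeRewriter() {}

    private static void checkScheme(String scheme) {
        if (!"http".equals(scheme) && !"https".equals(scheme)) {
            throw new IllegalArgumentException("scheme must be http or https: " + scheme);
        }
    }

    /** Returns baseUrl with its scheme replaced by targetScheme. */
    static String withScheme(String baseUrl, String targetScheme) {
        checkScheme(targetScheme);
        Matcher m = SCHEME.matcher(baseUrl);
        // A scheme-less value such as "host:8983/solr" just gets the scheme prepended.
        String rest = m.find() ? baseUrl.substring(m.end()) : baseUrl;
        return targetScheme + "://" + rest;
    }

    /** Rewrites the scheme of every "base_url" value in a state.json-like string. */
    static String rewriteBaseUrls(String stateJson, String targetScheme) {
        checkScheme(targetScheme);
        return BASE_URL_FIELD.matcher(stateJson).replaceAll("$1" + targetScheme + "://");
    }
}
```

For example, {{UrlSchemeRewriter.withScheme("http://host1:8983/solr", "https")}} yields {{https://host1:8983/solr}}. The same {{withScheme}} function could in principle back the tolerant read path, which is exactly why it would have to be applied at every place that reads a URL.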
Any tolerant code needs to be applied (as a utility function) in every area of the code that could possibly read the URL; if we miss something, terrible, subtle bugs might arise. Also, what if third-party code is checking the URLs in ZooKeeper for custom components/plugins? Scheme-less URLs could be a breaking change from that perspective.

> Can not switch urlScheme in 7x if there are any cores in the cluster
> --------------------------------------------------------------------
>
>                 Key: SOLR-12182
>                 URL: https://issues.apache.org/jira/browse/SOLR-12182
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 7.0, 7.1, 7.2
>            Reporter: Anshum Gupta
>            Priority: Major
>         Attachments: SOLR-12182.patch
>
>
> I was trying to enable TLS on a cluster that was already in use i.e. had
> existing collections and ended up with down cores, that wouldn't come up and
> the following core init errors in the logs:
> *org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> replica with coreNodeName core_node4 exists but with a different name or
> base_url.*
> What is happening here is that the core/replica is defined in the
> clusterstate with the urlScheme as part of it's base URL e.g.
> *"base_url":"http:hostname:port/solr"*.
> Switching the urlScheme in Solr breaks this convention as the host now uses
> HTTPS instead.
> Actually, I ran into this with an older version because I was running with
> *legacyCloud=false* and then realized that we switched that to the default
> behavior only in 7x i.e while most users did not hit this issue with older
> versions, unless they overrode the legacyCloud value explicitly, users
> running 7x are bound to run into this more often.
> Switching the value of legacyCloud to true, bouncing the cluster so that the
> clusterstate gets flushed, and then setting it back to false is a workaround
> but a bit risky one if you don't know if you have any old cores lying around.
> Ideally, I think we shouldn't prepend the urlScheme to the base_url value and
> use the urlScheme on the fly to construct it.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)