thelabdude opened a new pull request #2114:
URL: https://github.com/apache/lucene-solr/pull/2114


   # Description
   
   This is the backport to 8x from master, original PR was #2010 
   See JIRA for description of the issue: 
https://issues.apache.org/jira/browse/SOLR-12182
   
   # Solution
   
   This PR computes the `base_url` for a Replica using the stored `node_name` 
and a global `urlScheme` rather than storing the `base_url` in `state.json`. 
This avoids storing an incorrect URL scheme for replicas in persistent storage. 
The `base_url` is computed when read back from ZK and dropped when marshaling 
the Replica state to JSON. This also means we don't need a migration tool as 
stored state is "healed" on-the-fly when read back from ZK.
   
   The unfortunate aspect of this PR is we need to keep the URL scheme for the 
cluster in a global variable (so that it is available when reading from ZK). 
The global `urlScheme` still comes from the cluster property but is then stored 
in a global singleton, see: `org.apache.solr.common.cloud.UrlScheme`. 
Alternatively, we could just keep the `urlScheme` in a static in ZkStateReader, 
I felt the global singleton `UrlScheme.INSTANCE` made it clearer that this was 
a global thing but it also made more sense with my first implementation that 
tried to make rolling restart upgrades to TLS less chaotic. It's a trivial 
change to move all this over to ZkStateReader and remove UrlScheme.
   
   I initially tried setting a `ThreadLocal` that gives access to the 
`urlScheme` whenever we need to read these props from ZK. However, that ended 
up being problematic because we tend to read ZkNodeProps from ZK in many 
places. In reality, the `urlScheme` really is an immutable global variable that 
should be set once during initialization by reading from the cluster property 
stored in ZK. So I felt trying to treat this global as something that was 
highly dynamic made the code overly cumbersome. Put simply, we shouldn't 
support `urlScheme` changing in a live node after initialization, it's bad for 
business.
   
   I also tried to get rid of the `urlScheme` cluster property (re: 
https://issues.apache.org/jira/browse/SOLR-10202) but I'm not sure how 
SolrCloud client applications can resolve the correct `urlScheme` for the 
cluster without this property? On the server-side, sure we can just get the 
`urlScheme` from a Java System Property, but that won't be set for remote 
client applications that initialize via a connection to ZooKeeper. So I'm 
keeping the cluster property `urlScheme` for now.
   
   We also need to consider how to enable TLS on an existing cluster (with 
active collections) using a rolling restart process. The current 
`org.apache.solr.cloud.SSLMigrationTest` just stopped all test nodes at once 
and then brought them back with TLS enabled.
   
   Based on feedback, I've since removed the option to pull the active 
urlScheme from live nodes as we're not able to ensure zero-downtime when moving 
from `http` -> `https` for clusters with existing collections and live traffic. 
Put simply, the feature was a bit trappy in that it tried to reduce chaos when 
doing a rolling restart to enable TLS, but it made no guarantees. Thus, users 
just need to be sure to enable TLS before building production clusters!
   
   Lastly, I've tried to clean-up some of the places that access the baseUrl on 
replicas to be more consistent, so you'll see some of that in this PR as well.
   
   # Tests
   
   Many existing tests cover regression caused by these code changes. Added 
simple unit test for UrlScheme.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to