> On 5 Sep 2022, at 22:02, Henrik Ingo <henrik.i...@datastax.com> wrote:
> 
> Mostly I just wanted to ack that at least someone read the doc (somewhat 
> superficially sure, but some parts with thought...)
> 

Thanks, it's a lot to digest, so we appreciate that people are working through 
it.

> One pre-feature that we would include in the preceding minor release is a 
> node level switch to disable all operations that modify cluster metadata 
> state. This would include schema changes as well as topology-altering events 
> like move, decommission or (gossip-based) bootstrap and would be activated on 
> all nodes for the duration of the major upgrade. If this switch were 
> accessible via internode messaging, activating it for an upgrade could be 
> automated. When an upgraded node starts up, it could send a request to 
> disable metadata changes to any peer still running the old version. This 
> would cost a few redundant messages, but simplify things operationally.
> Although this approach would necessitate an additional minor version upgrade, 
> this is not without precedent and we believe that the benefits outweigh the 
> costs of additional operational overhead.
> 
> Sounds like a great idea, and probably necessary in practice?
>  

Although I think we _could_ manage without this, it would certainly simplify 
this and future upgrades.

> If this part of the proposal is accepted, we could also include further 
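
To make the switch a little more concrete, here is a rough sketch of the behaviour described in the doc. It is written in Python purely for brevity, and every name in it (`Node`, `handle_disable_request`, `on_startup`, the operation names and the version tuples) is made up for illustration; none of it mirrors actual Cassandra classes or the eventual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical operation names: anything that modifies cluster metadata state.
METADATA_OPS = {"schema_change", "move", "decommission", "bootstrap"}


class MetadataChangesDisabled(Exception):
    """Raised when a metadata-modifying operation is attempted while disabled."""


@dataclass
class Node:
    name: str
    version: tuple                      # e.g. (4, 1); placeholder version numbers
    metadata_changes_enabled: bool = True
    peers: list = field(default_factory=list)

    def handle_disable_request(self):
        # Idempotent, so redundant requests from several upgraded nodes are harmless.
        self.metadata_changes_enabled = False

    def execute(self, op):
        # Only metadata-modifying operations are rejected; everything else proceeds.
        if op in METADATA_OPS and not self.metadata_changes_enabled:
            raise MetadataChangesDisabled(f"{op} rejected on {self.name}")
        return f"{self.name}: {op} ok"

    def on_startup(self):
        # An upgraded node asks every peer on an older version to flip the switch.
        sent = 0
        for peer in self.peers:
            if peer.version < self.version:
                peer.handle_disable_request()
                sent += 1
        return sent
```

In this sketch each upgraded node sends the request to every older peer it knows about, which is where the "few redundant messages" come from; because the handler is idempotent, the repeats cost nothing beyond the messages themselves.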
> messaging protocol changes in the minor release, as these would largely 
> constitute additional verbs which would be implemented with no-op verb 
> handlers initially. This would simplify the major version code, as it would 
> not need to gate the sending of asynchronous replication messages on the 
> receiver's release version. During the migration, it may be useful to have a 
> way to directly inject gossip messages into the cluster, in case the states 
> of the yet-to-be upgraded nodes become inconsistent. This isn't intended, so 
> such a tool may never be required, but we have seen that gossip propagation 
> can be difficult to reason about at times.
> 
> Others will know the code better and I understand that adding new no-op verbs 
> can be considered safe... But instinctively a bit hesitant on this one. 
> Surely adding a few if statements to the upgraded version isn't that big of a 
> deal?
> 
> Also, it should make sense to minimize the dependencies from the previous 
> major version (without CEP-21) to the new major version (with CEP-21). If a 
> bug is found, it's much easier to fix code in the new major version than the 
> old and supposedly stable one.
> 

Yep, agreed. Adding verb handlers in advance may not buy us very much, so may 
not be worth the risk of additionally perturbing the stable system. I would say 
that having a means to directly manipulate gossip state during the upgrade 
would be a useful safety net in case something unforeseen occurs and we need to 
dig ourselves out of a hole. The precise scope of the feature & required 
changes are not something we've given extensive thought to yet, so we'd want to 
assess that carefully before proceeding.
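
For reference, the "few if statements" alternative amounts to gating the new verbs on the receiver's release version in the upgraded code only. A minimal sketch, again in Python for brevity and with hypothetical names throughout (the verb name `REPLICATE_METADATA`, the `(5, 0)` threshold and both functions are placeholders, not anything from the codebase):

```python
# Assumed placeholder: the first release version that understands the new verbs.
CEP21_MIN_VERSION = (5, 0)

# Hypothetical name for a verb introduced in the new major version.
NEW_VERBS = frozenset({"REPLICATE_METADATA"})


def should_send(verb, receiver_version):
    """Return True if `verb` may be sent to a peer on `receiver_version`."""
    if verb in NEW_VERBS:
        # New verbs go only to peers that have already been upgraded.
        return receiver_version >= CEP21_MIN_VERSION
    # Pre-existing verbs are unaffected by the gate.
    return True


def recipients(verb, peers):
    """Filter a {name: version} mapping down to peers that can receive `verb`."""
    return [name for name, version in peers.items() if should_send(verb, version)]
```

The gate lives entirely in the new major version, so the previous, stable release needs no changes for it.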

> henrik
> 
> -- 
> Henrik Ingo
