Broken downgrading can be fixed (I think) by modifying the SerializationHeader.toHeader() method at the point where it currently throws an UnknownColumnException. If, instead of throwing the exception, we create a dropped column for the unexpected column, then I think the code will work.
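
To make the idea concrete, here is a minimal sketch in simplified Java. None of these types are Cassandra's real classes: HeaderSketch, Column, and this toHeader signature are hypothetical stand-ins, and the column names and types are only illustrative. It assumes the sstable header records a type alongside each column name, so that an unknown column can be given a placeholder marked as dropped instead of causing an exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model. These are NOT Cassandra's real classes.
final class HeaderSketch {

    /** A column as resolved against the schema known to this node. */
    static final class Column {
        final String name;
        final String type;     // type as recorded in the sstable header (assumption)
        final boolean dropped; // true: read the value and discard it

        Column(String name, String type, boolean dropped) {
            this.name = name;
            this.type = type;
            this.dropped = dropped;
        }
    }

    /**
     * Resolves the columns recorded in an sstable header against the columns
     * this node knows about. A column that is not in the schema becomes a
     * synthetic dropped column instead of raising an exception.
     */
    static List<Column> toHeader(Map<String, String> headerColumns, Set<String> knownColumns) {
        List<Column> resolved = new ArrayList<>();
        for (Map.Entry<String, String> e : headerColumns.entrySet()) {
            boolean known = knownColumns.contains(e.getKey());
            // The behaviour described above would throw here when !known.
            resolved.add(new Column(e.getKey(), e.getValue(), !known));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Header written by the newer version; column order is preserved.
        Map<String, String> header = new LinkedHashMap<>();
        header.put("key", "text");
        header.put("broadcast_port", "int");
        header.put("listen_address", "inet");

        // Schema of the older version, which has no broadcast_port.
        Set<String> known = new HashSet<>(Arrays.asList("key", "listen_address"));

        for (Column c : toHeader(header, known)) {
            System.out.println(c.name + " : " + c.type + (c.dropped ? " (treated as dropped)" : ""));
        }
    }
}
```

Whether the real reader would then skip the unknown column's values correctly, rather than shifting the columns that follow it, is the part that would need testing.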
I realise this cannot be done for versions already in the wild, since it is a change to released code, but we could handle it going forward.

On Wed, Feb 22, 2023 at 11:21 PM Henrik Ingo <henrik.i...@datastax.com> wrote:

> ... ok apparently shift+enter sends messages now?
>
> I was just saying if at least the file format AND system/tables - anything
> written to disk - can be protected with a switch, then it allows for quick
> downgrade by shutting down the entire cluster and restarting with the
> downgraded binary. It's a start.
>
> To be able to do that live in a distributed system needs to consider much
> more: gossip, streaming, drivers, and ultimately all features, because we
> don't want an application developer to use a shiny new thing that a) may
> not be available on all nodes, or b) may disappear if the cluster has to be
> downgraded later.
>
> henrik
>
> On Thu, Feb 23, 2023 at 1:14 AM Henrik Ingo <henrik.i...@datastax.com>
> wrote:
>
>> Just this once I'm going to be really brief :-)
>>
>> Just wanted to share for reference how Mongodb implemented
>> downgradeability around their 4.4 version:
>> https://www.mongodb.com/docs/manual/release-notes/6.0-downgrade-sharded-cluster/
>>
>> Jeff you're right. Ultimately this is about more than file formats.
>> However, ideally if at least the
>>
>> On Mon, Feb 20, 2023 at 10:02 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> I'm not even convinced even 8110 addresses this - just writing sstables
>>> in old versions won't help if we ever add things like new types or new
>>> types of collections without other control abilities. Claude's other email
>>> in another thread a few hours ago talks about some of these surprises -
>>> "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was
>>> added to system/local. This means that 3.1 system can not read the table
>>> as it has no definition for it. I tried marking the column for deletion in
>>> the metadata and in the serialization header.
>>> The latter got past the
>>> column not found problem, but I suspect that it just means that data
>>> columns after broadcast_port shifted and so incorrectly read." - this is a
>>> harder problem to solve than just versioning sstables and network
>>> protocols.
>>>
>>> Stepping back a bit, we have downgrade ability listed as a goal, but
>>> it's not (as far as I can tell) universally enforced, nor is it clear at
>>> which point we will be able to concretely say "this release can be
>>> downgraded to X". Until we actually define and agree that this is a
>>> real goal with a concrete version where downgrade-ability becomes real, it
>>> feels like things are somewhat arbitrarily enforced, which is probably very
>>> frustrating for people trying to commit work/tickets.
>>>
>>> - Jeff
>>>
>>>
>>>
>>> On Mon, Feb 20, 2023 at 11:48 AM Dinesh Joshi <djo...@apache.org> wrote:
>>>
>>>> I’m a big fan of maintaining backward compatibility. Downgradability
>>>> implies that we could potentially roll back an upgrade at any time. While I
>>>> don’t think we need to retain the ability to downgrade in perpetuity it
>>>> would be a good objective to maintain strict backward compatibility and
>>>> therefore downgradability until a certain point. This would imply
>>>> versioning metadata and extending it in such a way that prior version(s)
>>>> could continue functioning. This can certainly be expensive to implement
>>>> and might bloat on-disk storage. However, we could always offer an option
>>>> for the operator to optimize the on-disk structures for the current version
>>>> then we can rewrite them in the latest version. This optimizes the storage
>>>> and opens up new functionality. This means new features that can work with
>>>> old on-disk structures will be available while others that strictly require
>>>> new versions of the data structures will be unavailable until the operator
>>>> migrates to the new version. This migration IMO should be irreversible.
>>>> Beyond this point the operator will lose the ability to downgrade which is
>>>> ok.
>>>>
>>>> Dinesh
>>>>
>>>> On Feb 20, 2023, at 10:40 AM, Jake Luciani <jak...@gmail.com> wrote:
>>>>
>>>>
>>>> There has been progress on
>>>>
>>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-8928
>>>>
>>>> Which is similar to what datastax does for DSE. Would this be an
>>>> acceptable solution?
>>>>
>>>> Jake
>>>>
>>>> On Mon, Feb 20, 2023 at 11:17 AM guo Maxwell <cclive1...@gmail.com>
>>>> wrote:
>>>>
>>>>> It seems “An alternative solution is to implement/complete
>>>>> CASSANDRA-8110 <https://issues.apache.org/jira/browse/CASSANDRA-8110>”
>>>>> can give us more options if it is finished😉
>>>>>
>>>>> On Mon, Feb 20, 2023 at 11:03 PM Branimir Lambov <blam...@apache.org> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> There has been a discussion lately about changes to the sstable
>>>>>> format in the context of being able to abort a cluster upgrade, and the
>>>>>> fact that changes to sstables can prevent downgraded nodes from reading any
>>>>>> data written during their temporary operation with the new version.
>>>>>>
>>>>>> Most of the discussion is in CASSANDRA-18134
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-18134>, and is
>>>>>> spreading into CASSANDRA-14277
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-14227> and
>>>>>> CASSANDRA-17698
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-17698>, none of
>>>>>> which is a good place to discuss the topic seriously.
>>>>>>
>>>>>> Downgradability is a worthy goal and is listed in the current
>>>>>> roadmap. I would like to open a discussion here on how it would be
>>>>>> achieved.
>>>>>>
>>>>>> My understanding of what has been suggested so far translates to:
>>>>>> - avoid changes to sstable formats;
>>>>>> - if there are changes, implement them in a way that is
>>>>>> backwards-compatible, e.g.
>>>>>> by duplicating data, so that a new version is
>>>>>> presented in a component or portion of a component that legacy nodes will
>>>>>> not try to read;
>>>>>> - if the latter is not feasible, make sure the changes are only
>>>>>> applied if a feature flag has been enabled.
>>>>>>
>>>>>> To me this approach introduces several risks:
>>>>>> - it bloats file and parsing complexity;
>>>>>> - it discourages improvement (e.g. CASSANDRA-17698 is no longer a LHF
>>>>>> ticket once this requirement is in place);
>>>>>> - it needs care to avoid risky solutions to address technical issues
>>>>>> with the format versioning (e.g. staying on n-versions for 5.0 and needing
>>>>>> a bump for a 4.1 bugfix might require porting over support for new
>>>>>> features);
>>>>>> - it requires separate and uncoordinated solutions to the problem and
>>>>>> switching mechanisms for each individual change.
>>>>>>
>>>>>> An alternative solution is to implement/complete CASSANDRA-8110
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8110>, which
>>>>>> provides a method of writing sstables for a target version. During
>>>>>> upgrades, a node could be set to produce sstables corresponding to the
>>>>>> older version, and there is a very straightforward way to implement
>>>>>> modifications to formats like the tickets above to conform to its
>>>>>> requirements.
>>>>>>
>>>>>> What do people think should be the way forward?
>>>>>>
>>>>>> Regards,
>>>>>> Branimir
>>>>>>
>>>>>>
>>>>>> --
>>>>> you are the apple of my eye !
>>>>>
>>>> --
>>>> http://twitter.com/tjake
>>>>
>>>>
>>
>> --
>>
>> Henrik Ingo
>>
>> c. +358 40 569 7354
>>
>> w. www.datastax.com
>>
>> <https://www.facebook.com/datastax> <https://twitter.com/datastax>
>> <https://www.linkedin.com/company/datastax/>
>> <https://github.com/datastax/>
>>
>>
>
> --
>
> Henrik Ingo
>
> c. +358 40 569 7354
>
> w. www.datastax.com
>
> <https://www.facebook.com/datastax> <https://twitter.com/datastax>
> <https://www.linkedin.com/company/datastax/>
> <https://github.com/datastax/>
>
>