Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Aleksey Yeshchenko
I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
think three is enough and fits, as we already have some features that don’t fit 
into either of (preview,beta,ga) - released but broken, released but dangerous, 
deprecated, removed.

For new features going forward, alpha (preview) -> beta -> GA works well enough.

But we also need an approved non-euphemism for features like MVs (I suggest 
‘broken’) and possibly a softer version of it ('dangerous') for our existing 
features that work fine in some narrow well-defined circumstances but will blow 
up in your face if you don’t know exactly what you are doing.

These classifications are largely orthogonal.

Alpha(preview)->Beta->GA communicates readiness of a feature under development, 
with GA being the default final state for most features.

From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
Serious issues get uncovered (very) late sometimes. It is what it is.
And we do deprecate and remove functionality when it’s superseded.
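
A minimal sketch of the two orthogonal axes described above, assuming a feature carries a readiness stage (alpha/beta/GA) plus a separate standing (ok, dangerous, broken, deprecated, removed). The class and method names (`Maturity`, `Standing`, `FeatureStatus`, `promote`, `flag`) are hypothetical and are not part of Cassandra; no real feature is classified here.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Maturity(Enum):
    """Readiness of a feature under development."""
    ALPHA = 1  # also called "preview" in this thread
    BETA = 2
    GA = 3


class Standing(Enum):
    """Orthogonal axis: what has been learned about a feature after release."""
    OK = "ok"
    DANGEROUS = "dangerous"
    BROKEN = "broken"
    DEPRECATED = "deprecated"
    REMOVED = "removed"


@dataclass(frozen=True)
class FeatureStatus:
    """Hypothetical record combining both axes for one feature."""
    name: str
    maturity: Maturity = Maturity.ALPHA
    standing: Standing = Standing.OK

    def promote(self) -> "FeatureStatus":
        """Move one step along alpha -> beta -> GA; standing is untouched."""
        if self.maturity is Maturity.GA:
            raise ValueError(f"{self.name} is already GA")
        return replace(self, maturity=Maturity(self.maturity.value + 1))

    def flag(self, standing: Standing) -> "FeatureStatus":
        """Change the standing without rewriting the readiness history."""
        return replace(self, standing=standing)

    def recommended_for_production(self) -> bool:
        return self.maturity is Maturity.GA and self.standing is Standing.OK


# A feature reaches GA, then a serious issue is uncovered late.
feature = FeatureStatus("example_feature").promote().promote()
flagged = feature.flag(Standing.BROKEN)
```

Keeping the axes separate means a GA feature that turns out to be broken stays GA in its history but stops being recommended, rather than being relabelled as alpha.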


> -1 on unstable. It's way too many words than are needed. Three is a
> magic number and fits:
> 
> Preview
> Beta
> GA

> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
> 
> A structured, disciplined approach to graduating something from [Optional] -> 
> [Default] makes sense to me, similar to how we're talking about a structured 
> flow of [Preview] -> [Beta] -> [GA]. Having those clear stages gives us a 
> framework to define what requirements of stage transitions would be which'll 
> ideally lead to us producing higher quality, more predictable, more 
> consistent results for our end users.
> 
> For instance, requirements from [Optional] -> [Default] could be higher level 
> abstractions like:
> - Confidence in stability
> - Strong evidence to indicate superiority in majority of workloads (by count or 
>   importance or size, etc)
> These are all things we kind of do implicitly and ad-hoc on the mailing list, 
> and I'm not looking to tie us down to any granular structure or specificity. 
> More thinking it could be useful for someone that's worked on something who 
> wonders "Huh. How do I take this from being optional to the default?" and 
> having an answer better than "reinvent the wheel every time and fling 
> spaghetti at the dev list and pray".
> 
> :)
> 
> 
> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>> Thanks for bringing up this topic, Josh. 
>> 
>> Outside of the major features (i.e. MV/SAI/TCM/Accord), one related 
>> discussion in this topic is: how can we "promote" small improvements in 
>> existing features from optional to default?
>> 
>> It makes sense to have optimizations launched behind a feature flag 
>> initially (beta phase) while the improvement gets real world exposure, but I 
>> think we need a better way to promote these optimizations to default 
>> behavior on a regular cadence.
>> 
>> Take for example optimized repairs from CASSANDRA-16274. It was launched in 
>> 4.x as an optional feature gated behind a flag, ie. 
>> auto_optimise_full_repair_streams: false. 
>> 
>> I could easily be missing something, but is there a world where 
>> non-optimized repairs make sense once this optimization is proven to work? 
>> I agree this is fine while the feature is maturing, but at some point we 
>> need to rip off the band-aid and make the optimization the default (and 
>> clearly communicate that). This would allow us to clean up the code for 
>> default behavior that is no longer being used, because everyone is enabling 
>> the improvement during deployment.
>> 
>> This is just one example to demonstrate the issue and I don't want this 
>> discussion to focus on this particular case, but I can think of other 
>> improvements launched as optional that are never made default.
>> 
>> I don't know if this should continue to be addressed on an 
>> improvement-by-improvement basis or if we could have a more streamlined 
>> process to review and communicate these changes more consciously at every 
>> major release.
>> 
>> In the same way we open a loop when adding an optimized behavior behind a 
>> feature flag, I think we should have a process to close these loops by 
>> promoting these optimizations to default when it makes sense.
>> 
>> On Tue, Dec 10, 2024 at 2:10 PM Josh McKenzie wrote:
>> 
>> So some questions to test a world w/3 classifications (Preview, Beta, GA):
>> - What would we do with the current experimental features (MV's, JDK17, 
>> witnesses, etc)? Flag them as preview or beta as appropriate on a 
>> case-by-case basis and add runtime warnings / documentation where missing?
>> 
>> - What would we do in the future if a feature's GA and we discover a Very 
>> Big Problem with it that'll take some time to fix? Keep it GA but cut a 
>> hotfix release w/a bunch of warnings? Bounce it back to Preview? Leave it be 
>> and just feverishly try and fix it?
>> 
>>> for policy decisions like this (that don’t need to be agreed in advance) 

Re: [DISCUSS] Deprecation of IEndpointSnitch (CASSANDRA-19488)

2024-12-12 Thread Sam Tunnicliffe
This patch is probably now ready to merge, having been through several 
iterations of review and with green CI. Before that though, I just want to send 
one more reminder about it. We've endeavoured to preserve all existing 
behaviour and to keep configuration 100% backwards compatible. However, some 
areas have had minimal testing in real clusters, specifically the various cloud 
platform configurations: 

* Ec2Snitch/Ec2MultiRegionSnitch
* AzureSnitch
* AlibabaCloudSnitch
* GoogleCloudSnitch
* CloudstackSnitch

Any help in validating these in their native environments would be welcome.

The other consideration is toward custom snitch implementations. The intention 
is that these should continue to work without interruption or intervention, 
unless they're leaning heavily on C* internals in which case any changes 
required ought to be minimal. So it would be great if anyone using a custom 
snitch implementation is able to check it out and help verify that.


> On 31 Oct 2024, at 16:53, Sam Tunnicliffe  wrote:
> 
> Since CEP-21, the source of truth for topology info (a node's datacenter & 
> rack) is ClusterMetadata. Each node provides its dc/rack when it registers 
> itself with the cluster prior to joining and this information is effectively 
> immutable (for now). This significantly reduces the scope of 
> IEndpointSnitch's responsibilities and CASSANDRA-19488 proposes a refactoring 
> which breaks out the remaining functionality into a handful of new providers 
> (full details can be found in the JIRA). 
> 
> This is one of the more widely used extension points in Cassandra, so we 
> wanted to bring it to the mailing list in addition to discussing on JIRA. 
> 
> To be clear, no operator intervention should be necessary when upgrading. To 
> ease migration onto the new config and to allow us to deprecate snitches in a 
> controlled way, it will remain fully supported to configure nodes using the 
> endpoint_snitch setting in yaml. A SnitchAdapter acts as a facade in this 
> case, presenting the new interfaces to calling code while delegating to the 
> legacy snitch. Most of the in-tree snitches have been refactored to extract 
> implementations of the new interfaces so that their functionality can be used 
> via the new configuration.
> 
> Some questions for the list:
> 
> * We have added 2 new methods to IEndpointSnitch, which have essentially been 
> pulled up from Ec2MultiRegionSnitch and GossipingPropertyFileSnitch to 
> support ReconnectableSnitchHelper. Currently, these are added as default 
> methods on the interface so that out-of-tree snitches remain binary 
> compatible. However, it would be safer to break binary compatibility in this 
> case to ensure that any custom snitches out in the wild must be updated and 
> their behaviour is preserved. So the question is, would there be objections 
> to extending the (now deprecated) IEndpointSnitch interface in this way?
> 
> * Python dtests and config are currently unchanged (aside from some error 
> message checks) so these are exercising the path whereby the clusters are 
> configured with endpoint_snitch and make use of the compatibility adapter. 
> In-jvm upgrade dtests switch from old to new style configuration on upgrade 
> to 5.1 (though in truth, these don't exercise snitches much at all as a 
> special dtest snitch is used throughout). cassandra-latest.yaml contains the 
> new settings, while cassandra.yaml and the variations in test/conf retain the 
> old style settings. How should we approach updating these configs so that we 
> maintain a balance between test coverage, compatibility during upgrades and 
> encouraging the use of new style config in fresh clusters?
> 
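
The compatibility arrangement described in the quoted message (a legacy snitch configured via endpoint_snitch, wrapped by an adapter that presents the new interfaces) can be sketched as below. This is an illustration of the facade pattern only: `LocationProvider`, `location_of`, `resolve_provider`, `StaticSnitch`, `FixedLocation` and the `location_provider` config key are hypothetical names, not the interfaces or settings introduced by CASSANDRA-19488. Only `SnitchAdapter` and `endpoint_snitch` are taken from the message.

```python
from typing import Callable, Dict, NamedTuple, Protocol


class Location(NamedTuple):
    datacenter: str
    rack: str


class LegacySnitch(Protocol):
    """Old-style extension point (method names are assumptions)."""
    def get_datacenter(self, endpoint: str) -> str: ...
    def get_rack(self, endpoint: str) -> str: ...


class LocationProvider(Protocol):
    """Stand-in for one of the new, narrower interfaces (hypothetical)."""
    def location_of(self, endpoint: str) -> Location: ...


class SnitchAdapter:
    """Facade: exposes the new interface while delegating to a legacy snitch."""
    def __init__(self, legacy: LegacySnitch) -> None:
        self._legacy = legacy

    def location_of(self, endpoint: str) -> Location:
        return Location(self._legacy.get_datacenter(endpoint),
                        self._legacy.get_rack(endpoint))


class StaticSnitch:
    """Toy legacy snitch used only for this example."""
    def get_datacenter(self, endpoint: str) -> str:
        return "dc1"

    def get_rack(self, endpoint: str) -> str:
        return "rack1"


class FixedLocation:
    """Toy new-style provider used only for this example."""
    def location_of(self, endpoint: str) -> Location:
        return Location("dc2", "rack9")


def resolve_provider(config: Dict[str, str],
                     legacy: Dict[str, Callable[[], LegacySnitch]],
                     providers: Dict[str, Callable[[], LocationProvider]]
                     ) -> LocationProvider:
    """Old-style config goes through the adapter; new-style config does not."""
    if "endpoint_snitch" in config:
        return SnitchAdapter(legacy[config["endpoint_snitch"]]())
    return providers[config["location_provider"]]()


old_style = resolve_provider({"endpoint_snitch": "StaticSnitch"},
                             {"StaticSnitch": StaticSnitch}, {})
new_style = resolve_provider({"location_provider": "FixedLocation"},
                             {}, {"FixedLocation": FixedLocation})
```

Calling code only ever sees `location_of`, so it does not need to know which configuration style produced the provider.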



Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Paulo Motta
> If we don’t intend to begin fixing the feature within the next year or so
we should deprecate it entirely.

+1 - this is probably a topic for another thread, but aren’t MVs fundamentally
solved with Accord? In my ignorance this is “just” a matter of adding an
Accord backend to MV syntax to fix it reliably.

On Thu, 12 Dec 2024 at 09:58 Benedict  wrote:

> I think alpha is fine. It communicates fairly well that there’s no near
> term expectation they will be production capable.
>
> There is (I think) still an intention to improve them, but they are janky.
> If we don’t intend to begin fixing the feature within the next year or so
> we should deprecate it entirely.
>
>
> On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:
>
> But MVs are not alpha or preview, as they are not actively being worked
> on. They are currently broken. Calling them ‘alpha’ makes ‘alpha’
> overloaded and less useful.
>
>
>
> On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
>
> But we also need an approved non-euphemism for features like MVs (I
> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
> our existing features that work fine in some narrow well-defined
> circumstances but will blow up in your face if you don’t know exactly what
> you are doing.
>
> Feels like the real answer is:
>
> 1. Endeavor to never get ourselves into this state
> 2. Take immediate action if we discover we're there (fix feature if
>    possible, deprecate and remove if not). Not "leave to fester for years"
>
> I like the introduction of 'alpha' as an alias for 'Preview'; not sure why
> that wasn't what we immediately came up with collectively given how
> widespread its usage is. :)
>
> What would demoting MV's to 'alpha' right now look like? We'd warn on
> their usage w/some different structure and verbiage, and it'd be pretty
> implicitly clear to people they shouldn't use it in production right?
>
> It seems to me that the 3 categories would be sufficient even to handle
> our current scenario where we have some things in the system that are a Bad
> Idea to use in production.
>
> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>
> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t
> think three is enough and fits, as we already have some features that don’t
> fit into either of (preview,beta,ga) - released but broken, released but
> dangerous, deprecated, removed.
>
> For new features going forward, alpha (preview) -> beta -> GA works well
> enough.
>
> But we also need an approved non-euphemism for features like MVs (I
> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
> our existing features that work fine in some narrow well-defined
> circumstances but will blow up in your face if you don’t know exactly what
> you are doing.
>
> These classifications are largely orthogonal.
>
> Alpha(preview)->Beta->GA communicates readiness of a feature under
> development, with GA being the default final state for most features.
>
> From there a feature can transition into ‘broken’ or ‘dangerous’
> territory. Serious issues get uncovered (very) late sometimes. It is what
> it is.
> And we do deprecate and remove functionality when it’s superseded.
>
>
> -1 on unstable. It's way too many words than are needed. Three is a
> magic number and fits:
>
> Preview
> Beta
> GA
>
>
> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>
> A structured, disciplined approach to graduating something from [Optional]
> -> [Default] makes sense to me, similar to how we're talking about a
> structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages
> gives us a framework to define what requirements of stage transitions would
> be which'll ideally lead to us producing higher quality, more predictable,
> more consistent results for our end users.
>
> For instance, requirements from [Optional] -> [Default] could be higher
> level abstractions like:
>
> - Confidence in stability
> - Strong evidence to indicate superiority in majority of workloads (by
>   count or importance or size, etc)
>
> These are all things we kind of do implicitly and ad-hoc on the mailing
> list, and I'm not looking to tie us down to any granular structure or
> specificity. More thinking it could be useful for someone that's worked on
> something who wonders "Huh. How do I take this from being optional to the
> default?" and having an answer better than "reinvent the wheel every time
> and fling spaghetti at the dev list and pray".
>
> :)
>
>
> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>
> Thanks for bringing up this topic, Josh.
>
> Outside of the major features (ie. MV/SAI/TCM/Accord), one related
> discussion in this topic is: how can we "promote" small improvements in
> existing features from optional to default ?
>
> It makes sense to have optimizations launched behind a feature flag
> initially (beta phase) while the improvement gets real world exposure, but
> I think we need a better way to promote

Re: [DISCUSS] 5.1 should be 6.0

2024-12-12 Thread Sam Tunnicliffe
No, we initially tried to preserve all the previous paths and put the whole 
thing behind a feature flag, but it was just way too pervasive and doing so 
would've added years to the project. So for the period before the CMS is 
initialized, certain operations are not available. 

However, it should be entirely possible to downgrade and rollback to 5.0 after 
cutting over to TCM, as long as SSTables are still in the old format. By 
"should be" I mean it is absolutely possible and has been tested, but it 
requires the SCM to guard the on disk format, which has the unfortunate effect 
of limiting the messaging version and that in turn makes it impossible to 
actually cut over to TCM. i.e. the testing has been done with a patch which 
disables some things which rely on messaging VERSION_51. This is why I want to 
remove the coupling between SCM and messaging version.

Also, I misspoke slightly in my previous email because I forgot that we did 
manage to enable a decent subsection of TCM to work with VERSION_40/VERSION_50. 
In this scenario, you still get the linearized schema updates via the metadata 
log but replicas/coordinators don't exchange epochs during reads/writes so the 
consistency guarantees are weakened.
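
A rough model of the coupling described above, under stated assumptions: the storage compatibility mode (SCM) caps the messaging version, and the messaging version determines whether epochs are exchanged on reads/writes. `VERSION_40`, `VERSION_50`, `VERSION_51` and `CASSANDRA_4` are names from this thread; the numeric values, the cap table and both function names are hypothetical and do not reflect the actual implementation.

```python
from enum import IntEnum
from typing import Dict, Optional


class MessagingVersion(IntEnum):
    # Numeric values are placeholders chosen only to give an ordering.
    VERSION_40 = 40
    VERSION_50 = 50
    VERSION_51 = 51


# Assumed cap per storage compatibility mode; None means SCM is disabled.
SCM_MESSAGING_CAP: Dict[Optional[str], MessagingVersion] = {
    "CASSANDRA_4": MessagingVersion.VERSION_40,
    None: MessagingVersion.VERSION_51,
}


def effective_messaging_version(scm_mode: Optional[str],
                                coupled: bool = True) -> MessagingVersion:
    """With coupling, SCM limits messaging; without it, SCM only guards disk format."""
    if not coupled:
        return MessagingVersion.VERSION_51
    return SCM_MESSAGING_CAP[scm_mode]


def tcm_behaviour(version: MessagingVersion) -> Dict[str, bool]:
    """Schema updates are linearized via the metadata log on every listed
    version; epochs are only exchanged on reads/writes at VERSION_51."""
    return {
        "linearized_schema_updates": True,
        "epochs_exchanged_on_reads_writes": version >= MessagingVersion.VERSION_51,
    }


coupled = tcm_behaviour(effective_messaging_version("CASSANDRA_4"))
decoupled = tcm_behaviour(effective_messaging_version("CASSANDRA_4", coupled=False))
```

In this model, removing the coupling lets a node keep the old on-disk format (and so remain downgradable) while still using the newer messaging version.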

Thanks,
Sam
 

> On 12 Dec 2024, at 16:17, Jeremiah Jordan  wrote:
> 
> My expectation is that in trunk SCM CASSANDRA_4 would change to SCM 
> CASSANDRA_5.  I think we should be striving to support full 
> downgrade/rollback ability to the previous major version from trunk.
> With TCM I would expect that when running in CASSANDRA_5 mode that 
> initializing TCM would not be possible, as once initialized you could no 
> longer roll back.
> Do we have no way to support the gossip paths continuing to work prior to 
> initializing TCM?
> 
> -Jeremiah
> 
> On Dec 11, 2024 at 7:41:48 AM, Sam Tunnicliffe  wrote:
>> My point is that the upgrade to 5.1/6.0 isn't really complete until the CMS 
>> is initialised and this can't be done while running with SCM CASSANDRA_4 
>> because of the messaging service limitation. Until that point, schema 
>> changes & node replacements are not supported which affects how long a bake 
>> time is tolerable. 
>> This specific issue could probably be fixed by revisiting the SCM 
>> implementation in 5.1/6.0, so we should certainly do that but the fact 
>> remains that we don't have great test coverage to indicate how clusters 
>> behave when running in SCM for a prolonged period.  
>> 
>> Thanks, 
>> Sam
>> 
>>> On 11 Dec 2024, at 13:29, Brandon Williams  wrote:
>>> 
>>> On Wed, Dec 11, 2024 at 7:22 AM Sam Tunnicliffe  wrote:
>>> >
>>> > so running in any SCM mode for a prolonged period is not really viable.
>>> 
>>> This is what many users want to do though, upgrade one DC and let it
>>> bake to see how it goes before continuing.  I don't think that's
>>> unreasonable, but from working on CASSANDRA-20118 I know how difficult
>>> that is already.  I don't think we've built enough SCM muscle yet to
>>> think about handling multiple previous versions.
>>> 
>>> Kind Regards,
>>> Brandon
>> 



Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Paulo Motta
>  I think that will not happen until we are out of Ant as doing this multi
jar / subproject mumbo jumbo is not too much appealing to ... anybody?

This is a contentious/controversial topic, but the more I work with gradle
the more I lean towards ant's simplicity. That said, I'd support moving
away if it becomes a technical blocker to break up cassandra-all - and if
this happens I would vote for maven as replacement. :-D

On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev <
dev@cassandra.apache.org> wrote:

> These are all good ideas but in practical terms I think that will not
> happen until we are out of Ant as doing this multi jar / subproject mumbo
> jumbo is not too much appealing to ... anybody?
>
> 
> From: Paulo Motta 
> Sent: Thursday, December 12, 2024 17:35
> To: dev@cassandra.apache.org
> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>
> >  +1 on moving the read/write logic into its own jar.
>
> +1, not only read-write logic but anything used by both the server and
> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other
> interfaces.
>
> I think one way to do that would be to split cassandra-all into
> cassandra-server and cassandra-common (anything used by both subprojects
> and server), but not sure if this would be feasible or what it would take.
>
> If there's loose agreement this would be a feasible path I'd be happy to
> create a JIRA to investigate what this would take.
>
> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer <droh...@apple.com> wrote:
> +1 on moving the read/write logic into its own jar.
>
> Doug
>
> > On Dec 11, 2024, at 7:21 PM, David Capwell <dcapw...@apple.com> wrote:
> >
> > From a disk format point of view the only thing I remember was the disk
> type bug with UDTs.  Bringing that logic back was hard as the type system
> (in 5.0) tries to avoid allowing construction of invalid states, and we
> would need to weaken that in order to enable the migration. Assuming the
> user migrated from 3.x to 4.x then the sstable metadata should have been
> rewritten to fix this bug.
> >
> > One thought (though know its a ton of effort).. we have talked about for
> a long time about moving the reading/writing logic into its jar (so tools
> don’t need cassandra-all and can limit the dependencies)… if we did that we
> could try to solve this as an out of process migration… have the 2.2 reader
> then write using 6.0 writer (ignoring compact storage… )…
> >
> >> On Dec 11, 2024, at 4:59 AM, Benedict <bened...@apache.org> wrote:
> >>
> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am
> fairly sure 4.x supported upgrade from 3.0.x also.
> >>
> >>
> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
> >>>
> >>> I see. That makes sense. I think that by 3.x you meant basically the
> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to
> support 2.2 -> 3.11 straight away. I need to check where we are at in that
> area.
> >>>
> >>> 
> >>> From: Benedict <bened...@apache.org>
> >>> Sent: Wednesday, December 11, 2024 13:09
> >>> To: dev@cassandra.apache.org
> >>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
> >>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
> >>>
> >>> 2.2 is particularly hard because of the major storage format changes
> that took place.
> >>>
> >>> I think if we want to retain (restore) upgrade support from 3.x I
> would support that, but 2.x is probably too burdensome and likely to have
> too many hard edges.
> >>>
> >>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that
> would be a pretty friendly upgrade path all things considered.
> >>>
> >>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
> >>>>
> >>>> Hey,
> >>>>
> >>>> I want to fork the thread where we are mentioning that 2.2 -> 5.0 would be cool to support.
> >>>>
> >>>> I was involved in checking that offline upgrades from 3.0 to 5.0 work and fixed few issues along the way (1), hence I can imagine that supporting 2.2 -> 5.0 would be basically the same thing just on steroids and more involved? Anyway, having a stab into this is not useless at all, I will at least go deep into the upgrade stuff I have never given a lot of thought to which is good learning experience.
> >>>>
> >>>> Any tips where to start? Was any progress done by anybody already in this matter to not start from zero?
> >>>>
> >>>> (1)
> https://urldefense.com/v3/__https://issues.apache.org/jira/browse/CASSANDRA-19002__;!!Nhn8V6BzJA!RFZoz6sQSrP_qLd0K_eNWO3UAc1s8mTT5SkFalUMwM7_l9g

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Josh McKenzie
> will not happen until we are out of Ant as doing this multi jar / subproject 
> mumbo jumbo is not too much appealing to ... anybody?
There's some folks working on a CEP around our build system and supporting a 
shared library (came up in a thread on #cassandra-dev; that's the extent of my 
knowledge on it excepting that it's still ongoing). Let's sit tight and not 
derail this topic on that point.

Assuming a build system that facilitated clean multi-project dependency support 
and the ability to work on everything in one unified place, I think there's 
still value in enumerating what sections of the DB we might want to extract in 
order to make the upgrade support more robust and extensible (as per this 
thread's original point). 

On Thu, Dec 12, 2024, at 11:53 AM, Paulo Motta wrote:
> >  I think that will not happen until we are out of Ant as doing this multi 
> > jar / subproject mumbo jumbo is not too much appealing to ... anybody?
> 
> This is a contentious/controversial topic, but the more I work with gradle 
> the more I lean towards ant's simplicity. That said, I'd support moving away 
> if it becomes a technical blocker to break up cassandra-all - and if this 
> happens I would vote for maven as replacement. :-D
> 
> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev wrote:
>> These are all good ideas but in practical terms I think that will not happen 
>> until we are out of Ant as doing this multi jar / subproject mumbo jumbo is 
>> not too much appealing to ... anybody?
>> 
>> 
>> From: Paulo Motta 
>> Sent: Thursday, December 12, 2024 17:35
>> To: dev@cassandra.apache.org
>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>> 
>> >  +1 on moving the read/write logic into its own jar.
>> 
>> +1, not only read-write logic but anything used by both the server and 
>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other 
>> interfaces.
>> 
>> I think one way to do that would be to split cassandra-all into 
>> cassandra-server and cassandra-common (anything used by both subprojects and 
>> server), but not sure if this would be feasible or what it would take.
>> 
>> If there's loose agreement this would be a feasible path I'd be happy to 
>> create a JIRA to investigate what this would take.
>> 
>> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer <droh...@apple.com> wrote:
>> +1 on moving the read/write logic into its own jar.
>> 
>> Doug
>> 
>> > On Dec 11, 2024, at 7:21 PM, David Capwell <dcapw...@apple.com> wrote:
>> >
>> > From a disk format point of view the only thing I remember was the disk 
>> > type bug with UDTs.  Bringing that logic back was hard as the type system 
>> > (in 5.0) tries to avoid allowing construction of invalid states, and we 
>> > would need to weaken that in order to enable the migration. Assuming the 
>> > user migrated from 3.x to 4.x then the sstable metadata should have been 
>> > rewritten to fix this bug.
>> >
>> > One thought (though know its a ton of effort).. we have talked about for a 
>> > long time about moving the reading/writing logic into its jar (so tools 
>> > don’t need cassandra-all and can limit the dependencies)… if we did that 
>> > we could try to solve this as an out of process migration… have the 2.2 
>> > reader then write using 6.0 writer (ignoring compact storage… )…
>> >
>> >> On Dec 11, 2024, at 4:59 AM, Benedict <bened...@apache.org> wrote:
>> >>
>> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am 
>> >> fairly sure 4.x supported upgrade from 3.0.x also.
>> >>
>> >>
>> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>> >>>
>> >>> I see. That makes sense. I think that by 3.x you meant basically the 
>> >>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try 
>> >>> to support 2.2 -> 3.11 straight away. I need to check where we are at in 
>> >>> that area.
>> >>>
>> >>> 
>> >>> From: Benedict <bened...@apache.org>
>> >>> Sent: Wednesday, December 11, 2024 13:09
>> >>> To: dev@cassandra.apache.org
>> >>> Cc: Miklosovic, Stefan; 
>> >>> dev@cassandra.apache.org; Miklosovic, 
>> >>> Stefan
>> >>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>> >>>
>> >>> 2.2 is particularly hard because of the major storage format changes 
>> >>> that took place.
>> >>>
>> >>> I think if we want to retain (restore) upgrade support from 3.x I would 
>> >>> support that, but 2.x is probably too burdensome and likely to have too 
>> >>> many hard edges.
>> >>>
>> >>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that 
>> >>> would be a pretty friendly upgrade pa

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Benedict
Why would ant get in the way? We already build multiple jars, and accord will be a submodule. We have far more organisational issues to overcome than ant.

I have for a while advocated for a shared lib to also share between Harry, accord, dtests etc.

I am however not 100% sure about splitting read/write path, at least not as first posited. The idea of maintaining it as an API for dropping in different jars is a whole other world of potential pain I don’t want to countenance. Supporting eg bulk readers or writers or other integrations seems pretty feasible though.

On 12 Dec 2024, at 16:53, Paulo Motta wrote:

> I think that will not happen until we are out of Ant as doing this multi jar / subproject mumbo jumbo is not too much appealing to ... anybody?

This is a contentious/controversial topic, but the more I work with gradle the more I lean towards ant's simplicity. That said, I'd support moving away if it becomes a technical blocker to break up cassandra-all - and if this happens I would vote for maven as replacement. :-D

On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev wrote:

These are all good ideas but in practical terms I think that will not happen until we are out of Ant as doing this multi jar / subproject mumbo jumbo is not too much appealing to ... anybody?


From: Paulo Motta 
Sent: Thursday, December 12, 2024 17:35
To: dev@cassandra.apache.org
Subject: Re: Supporting 2.2 -> 5.0 upgrades

>  +1 on moving the read/write logic into its own jar.

+1, not only read-write logic but anything used by both the server and subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other interfaces.

I think one way to do that would be to split cassandra-all into cassandra-server and cassandra-common (anything used by both subprojects and server), but not sure if this would be feasible or what it would take.

If there's loose agreement this would be a feasible path I'd be happy to create a JIRA to investigate what this would take.

On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer wrote:
+1 on moving the read/write logic into its own jar.

Doug

> On Dec 11, 2024, at 7:21 PM, David Capwell wrote:
>
> From a disk format point of view the only thing I remember was the disk type bug with UDTs.  Bringing that logic back was hard as the type system (in 5.0) tries to avoid allowing construction of invalid states, and we would need to weaken that in order to enable the migration. Assuming the user migrated from 3.x to 4.x then the sstable metadata should have been rewritten to fix this bug.
>
> One thought (though know its a ton of effort).. we have talked about for a long time about moving the reading/writing logic into its jar (so tools don’t need cassandra-all and can limit the dependencies)… if we did that we could try to solve this as an out of process migration… have the 2.2 reader then write using 6.0 writer (ignoring compact storage… )…
>
>> On Dec 11, 2024, at 4:59 AM, Benedict wrote:
>>
>> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am fairly sure 4.x supported upgrade from 3.0.x also.
>>
>>
>>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev wrote:
>>>
>>> I see. That makes sense. I think that by 3.x you meant basically the latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to support 2.2 -> 3.11 straight away. I need to check where we are at in that area.
>>>
>>> 
>>> From: Benedict
>>> Sent: Wednesday, December 11, 2024 13:09
>>> To: dev@cassandra.apache.org
>>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
>>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>>
>>> 2.2 is particularly hard because of the major storage format changes that took place.
>>>
>>> I think if we want to retain (restore) upgrade support from 3.x I would support that, but 2.x is probably too burdensome and likely to have too many hard edges.
>>>
>>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that would be a pretty friendly upgrade path all things considered.
>>>
>>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev wrote:
>>>>
>>>> Hey,
>>>>
>>>> I want to fork the thread where we are mentioning that 2.2 -> 5.0 would be cool to support.
>>>>
>>>> I was involved in checking that offline upgrades from 3.0 to 5.0 work and fixed few issues along the way (1), hence I can imagine that supporting 2.2 -> 5.0 would be basically the same thing just on steroids 

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Alex Petrov
>  I think that will not happen until we are out of Ant as doing this multi jar 
> / subproject mumbo jumbo is not too much appealing to ... anybody?

I find it more or less equally painful to make a change in a large Gradle, 
Maven, or Ant project. I consider myself a pretty active contributor, but I 
still find myself very rarely interacting with ant, and when I did, it never 
caused any trouble. That said, there are many other important things that we 
can improve, so I'd focus on them. Same goes for submodules. 

On Thu, Dec 12, 2024, at 5:53 PM, Paulo Motta wrote:
> >  I think that will not happen until we are out of Ant as doing this multi 
> > jar / subproject mumbo jumbo is not too much appealing to ... anybody?
> 
> This is a contentious/controversial topic, but the more I work with gradle 
> the more I lean towards ant's simplicity. That said, I'd support moving away 
> if it becomes a technical blocker to break up cassandra-all - and if this 
> happen I would vote for maven as replacement. :-D
> 
> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev wrote:
>> These are all good ideas but in practical terms I think that will not happen 
>> until we are out of Ant as doing this multi jar / subproject mumbo jumbo is 
>> not too much appealing to ... anybody?
>> 
>> 
>> From: Paulo Motta 
>> Sent: Thursday, December 12, 2024 17:35
>> To: dev@cassandra.apache.org
>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>> 
>> >  +1 on moving the read/write logic into its own jar.
>> 
>> +1, not only read-write logic but anything used by both the server and 
>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other 
>> interfaces.
>> 
>> I think one way to do that would be to split cassandra-all into 
>> cassandra-server and cassandra-common (anything used by both subprojects and 
>> server), but not sure if this would be feasible or what it would take.
>> 
>> If there's loose agreement this would be a feasible path I'd be happy to 
>> create a JIRA to investigate what this would take.
>> 
>> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer wrote:
>> +1 on moving the read/write logic into its own jar.
>> 
>> Doug
>> 
>> > On Dec 11, 2024, at 7:21 PM, David Capwell wrote:
>> >
>> > From a disk format point of view the only thing I remember was the disk 
>> > type bug with UDTs.  Bringing that logic back was hard as the type system 
>> > (in 5.0) tries to avoid allowing construction of invalid states, and we 
>> > would need to weaken that in order to enable the migration. Assuming the 
>> > user migrated from 3.x to 4.x then the sstable metadata should have been 
>> > rewritten to fix this bug.
>> >
>> > One thought (though know its a ton of effort).. we have talked about for a 
>> > long time about moving the reading/writing logic into its jar (so tools 
>> > don’t need cassandra-all and can limit the dependencies)… if we did that 
>> > we could try to solve this as an out of process migration… have the 2.2 
>> > reader then write using 6.0 writer (ignoring compact storage… )…
>> >
>> >> On Dec 11, 2024, at 4:59 AM, Benedict wrote:
>> >>
>> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am 
>> >> fairly sure 4.x supported upgrade from 3.0.x also.
>> >>
>> >>
>> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev wrote:
>> >>>
>> >>> I see. That makes sense. I think that by 3.x you meant basically the 
>> >>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try 
>> >>> to support 2.2 -> 3.11 straight away. I need to check where we are at in 
>> >>> that area.
>> >>>
>> >>> 
>> >>> From: Benedict
>> >>> Sent: Wednesday, December 11, 2024 13:09
>> >>> To: dev@cassandra.apache.org
>> >>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
>> >>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>> >>>
>> >>> 2.2 is particularly hard because of the major storage format changes 
>> >>> that took place.
>> >>>
>> >>> I think if we want to retain (restore) upgrade support from 3.x I would 
>> >>> support that, but 2.x is probably too burdensome and likely to have too 
>> >>> many hard edges.
>> >>>
>> >>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that 
>> >>> would be a pretty friendly upgrade path all things considered.
>> >>>
>> >>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev wrote:
>> >>>>
>> >>>> Hey,
>> >>>>
>> >>>> I want to fork the thread where we are ment

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Alex Petrov
> I have for a while advocated for a shared lib to also share between Harry, 
> accord, dtests etc

Big +1 for a shared lib for our concurrency and test utils. Been intending to 
start working on this for a while now, but never got to do this so far.

On Thu, Dec 12, 2024, at 5:58 PM, Benedict wrote:
> 
> Why would ant get in the way? We already build multiple jars, and accord will 
> be a submodule. We have far more organisational issues to overcome than ant.
> 
> I have for a while advocated for a shared lib to also share between Harry, 
> accord, dtests etc
> 
> I am however not 100% sure about splitting read/write path, at least not as 
> first posited. The idea of maintaining it as an API for dropping in different 
> jars is a whole other world of potential pain I don’t want to countenance. 
> Supporting eg bulk readers or writers or other integrations seems pretty 
> feasible though.
> 
> 
>> On 12 Dec 2024, at 16:53, Paulo Motta  wrote:
>> 
>> >  I think that will not happen until we are out of Ant as doing this multi 
>> > jar / subproject mumbo jumbo is not too much appealing to ... anybody?
>> 
>> This is a contentious/controversial topic, but the more I work with gradle 
>> the more I lean towards ant's simplicity. That said, I'd support moving away 
>> if it becomes a technical blocker to break up cassandra-all - and if this 
>> happen I would vote for maven as replacement. :-D
>> 
>> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev wrote:
>>> These are all good ideas but in practical terms I think that will not 
>>> happen until we are out of Ant as doing this multi jar / subproject mumbo 
>>> jumbo is not too much appealing to ... anybody?
>>> 
>>> 
>>> From: Paulo Motta 
>>> Sent: Thursday, December 12, 2024 17:35
>>> To: dev@cassandra.apache.org
>>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>> 
>>> >  +1 on moving the read/write logic into its own jar.
>>> 
>>> +1, not only read-write logic but anything used by both the server and 
>>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other 
>>> interfaces.
>>> 
>>> I think one way to do that would be to split cassandra-all into 
>>> cassandra-server and cassandra-common (anything used by both subprojects 
>>> and server), but not sure if this would be feasible or what it would take.
>>> 
>>> If there's loose agreement this would be a feasible path I'd be happy to 
>>> create a JIRA to investigate what this would take.
>>> 
>>> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer wrote:
>>> +1 on moving the read/write logic into its own jar.
>>> 
>>> Doug
>>> 
>>> > On Dec 11, 2024, at 7:21 PM, David Capwell wrote:
>>> >
>>> > From a disk format point of view the only thing I remember was the disk 
>>> > type bug with UDTs.  Bringing that logic back was hard as the type system 
>>> > (in 5.0) tries to avoid allowing construction of invalid states, and we 
>>> > would need to weaken that in order to enable the migration. Assuming the 
>>> > user migrated from 3.x to 4.x then the sstable metadata should have been 
>>> > rewritten to fix this bug.
>>> >
>>> > One thought (though know its a ton of effort).. we have talked about for 
>>> > a long time about moving the reading/writing logic into its jar (so tools 
>>> > don’t need cassandra-all and can limit the dependencies)… if we did that 
>>> > we could try to solve this as an out of process migration… have the 2.2 
>>> > reader then write using 6.0 writer (ignoring compact storage… )…
>>> >
>>> >> On Dec 11, 2024, at 4:59 AM, Benedict wrote:
>>> >>
>>> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am 
>>> >> fairly sure 4.x supported upgrade from 3.0.x also.
>>> >>
>>> >>
>>> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev wrote:
>>> >>>
>>> >>> I see. That makes sense. I think that by 3.x you meant basically the 
>>> >>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try 
>>> >>> to support 2.2 -> 3.11 straight away. I need to check where we are at 
>>> >>> in that area.
>>> >>>
>>> >>> 
>>> >>> From: Benedict
>>> >>> Sent: Wednesday, December 11, 2024 13:09
>>> >>> To: dev@cassandra.apache.org
>>> >>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
>>> >>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>> >>>
>>> >>> 2.2 is particularly hard because of the major storage format changes 
>>> >>> that took place.
>>> >>>
>>> >>> I think if we want to retain (restore) upgrade sup

Re: 2024 year in review

2024-12-12 Thread Miklosovic, Stefan via dev
Hey,

these are interesting metrics when it comes to the number of commits per 
individual, as in the list you compiled, but I want to emphasize that the way 
I look at it, they really just show how much "throughput" the project has, 
i.e. how many commits it can make happen.

There is a limited pool of committers and reviewers, and if a few people have 
a disproportionately large number of things to review and tickets to finish, 
or do infrastructural work like releases, or merge up a lot etc., then it is 
natural that these people will have a lot of commits.

However, there are people who have a far bigger impact on the overall success 
of the project, so counting commits to measure how "successful" someone is 
(if somebody ever tried to do that) does not in practice mean anything.

Regards


From: Josh McKenzie 
Sent: Wednesday, December 11, 2024 19:45
To: dev
Subject: 2024 year in review

It's been a long time since I sent out a status update. Let's round up 2024, 
and let's see if we can't have more regular updates in 2025 shall we?

First, some vanity metrics on the year in review:

Community Health and Activity:
- Tickets created: 840
- Tickets fixed: 518
- Tickets created in 2024 and closed: 463
- Commits to trunk: 343 (excluding merge commits)
- `git diff --stat` against 1st commit in 2024: 1549 files changed, 77170 
  insertions(+), 28184 deletions(-)
- 42 unique committers committed to the project (compared with 47 in both 2022 
  and 2023), with our top 4 being:
  - Stefan Miklosovic at 172 commits
  - Brandon Williams at 117
  - Mick Semb Wever at 98
  - Caleb Rackliffe at 63

Releases:
- 4.0 line: 4 releases
- 4.1 line: 4 releases
- 5.0 line: 3 releases

We deprecated the 3.0 and 3.X line, which was a huge improvement in QoL for 
merging patches for all of us.

Email:
- 230 topics on the dev@ list
- 122 topics on the user@ list

Slack:
- 1949 members in #cassandra on the-asf.slack.com
- 869 members in #cassandra-dev on the-asf.slack.com

We've had a solid year for progress, engagement, discussion, and diversity when 
it comes to the project.

-

So that's the Ghost of Cassandra Past. What about the present?

First off, as always, how can you get involved if you're looking to contribute? 
Follow this link: 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2162&quickFilter=2160&quickFilter=2652
This will give you a list of all the relatively low complexity tickets that 
currently lack an assignee.

You can find us on the ASF Slack (https://the-asf.slack.com), in #cassandra-dev 
for dev discussion and #cassandra for user discussion. If you need an invite to 
the Slack server, let me know and I'll get you set up.

For the code, let's look at the present workflow on trunk / our next major:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484

I've updated our project release kanban board to be aligned with the current 
branches and workflows. Of note, just over 50% of the work flagged as in flight 
on the project is on trunk. I went ahead and bumped the 4 in "Awaiting 
Feedback" back to Triage since they were sitting for months to years at a time 
and contributor questions on the tickets had been answered.

We currently have 35 tickets on trunk that need review; any committers with 
some spare cycles, your help there would be greatly appreciated; I'll try and 
kick out another email for us to [DISCUSS] this as it's a longstanding struggle 
for us. I don't have longitudinal data for the past year or two on this metric, 
but having 90 tickets project-wide, 35 on trunk, that are blocked waiting on 
review feels high to me.

We have 11 tickets project-wide that are in "Needs Committer" state, 8 of which 
are on trunk.

All in all, things feel like they're in a pretty healthy balanced spot heading 
into the last stretch of the holiday season.

-

That's the present. What about the Future? Let's take a look from a "next major 
release perspective."

We released 5.0 GA on September 5th, 2024. If we target a release 12 mon

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Josh McKenzie
> But we also need an approved non-euphemism for features like MVs (I suggest 
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
> features that work fine in some narrow well-defined circumstances but will 
> blow in your face if you don’t know exactly what you are doing.

Feels like the real answer is:
 1. Endeavor to never get ourselves into this state
 2. Take immediate action if we discover we're there (fix feature if possible, 
    deprecate and remove if not). Not "leave to fester for years"

I like the introduction of 'alpha' as an alias for 'Preview'; not sure why that 
wasn't what we immediately came up with collectively given how widespread its 
usage is. :)

What would demoting MVs to 'alpha' right now look like? We'd warn on their 
usage w/some different structure and verbiage, and it'd be pretty implicitly 
clear to people they shouldn't use them in production, right?

It seems to me that the 3 categories would be sufficient even to handle our 
current scenario where we have some things in the system that are a Bad Idea to 
use in production.
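
To make the "warn on usage" idea concrete, here is a minimal sketch. Everything 
in it is hypothetical (the `Maturity` enum, the `FEATURE_MATURITY` table, the 
feature names and the wording of the warning); it is not how Cassandra's 
warnings or guardrails are actually implemented:

```python
import warnings
from enum import Enum


class Maturity(Enum):
    """The three readiness stages discussed in this thread."""
    ALPHA = "alpha"  # a.k.a. preview
    BETA = "beta"
    GA = "ga"


# Hypothetical registry; the stage assigned to each name is illustrative only.
FEATURE_MATURITY = {
    "materialized_views": Maturity.ALPHA,
    "example_ga_feature": Maturity.GA,
}


def check_feature(name: str) -> Maturity:
    """Return the feature's stage, emitting a warning when it is not GA."""
    stage = FEATURE_MATURITY[name]
    if stage is not Maturity.GA:
        warnings.warn(
            f"{name} is {stage.value}; not recommended for production use"
        )
    return stage
```

With a table like this, demoting a feature is a one-line change to its stage, 
and every use of it starts warning without a fourth category being needed.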

On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
> think three is enough and fits, as we already have some features that don’t 
> fit into either of (preview,beta,ga) - released but broken, released but 
> dangerous, deprecated, removed.
> 
> For new features going forward, alpha (preview) -> beta -> GA works well 
> enough.
> 
> But we also need an approved non-euphemism for features like MVs (I suggest 
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
> features that work fine in some narrow well-defined circumstances but will 
> blow in your face if you don’t know exactly what you are doing.
> 
> These classifications are largely orthogonal.
> 
> Alpha(preview)->Beta->GA communicates readiness of a feature under 
> development, with GA being the default final state for most features.
> 
> From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
> Serious issues get uncovered (very) late sometimes. It is what it is.
> And we do deprecate and remove functionality when it’s superseded.
> 
> 
>> -1 on unstable. It's way too many words than are needed. Three is a
>> magic number and fits:
>> 
>> Preview
>> Beta
>> GA
> 
>> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>> 
>> A structured, disciplined approach to graduating something from [Optional] 
>> -> [Default] makes sense to me, similar to how we're talking about a 
>> structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages 
>> gives us a framework to define what requirements of stage transitions would 
>> be which'll ideally lead to us producing higher quality, more predictable, 
>> more consistent results for our end users.
>> 
>> For instance, requirements from [Optional] -> [Default] could be higher 
>> level abstractions like:
>>  • Confidence in stability
>>  • Strong evidence to indicate superiority in majority of workloads (by 
>> count or importance or size, etc)
>> These are all things we kind of do implicitly and ad-hoc on the mailing 
>> list, and I'm not looking to tie us down to any granular structure or 
>> specificity. More thinking it could be useful for someone that's worked on 
>> something who wonders "Huh. How do I take this from being optional to the 
>> default?" and having an answer better than "reinvent the wheel every time 
>> and fling spaghetti at the dev list and pray".
>> 
>> :)
>> 
>> 
>> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>>> Thanks for bringing up this topic, Josh. 
>>> 
>>> Outside of the major features (ie. MV/SAI/TCM/Accord), one related 
>>> discussion in this topic is: how can we "promote" small improvements in 
>>> existing features from optional to default ?
>>> 
>>> It makes sense to have optimizations launched behind a feature flag 
>>> initially (beta phase) while the improvement gets real world exposure, but 
>>> I think we need a better way to promote these optimizations to default 
>>> behavior on a regular cadence.
>>> 
>>> Take for example optimized repairs from CASSANDRA-16274. It was launched in 
>>> 4.x as an optional feature gated behind a flag, ie. 
>>> auto_optimise_full_repair_streams: false. 
>>> 
>>> I could be easily missing something, but is there a world where 
>>> non-optimized repairs make sense once this optimization is proven to work ? 
>>> I agree this is fine while the feature is maturing, but at some point we 
>>> need to rip the bandaid and make the optimization default (and clearly 
>>> communicate that). This would allow cleanup code toil of default behavior 
>>> that is no longer being used, because everyone is enabling the improvement 
>>> during deployment.
>>> 
>>> This is just one example to demonstrate the issue and I don't want this 
>>> discussion to focus on this particular case, but I can think of other 
>>> improvements launc

Re: [DISCUSS] 5.1 should be 6.0

2024-12-12 Thread Jeremiah Jordan
 My expectation is that in trunk SCM CASSANDRA_4 would change to SCM
CASSANDRA_5.  I think we should be striving to support full
downgrade/rollback ability to the previous major version from trunk.
With TCM I would expect that when running in CASSANDRA_5 mode that
initializing TCM would not be possible, as once initialized you could no
longer roll back.
Do we have no way to support the gossip paths continuing to work prior to
initializing TCM?

-Jeremiah

On Dec 11, 2024 at 7:41:48 AM, Sam Tunnicliffe  wrote:

> My point is that the upgrade to 5.1/6.0 isn't really complete until the
> CMS is initialised and this can't be done while running with SCM
> CASSANDRA_4 because of the messaging service limitation. Until that point,
> schema changes & node replacements are not supported which affects how long
> a bake time is tolerable.
> This specific issue could probably be fixed by revisiting the SCM
> implementation in 5.1/6.0, so we should certainly do that but the fact
> remains that we don't have great test coverage to indicate how clusters
> behave when running in SCM for a prolonged period.
>
> Thanks,
> Sam
>
> On 11 Dec 2024, at 13:29, Brandon Williams wrote:
>
> On Wed, Dec 11, 2024 at 7:22 AM Sam Tunnicliffe wrote:
> > so running in any SCM mode for a prolonged period is not really viable.
>
> This is what many users want to do though, upgrade one DC and let it
> bake to see how it goes before continuing.  I don't think that's
> unreasonable, but from working on CASSANDRA-20118 I know how difficult
> that is already.  I don't think we've built enough SCM muscle yet to
> think about handling multiple previous versions.
>
> Kind Regards,
> Brandon
>


Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Doug Rohrer
+1 on moving the read/write logic into its own jar.

Doug

> On Dec 11, 2024, at 7:21 PM, David Capwell  wrote:
> 
> From a disk format point of view the only thing I remember was the disk type 
> bug with UDTs.  Bringing that logic back was hard as the type system (in 5.0) 
> tries to avoid allowing construction of invalid states, and we would need to 
> weaken that in order to enable the migration. Assuming the user migrated from 
> 3.x to 4.x then the sstable metadata should have been rewritten to fix this 
> bug.
> 
> One thought (though know its a ton of effort).. we have talked about for a 
> long time about moving the reading/writing logic into its jar (so tools don’t 
> need cassandra-all and can limit the dependencies)… if we did that we could 
> try to solve this as an out of process migration… have the 2.2 reader then 
> write using 6.0 writer (ignoring compact storage… )… 
> 
>> On Dec 11, 2024, at 4:59 AM, Benedict  wrote:
>> 
>> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am fairly 
>> sure 4.x supported upgrade from 3.0.x also.
>> 
>> 
>>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev wrote:
>>> 
>>> I see. That makes sense. I think that by 3.x you meant basically the 
>>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to 
>>> support 2.2 -> 3.11 straight away. I need to check where we are at in that 
>>> area.
>>> 
>>> 
>>> From: Benedict 
>>> Sent: Wednesday, December 11, 2024 13:09
>>> To: dev@cassandra.apache.org
>>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
>>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>> 
>>> 2.2 is particularly hard because of the major storage format changes that 
>>> took place.
>>> 
>>> I think if we want to retain (restore) upgrade support from 3.x I would 
>>> support that, but 2.x is probably too burdensome and likely to have too 
>>> many hard edges.
>>> 
>>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that would 
>>> be a pretty friendly upgrade path all things considered.
>>> 
>>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev wrote:
>>>> 
>>>> Hey,
>>>> 
>>>> I want to fork the thread where we are mentioning that 2.2 -> 5.0 would be 
>>>> cool to support.
>>>> 
>>>> I was involved in checking that offline upgrades from 3.0 to 5.0 work and 
>>>> fixed few issues along the way (1), hence I can imagine that supporting 
>>>> 2.2 -> 5.0 would be basically the same thing just on steroids and more 
>>>> involved? Anyway, having a stab into this is not useless at all, I will at 
>>>> least go deep into the upgrade stuff I have never given a lot of thought 
>>>> to which is good learning experience.
>>>> 
>>>> Any tips where to start? Was any progress done by anybody already in this 
>>>> matter to not start from zero?
>>>> 
>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19002
>>>> 
>>>> Regards
>>> 
>> 
> 



Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Paulo Motta
>  +1 on moving the read/write logic into its own jar.

+1, not only read-write logic but anything used by both the server and
subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other
interfaces.

I think one way to do that would be to split cassandra-all into
cassandra-server and cassandra-common (anything used by both subprojects
and server), but not sure if this would be feasible or what it would take.

If there's loose agreement this would be a feasible path I'd be happy to
create a JIRA to investigate what this would take.

On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer  wrote:

> +1 on moving the read/write logic into its own jar.
>
> Doug
>
> > On Dec 11, 2024, at 7:21 PM, David Capwell  wrote:
> >
> > From a disk format point of view the only thing I remember was the disk
> type bug with UDTs.  Bringing that logic back was hard as the type system
> (in 5.0) tries to avoid allowing construction of invalid states, and we
> would need to weaken that in order to enable the migration. Assuming the
> user migrated from 3.x to 4.x then the sstable metadata should have been
> rewritten to fix this bug.
> >
> > One thought (though know its a ton of effort).. we have talked about for
> a long time about moving the reading/writing logic into its jar (so tools
> don’t need cassandra-all and can limit the dependencies)… if we did that we
> could try to solve this as an out of process migration… have the 2.2 reader
> then write using 6.0 writer (ignoring compact storage… )…
> >
> >> On Dec 11, 2024, at 4:59 AM, Benedict  wrote:
> >>
> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am
> fairly sure 4.x supported upgrade from 3.0.x also.
> >>
> >>
> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
> >>>
> >>> I see. That makes sense. I think that by 3.x you meant basically the
> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to
> support 2.2 -> 3.11 straight away. I need to check where we are at in that
> area.
> >>>
> >>> 
> >>> From: Benedict 
> >>> Sent: Wednesday, December 11, 2024 13:09
> >>> To: dev@cassandra.apache.org
> >>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
> >>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
> >>>
> >>> 2.2 is particularly hard because of the major storage format changes
> that took place.
> >>>
> >>> I think if we want to retain (restore) upgrade support from 3.x I
> would support that, but 2.x is probably too burdensome and likely to have
> too many hard edges.
> >>>
> >>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that
> would be a pretty friendly upgrade path all things considered.
> >>>
> >>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
> >>>>
> >>>> Hey,
> >>>>
> >>>> I want to fork the thread where we are mentioning that 2.2 -> 5.0
> would be cool to support.
> >>>>
> >>>> I was involved in checking that offline upgrades from 3.0 to 5.0 work
> and fixed few issues along the way (1), hence I can imagine that supporting
> 2.2 -> 5.0 would be basically the same thing just on steroids and more
> involved? Anyway, having a stab into this is not useless at all, I will at
> least go deep into the upgrade stuff I have never given a lot of thought to
> which is good learning experience.
> >>>>
> >>>> Any tips where to start? Was any progress done by anybody already in
> this matter to not start from zero?
> >>>>
> >>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19002
> >>>>
> >>>> Regards
> >>>
> >>
> >
>
>


Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Miklosovic, Stefan via dev
These are all good ideas but in practical terms I think that will not happen 
until we are out of Ant, as doing this multi jar / subproject mumbo jumbo is 
not very appealing to ... anybody?


From: Paulo Motta 
Sent: Thursday, December 12, 2024 17:35
To: dev@cassandra.apache.org
Subject: Re: Supporting 2.2 -> 5.0 upgrades

>  +1 on moving the read/write logic into its own jar.

+1, not only read-write logic but anything used by both the server and 
subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other 
interfaces.

I think one way to do that would be to split cassandra-all into 
cassandra-server and cassandra-common (anything used by both subprojects and 
server), but not sure if this would be feasible or what it would take.

If there's loose agreement this would be a feasible path I'd be happy to create 
a JIRA to investigate what this would take.

On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer wrote:
+1 on moving the read/write logic into its own jar.

Doug

> On Dec 11, 2024, at 7:21 PM, David Capwell wrote:
>
> From a disk format point of view the only thing I remember was the disk type 
> bug with UDTs.  Bringing that logic back was hard as the type system (in 5.0) 
> tries to avoid allowing construction of invalid states, and we would need to 
> weaken that in order to enable the migration. Assuming the user migrated from 
> 3.x to 4.x then the sstable metadata should have been rewritten to fix this 
> bug.
>
> One thought (though know its a ton of effort).. we have talked about for a 
> long time about moving the reading/writing logic into its jar (so tools don’t 
> need cassandra-all and can limit the dependencies)… if we did that we could 
> try to solve this as an out of process migration… have the 2.2 reader then 
> write using 6.0 writer (ignoring compact storage… )…
>
>> On Dec 11, 2024, at 4:59 AM, Benedict wrote:
>>
>> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am fairly 
>> sure 4.x supported upgrade from 3.0.x also.
>>
>>
>>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev wrote:
>>>
>>> I see. That makes sense. I think that by 3.x you meant basically the 
>>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to 
>>> support 2.2 -> 3.11 straight away. I need to check where we are at in that 
>>> area.
>>>
>>> 
>>> From: Benedict
>>> Sent: Wednesday, December 11, 2024 13:09
>>> To: dev@cassandra.apache.org
>>> Cc: Miklosovic, Stefan; dev@cassandra.apache.org; Miklosovic, Stefan
>>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>>
>>> 2.2 is particularly hard because of the major storage format changes that 
>>> took place.
>>>
>>> I think if we want to retain (restore) upgrade support from 3.x I would 
>>> support that, but 2.x is probably too burdensome and likely to have too 
>>> many hard edges.
>>>
>>> I think if users only had to upgrade 2.2->3.x then eg 3.x->6.0 that would 
>>> be a pretty friendly upgrade path all things considered.
>>>
>>>> On 11 Dec 2024, at 12:03, Miklosovic, Stefan via dev wrote:
>>>>
>>>> Hey,
>>>>
>>>> I want to fork the thread where we are mentioning that 2.2 -> 5.0 would be 
>>>> cool to support.
>>>>
>>>> I was involved in checking that offline upgrades from 3.0 to 5.0 work and 
>>>> fixed few issues along the way (1), hence I can imagine that supporting 
>>>> 2.2 -> 5.0 would be basically the same thing just on steroids and more 
>>>> involved? Anyway, having a stab into this is not useless at all, I will at 
>>>> least go deep into the upgrade stuff I have never given a lot of thought 
>>>> to which is good learning experience.
>>>>
>>>> Any tips where to start? Was any progress done by anybody already in this 
>>>> matter to not start from zero?
>>>>
>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19002
>>>>
>>>> Regards
>>>
>>
>



Re: [DISCUSS] 5.1 should be 6.0

2024-12-12 Thread David Capwell
> My expectation is that in trunk SCM CASSANDRA_4 would change to SCM 
> CASSANDRA_5.

Assuming you upgrade from 4.0 to 5.0, then you are running on CASSANDRA_4… how 
many people know that they are expected to do something about that (Sam 
documented the steps earlier)?  What if you leave things alone and try to 
upgrade to 5.1/6.0… now what?

What about users who create a new 5.0 cluster… we still default to 
compatibility mode in this case, so a new 5.0 cluster is running with 
CASSANDRA_4…

> This is why I want to remove the coupling between SCM and messaging version.

Feels like we just had a similar conversation Sam with regard to TCM / Accord ;)

I don’t see the messaging version itself as the problem; I feel the confusion 
comes from messaging and disk versions being intertwined (they use the same 
serializers)… If we are running with messaging version VERSION_40, why does it 
matter if we write to disk with VERSION_40 or VERSION_50?  If we want to 
support downgrade we should block _50 and only use _40 for disk, but why 
should networking not be allowed to do _50?  What we write to disk impacts our 
ability to downgrade, and messaging already has the ability to downgrade its 
version if its peers don’t know the latest version.

In short I agree with you Sam, we should decouple… I think it makes sense for 
SCM to control the version we use for disk, but not networking… 
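
To make the decoupling concrete, here is a minimal sketch (Python purely for 
illustration; `disk_version`, `messaging_version` and the 
`downgrade_compatible` flag are hypothetical names, not Cassandra APIs, and 
the numeric values are placeholders). It assumes the compatibility mode pins 
only the on-disk version, while the messaging version is chosen per peer as 
the highest version both sides understand:

```python
from enum import IntEnum


class Version(IntEnum):
    """Serializer versions; names mirror the thread, values are placeholders."""
    VERSION_40 = 40
    VERSION_50 = 50


def disk_version(current: Version, previous: Version,
                 downgrade_compatible: bool) -> Version:
    """Version used for anything written to disk.

    Hypothetical rule: while downgrade must stay possible, write only the
    previous format; otherwise write the current one.
    """
    return previous if downgrade_compatible else current


def messaging_version(local_max: Version, peer_max: Version) -> Version:
    """Version used on the wire with one peer: the highest both understand."""
    return min(local_max, peer_max)


if __name__ == "__main__":
    # Compatibility mode on: disk stays on _40 ...
    print(disk_version(Version.VERSION_50, Version.VERSION_40, True).name)
    # ... while two upgraded nodes can still talk _50 to each other,
    print(messaging_version(Version.VERSION_50, Version.VERSION_50).name)
    # and an upgraded node falls back to _40 for a peer that only knows _40.
    print(messaging_version(Version.VERSION_50, Version.VERSION_40).name)
```

The point of the sketch is only that the two decisions take different 
inputs: the disk choice depends on the downgrade requirement, the wire 
choice depends on the peer.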

> On Dec 12, 2024, at 8:46 AM, Sam Tunnicliffe  wrote:
> 
> No, we initially tried to preserve all the previous paths and put the whole 
> thing behind a feature flag, but it was just way too pervasive and doing so 
> would've added years to the project. So for the period before the CMS is 
> initialized, certain operations are not available. 
> 
> However, it should be entirely possible to downgrade and rollback to 5.0 
> after cutting over to TCM, as long as SSTables are still in the old format. 
> By "should be" I mean it is absolutely possible and has been tested, but it 
> requires the SCM to guard the on disk format, which has the unfortunate 
> effect of limiting the messaging version and that in turn make it impossible 
> to actually cut over to TCM. i.e. the testing has been done with a patch 
> which disables some things which rely on messaging VERSION_51. This is why I 
> want to remove the coupling between SCM and messaging version.
> 
> Also, I misspoke slightly in my previous email because I forgot that we did 
> manage to enable a decent subsection of TCM to work with 
> VERSION_40/VERSION_50. In this scenario, you still get the linearized schema 
> updates via the metadata log but replicas/coordinators don't exchange epochs 
> during reads/writes so the consistency guarantees are weakened.
> 
> Thanks,
> Sam
> 
> 
>> On 12 Dec 2024, at 16:17, Jeremiah Jordan  wrote:
>> 
>> My expectation is that in trunk SCM CASSANDRA_4 would change to SCM 
>> CASSANDRA_5.  I think we should be striving to support full 
>> downgrade/rollback ability to the previous major version from trunk.
>> With TCM I would expect that when running in CASSANDRA_5 mode that 
>> initializing TCM would not be possible, as once initialized you could no 
>> longer roll back.
>> Do we have no way to support the gossip paths continuing to work prior to 
>> initializing TCM?
>> 
>> -Jeremiah
>> 
>> On Dec 11, 2024 at 7:41:48 AM, Sam Tunnicliffe  wrote:
>>> My point is that the upgrade to 5.1/6.0 isn't really complete until the CMS 
>>> is initialised and this can't be done while running with SCM CASSANDRA_4 
>>> because of the messaging service limitation. Until that point, schema 
>>> changes & node replacements are not supported which affects how long a bake 
>>> time is tolerable. 
>>> This specific issue could probably be fixed by revisiting the SCM 
>>> implementation in 5.1/6.0, so we should certainly do that but the fact 
>>> remains that we don't have great test coverage to indicate how clusters 
>>> behave when running in SCM for a prolonged period.  
>>> 
>>> Thanks, 
>>> Sam
>>> 
>>>> On 11 Dec 2024, at 13:29, Brandon Williams  wrote:
>>>> 
>>>> On Wed, Dec 11, 2024 at 7:22 AM Sam Tunnicliffe  wrote:
>>>>> 
>>>>> so running in any SCM mode for a prolonged period is not really viable.
>>>> 
>>>> This is what many users want to do though, upgrade one DC and let it
>>>> bake to see how it goes before continuing.  I don't think that's
>>>> unreasonable, but from working on CASSANDRA-20118 I know how difficult
>>>> that is already.  I don't think we've built enough SCM muscle yet to
>>>> think about handling multiple previous versions.
>>>> 
>>>> Kind Regards,
>>>> Brandon
>>> 
> 



Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Patrick McFadin
We already have an established "alpha," IMO, and it's called branch
and trunk. For example, the CEP-15 branch is the alpha for Accord and
TCM, and then it will be merged into trunk. The next stop is beta and
on to the regular release train.

I'm just optimizing to keep it simple and clean for end users. Fewer options.

Patrick

On Thu, Dec 12, 2024 at 7:57 AM Josh McKenzie  wrote:
>
> But MVs are not alpha or preview, as they are not actively being worked on. 
> They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and 
> less useful.
>
> I'm asserting they should either be marked 'alpha/Preview' and actively being 
> worked on, or be deprecated and removed.
>
> To Paulo's point, Jaydeep's working on base<->view reconciliation which, to 
> me, would indicate they're alpha. And until we have a solution for that 
> reconciliation in the face of lost data in the base table, no global index or 
> MV implementation will be ready for production.
>
> On Thu, Dec 12, 2024, at 10:40 AM, Brandon Williams wrote:
>
> I think it would be better to avoid 'alpha' since we do beta releases,
> and I agree with Aleksey that we'd be overloading 'alpha' and perhaps
> causing confusion.
>
> Kind Regards,
> Brandon
>
> On Thu, Dec 12, 2024 at 8:58 AM Benedict  wrote:
> >
> > I think alpha is fine. It communicates fairly well that there’s no near 
> > term expectation they will be production capable.
> >
> > There is (I think) still an intention to improve them, but they are janky. 
> > If we don’t intend to begin fixing the feature within the next year or so 
> > we should deprecate it entirely.
> >
> >
> > On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:
> >
> > But MVs are not alpha or preview, as they are not actively being worked 
> > on. They are currently broken. Calling them ‘alpha’ makes ‘alpha’ 
> > overloaded and less useful.
> >
> > On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
> >
> > But we also need an approved non-euphemism for features like MVs (I suggest 
> > ‘broken’) and possibly a softer version of it ('dangerous') for our 
> > existing features that work fine in some narrow well-defined circumstances 
> > but will blow in your face if you don’t know exactly what you are doing.
> >
> > Feels like the real answer is:
> >
> > Endeavor to never get ourselves into this state
> > Take immediate action if we discover we're there (fix feature if possible, 
> > deprecate and remove if not). Not "leave to fester for years"
> >
> > I like the introduction of 'alpha' as an alias for 'Preview'; not sure why 
> > that wasn't what we immediately came up with collectively given how 
> > widespread its usage is. :)
> >
> > What would demoting MV's to 'alpha' right now look like? We'd warn on their 
> > usage w/some different structure and verbiage, and it'd be pretty 
> > implicitly clear to people they shouldn't use it in production right?
> >
> > It seems to me that the 3 categories would be sufficient even to handle our 
> > current scenario where we have some things in the system that are a Bad 
> > Idea to use in production.
> >
> > On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
> >
> > I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
> > think three is enough and fits, as we already have some features that don’t 
> > fit into either of (preview,beta,ga) - released but broken, released but 
> > dangerous, deprecated, removed.
> >
> > For new features going forward, alpha (preview) -> beta -> GA works well 
> > enough.
> >
> > But we also need an approved non-euphemism for features like MVs (I suggest 
> > ‘broken’) and possibly a softer version of it ('dangerous') for our 
> > existing features that work fine in some narrow well-defined circumstances 
> > but will blow in your face if you don’t know exactly what you are doing.
> >
> > These classifications are largely orthogonal.
> >
> > Alpha(preview)->Beta->GA communicates readiness of a feature under 
> > development, with GA being the default final state for most features.
> >
> > From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
> > Serious issues get uncovered (very) late sometimes. It is what it is.
> > And we do deprecate and remove functionality when it’s superseded.
> >
> >
> > -1 on unstable. It's way too many words than are needed. Three is a
> > magic number and fits:
> >
> > Preview
> > Beta
> > GA
> >
> >
> > On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
> >
> > A structured, disciplined approach to graduating something from [Optional] 
> > -> [Default] makes sense to me, similar to how we're talking about a 
> > structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages 
> > gives us a framework to define what requirements of stage transitions would 
> > be which'll ideally lead to us producing higher quality, more predictable, 
> > more consistent results for our end users.
> >
> > For instance, requirements from [Optional] -> [Default] could be

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Štefan Miklošovič
btw if you think about it ... if we ever felt an urge to move away from Ant
that we could not resist, and we had already split the code into more modules
/ subprojects, then moving such an already-split project from Ant to whatever
else would be just an exercise. It would be like "OK, we split Cassandra's
codebase into 10 modules", so moving that to Maven or whatever would be just
about what? About writing poms and wiring them together.

I suggest we try to make a simple module by extracting some functionality
to see what it would look like, gain the experience to do that, and iterate.
I can imagine that untying all the guts so we can ship that as a separate jar
will be the work in itself (and most probably the majority of it).

My main concern was to not complicate it further and rather to start from
scratch, but if we take an iterative process like that, it might also make
it easier to move away eventually if we think it is the right move, as we
would have split it already anyway.

On Thu, Dec 12, 2024 at 6:01 PM Benedict  wrote:

> Why would ant get in the way? We already build multiple jars, and accord
> will be a submodule. We have far more organisational issues to overcome
> than ant.
>
> I have for a while advocated for a shared lib to also share between Harry,
> accord, dtests etc
>
> I am however not 100% sure about splitting read/write path, at least not
> as first posited. The idea of maintaining it as an API for dropping in
> different jars is a whole other world of potential pain I don’t want to
> countenance. Supporting eg bulk readers or writers or other integrations
> seems pretty feasible though.
>
> On 12 Dec 2024, at 16:53, Paulo Motta  wrote:
>
> 
> >  I think that will not happen until we are out of Ant as doing this
> multi jar / subproject mumbo jumbo is not too much appealing to ... anybody?
>
> This is a contentious/controversial topic, but the more I work with gradle
> the more I lean towards ant's simplicity. That said, I'd support moving
> away if it becomes a technical blocker to break up cassandra-all - and if
> this happen I would vote for maven as replacement. :-D
>
> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
>
>> These are all good ideas but in practical terms I think that will not
>> happen until we are out of Ant as doing this multi jar / subproject mumbo
>> jumbo is not too much appealing to ... anybody?
>>
>> 
>> From: Paulo Motta 
>> Sent: Thursday, December 12, 2024 17:35
>> To: dev@cassandra.apache.org
>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>
>> >  +1 on moving the read/write logic into its own jar.
>>
>> +1, not only read-write logic but anything used by both the server and
>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other
>> interfaces.
>>
>> I think one way to do that would be to split cassandra-all into
>> cassandra-server and cassandra-common (anything used by both subprojects
>> and server), but not sure if this would be feasible or what it would take.
>>
>> If there's loose agreement this would be a feasible path I'd be happy to
>> create a JIRA to investigate what this would take.
>>
>> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer <droh...@apple.com> wrote:
>> +1 on moving the read/write logic into its own jar.
>>
>> Doug
>>
>> > On Dec 11, 2024, at 7:21 PM, David Capwell <dcapw...@apple.com> wrote:
>> >
>> > From a disk format point of view the only thing I remember was the disk
>> type bug with UDTs.  Bringing that logic back was hard as the type system
>> (in 5.0) tries to avoid allowing construction of invalid states, and we
>> would need to weaken that in order to enable the migration. Assuming the
>> user migrated from 3.x to 4.x then the sstable metadata should have been
>> rewritten to fix this bug.
>> >
>> > One thought (though know its a ton of effort).. we have talked about
>> for a long time about moving the reading/writing logic into its jar (so
>> tools don’t need cassandra-all and can limit the dependencies)… if we did
>> that we could try to solve this as an out of process migration… have the
>> 2.2 reader then write using 6.0 writer (ignoring compact storage… )…
>> >
>> >> On Dec 11, 2024, at 4:59 AM, Benedict <bened...@apache.org> wrote:
>> >>
>> >> I think 3.11 supported upgrade from 2.2, but I haven’t checked. I am
>> fairly sure 4.x supported upgrade from 3.0.x also.
>> >>
>> >>
>> >>> On 11 Dec 2024, at 12:53, Miklosovic, Stefan via dev <
>> dev@cassandra.apache.org> wrote:
>> >>>
>> >>> I see. That makes sense. I think that by 3.x you meant basically the
>> latest 3.11, right? I guess 2.2 -> 3.0 already works, we would just try to
>> support 2.2 -> 3.11 straight away. I need to check where we are at in that
>> area.
>> >>>
>> >>> 
>> >>> From: Benedict mailto:bened...@a

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread David Capwell
> but I still find myself very rarely interacting with ant

I think that is where most people are, as not many actually maintain or modify 
ant… there are so many things that bug me (lack of cache, making sure new 
people use the right version (it was totally fun to learn that 4.1 didn’t 
build with the default ant that got installed when I was helping out a new 
hire… we had to downgrade… woohoo….), hand rolled IDE integration, etc…), but 
I would disagree that ant is a blocker for different jars… we could switch off 
ant for Makefile, or bash, and we would still be able to produce new jars… 
It’s more of… how many people actually feel comfortable enough to alter our 
build system to make such a change?  I don’t…

Our build system does not impact our ability to offer migration from 2.2 to 
5.0… so don’t want to keep distracting this thread…

TL;DR - in-process migration from 2.2 to 5.0 is annoying, as there were 
different bugs in the past that we would have to support again.  Out-of-process 
migration feels far more plausible to me, but feels annoying without splitting 
off our reader/writer… doable… just more annoying...
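
A rough sketch of the out-of-process shape, assuming the reader and writer 
had been split into their own library (Python purely for illustration; 
`OldFormatReader`, `NewFormatWriter`, `Row` and `migrate` are hypothetical 
names, and the in-memory classes are stand-ins for real sstable 
readers/writers):

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Protocol


@dataclass(frozen=True)
class Row:
    """Stand-in for whatever unit the reader hands to the writer."""
    key: str
    value: str


class OldFormatReader(Protocol):
    def rows(self) -> Iterator[Row]: ...


class NewFormatWriter(Protocol):
    def append(self, row: Row) -> None: ...


def migrate(reader: OldFormatReader, writer: NewFormatWriter) -> int:
    """Stream every row from the old format into the new one; return the count."""
    count = 0
    for row in reader.rows():
        writer.append(row)
        count += 1
    return count


@dataclass
class InMemoryReader:
    """Test double playing the role of a 2.2-format reader."""
    data: List[Row]

    def rows(self) -> Iterator[Row]:
        return iter(self.data)


@dataclass
class InMemoryWriter:
    """Test double playing the role of a current-format writer."""
    written: List[Row] = field(default_factory=list)

    def append(self, row: Row) -> None:
        self.written.append(row)


if __name__ == "__main__":
    sink = InMemoryWriter()
    migrated = migrate(InMemoryReader([Row("k1", "v1"), Row("k2", "v2")]), sink)
    print(migrated, len(sink.written))
```

The running server never has to understand the old format in this shape; 
only the standalone tool does.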

> On Dec 12, 2024, at 9:04 AM, Alex Petrov  wrote:
> 
> > I have for a while advocated for a shared lib to also share between Harry, 
> > accord, dtests etc
> 
> Big +1 for a shared lib for our concurrency and test utils. Been intending to 
> start working on this for a while now, but never got to do this so far.
> 
> On Thu, Dec 12, 2024, at 5:58 PM, Benedict wrote:
>> 
>> Why would ant get in the way? We already build multiple jars, and accord 
>> will be a submodule. We have far more organisational issues to overcome than 
>> ant.
>> 
>> I have for a while advocated for a shared lib to also share between Harry, 
>> accord, dtests etc
>> 
>> I am however not 100% sure about splitting read/write path, at least not as 
>> first posited. The idea of maintaining it as an API for dropping in 
>> different jars is a whole other world of potential pain I don’t want to 
>> countenance. Supporting eg bulk readers or writers or other integrations 
>> seems pretty feasible though.
>> 
>> 
>>> On 12 Dec 2024, at 16:53, Paulo Motta  wrote:
>>> 
>>> >  I think that will not happen until we are out of Ant as doing this multi 
>>> > jar / subproject mumbo jumbo is not too much appealing to ... anybody?
>>> 
>>> This is a contentious/controversial topic, but the more I work with gradle 
>>> the more I lean towards ant's simplicity. That said, I'd support moving 
>>> away if it becomes a technical blocker to break up cassandra-all - and if 
>>> this happen I would vote for maven as replacement. :-D
>>> 
>>> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev 
>>> <dev@cassandra.apache.org> wrote:
>>> These are all good ideas but in practical terms I think that will not 
>>> happen until we are out of Ant as doing this multi jar / subproject mumbo 
>>> jumbo is not too much appealing to ... anybody?
>>> 
>>> 
>>> From: Paulo Motta <pa...@apache.org>
>>> Sent: Thursday, December 12, 2024 17:35
>>> To: dev@cassandra.apache.org 
>>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>> 
>>> >  +1 on moving the read/write logic into its own jar.
>>> 
>>> +1, not only read-write logic but anything used by both the server and 
>>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other 
>>> interfaces.
>>> 
>>> I think one way to do that would be to split cassandra-all into 
>>> cassandra-server and cassandra-common (anything used by both subprojects 
>>> and server), but not sure if this would be feasible or what it would take.
>>> 
>>> If there's loose agreement this would be a feasible path I'd be happy to 
>>> create a JIRA to investigate what this would take.
>>> 
>>> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer wrote:
>>> +1 on moving the read/write logic into its own jar.
>>> 
>>> Doug
>>> 
>>> > On Dec 11, 2024, at 7:21 PM, David Capwell wrote:
>>> >
>>> > From a disk format point of view the only thing I remember was the disk 
>>> > type bug with UDTs.  Bringing that logic back was hard as the type system 
>>> > (in 5.0) tries to avoid allowing construction of invalid states, and we 
>>> > would need to weaken that in order to enable the migration. Assuming the 
>>> > user migrated from 3.x to 4.x then the sstable metadata should have been 
>>> > rewritten to fix this bug.
>>> >
>>> > One thought (though know its a ton of effort).. we have talked about for 
>>> > a long time about moving the reading/writing logic into its jar (so tools 
>>> > don’t need cassandra-all and can limit the dependencies)… if we did that 
>>> > we could try to solve this as an 

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Benedict
I don’t think they should be called or treated as the same feature. Or at
least we should rebrand. They also have quite different properties.

I would prefer to introduce “Global Indexes” backed by accord, since this is
also a clearer name, and gives us a clean break from the mess of MVs. We can
decide if we want to also retain eventually consistent views on the data, and
what we want to call them, and if so whether we want to introduce an isolation
parameter at the same time or keep them well demarcated.

On 12 Dec 2024, at 15:04, Paulo Motta  wrote:

> If we don’t intend to begin fixing the feature within the next year or so we
> should deprecate it entirely.

+1 - this is probably topic for another thread but isn’t MVs fundamentally
solved with Accord? In my ignorance this is “just” a matter of adding an
Accord backend to MV syntax to fix it reliably.

On Thu, 12 Dec 2024 at 09:58 Benedict  wrote:

I think alpha is fine. It communicates fairly well that there’s no near term
expectation they will be production capable.

There is (I think) still an intention to improve them, but they are janky. If
we don’t intend to begin fixing the feature within the next year or so we
should deprecate it entirely.

On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:

But MVs are not alpha or preview, as they are not actively being worked on.
They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and
less useful.

On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:

But we also need an approved non-euphemism for features like MVs (I suggest
‘broken’) and possibly a softer version of it ('dangerous') for our existing
features that work fine in some narrow well-defined circumstances but will
blow in your face if you don’t know exactly what you are doing.

Feels like the real answer is:

Endeavor to never get ourselves into this state
Take immediate action if we discover we're there (fix feature if possible,
deprecate and remove if not). Not "leave to fester for years"

I like the introduction of 'alpha' as an alias for 'Preview'; not sure why
that wasn't what we immediately came up with collectively given how
widespread its usage is. :)

What would demoting MV's to 'alpha' right now look like? We'd warn on their
usage w/some different structure and verbiage, and it'd be pretty implicitly
clear to people they shouldn't use it in production right?

It seems to me that the 3 categories would be sufficient even to handle our
current scenario where we have some things in the system that are a Bad Idea
to use in production.

On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:

I don’t like ‘unstable’ either, albeit for a different reason, but I don’t
think three is enough and fits, as we already have some features that don’t
fit into either of (preview,beta,ga) - released but broken, released but
dangerous, deprecated, removed.

For new features going forward, alpha (preview) -> beta -> GA works well
enough.

But we also need an approved non-euphemism for features like MVs (I suggest
‘broken’) and possibly a softer version of it ('dangerous') for our existing
features that work fine in some narrow well-defined circumstances but will
blow in your face if you don’t know exactly what you are doing.

These classifications are largely orthogonal.

Alpha(preview)->Beta->GA communicates readiness of a feature under
development, with GA being the default final state for most features.

From there a feature can transition into ‘broken’ or ‘dangerous’ territory.
Serious issues get uncovered (very) late sometimes. It is what it is.
And we do deprecate and remove functionality when it’s superseded.

-1 on unstable. It's way too many words than are needed. Three is a
magic number and fits:

Preview
Beta
GA

On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:

A structured, disciplined approach to graduating something from [Optional] ->
[Default] makes sense to me, similar to how we're talking about a structured
flow of [Preview] -> [Beta] -> [GA]. Having those clear stages gives us a
framework to define what requirements of stage transitions would be which'll
ideally lead to us producing higher quality, more predictable, more
consistent results for our end users.

For instance, requirements from [Optional] -> [Default] could be higher level
abstractions like:

Confidence in stability
Strong evidence to indicate superiority in majority of workloads (by count or
importance or size, etc)

These are all things we kind of do implicitly and ad-hoc on the mailing list,
and I'm not looking to tie us down to any granular structure or specificity.
More thinking it could be useful for someone that's worked on something who
wonders "Huh. How do I take this from being optional to the default?" and
having an answer better than "reinvent the wheel every time and fling
spaghetti at the dev list and pray".

:)

On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:

Thanks for bringing up this topic, Josh. Outside of the major f

Proposal for OpenTelemetry Tracing Support in Apache Cassandra Java Driver

2024-12-12 Thread Jane H
Hi all,

OpenTelemetry has become the industry standard for telemetry data in
distributed systems. Tracing, in particular, enables developers to
track the full "path" a request takes through the application,
providing deep insights into distributed applications.

I have drafted a proposal to integrate native support for
OpenTelemetry tracing into the Apache Cassandra Java Driver. I would
greatly appreciate your feedback on the proposal!

The proposal: 
https://github.com/SiyaoIsHiding/java-driver/blob/otel-proposal/proposals/open-telemetry/tracing.md
The pull request: https://github.com/apache/cassandra-java-driver/pull/1994

Looking forward to your thoughts!

Best regards,
Jane He


Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Paulo Motta
Thanks for clarifying your view, and sorry for diverting the thread; let’s
get back to the topic. It looks like this warrants its own discussion on the
future of MVs in the era of Accord (whether we still want to provide
eventually consistent MVs in the current format, or remove them in favor of
Accord-backed global indexes).

On Thu, 12 Dec 2024 at 10:10 Benedict  wrote:

> I don’t think they should be called or treated as the same feature. Or at
> least we should rebrand. They also have quite different properties.
>
> I would prefer to introduce “Global Indexes” backed by accord, since this
> is also a clearer name, and gives us a clean break from the mess of MVs. We
> can decide if we want to also retain eventually consistent views on the
> data, and what we want to call them, and if so whether we want to introduce
> an isolation parameter at the same time or keep them well demarcated.
>
> On 12 Dec 2024, at 15:04, Paulo Motta  wrote:
>
> 
>
> > If we don’t intend to begin fixing the feature within the next year or
> so we should deprecate it entirely.
>
> +1 - this is probably topic for another thread but isn’t MVs fundamentally
> solved with Accord? In my ignorance this is “just” a matter of adding an
> Accord backend to MV syntax to fix it reliably.
>
> On Thu, 12 Dec 2024 at 09:58 Benedict  wrote:
>
>> I think alpha is fine. It communicates fairly well that there’s no near
>> term expectation they will be production capable.
>>
>> There is (I think) still an intention to improve them, but they are
>> janky. If we don’t intend to begin fixing the feature within the next year
>> or so we should deprecate it entirely.
>>
>>
>> On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:
>>
>> But MVs are not alpha or preview, as they are not actively being worked
>> on. They are currently broken. Calling them ‘alpha’ makes ‘alpha’
>> overloaded and less useful.
>>
>>
>>
>> On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
>>
>> But we also need an approved non-euphemism for features like MVs (I
>> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
>> our existing features that work fine in some narrow well-defined
>> circumstances but will blow in your face if you don’t know exactly what you
>> are doing.
>>
>> Feels like the real answer is:
>>
>>1. Endeavor to never get ourselves into this state
>>2. Take immediate action if we discover we're there (fix feature if
>>possible, deprecate and remove if not). Not "leave to fester for years"
>>
>> I like the introduction of 'alpha' as an alias for 'Preview'; not sure
>> why that wasn't what we immediately came up with collectively given how
>> widespread its usage is. :)
>>
>> What would demoting MV's to 'alpha' right now look like? We'd warn on
>> their usage w/some different structure and verbiage, and it'd be pretty
>> implicitly clear to people they shouldn't use it in production right?
>>
>> It seems to me that the 3 categories would be sufficient even to handle
>> our current scenario where we have some things in the system that are a Bad
>> Idea to use in production.
>>
>> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>>
>> I don’t like ‘unstable’ either, albeit for a different reason, but I
>> don’t think three is enough and fits, as we already have some features that
>> don’t fit into either of (preview,beta,ga) - released but broken, released
>> but dangerous, deprecated, removed.
>>
>> For new features going forward, alpha (preview) -> beta -> GA works well
>> enough.
>>
>> But we also need an approved non-euphemism for features like MVs (I
>> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
>> our existing features that work fine in some narrow well-defined
>> circumstances but will blow in your face if you don’t know exactly what you
>> are doing.
>>
>> These classifications are largely orthogonal.
>>
>> Alpha(preview)->Beta->GA communicates readiness of a feature under
>> development, with GA being the default final state for most features.
>>
>> From there a feature can transition into ‘broken’ or ‘dangerous’
>> territory. Serious issues get uncovered (very) late sometimes. It is what
>> it is.
>> And we do deprecate and remove functionality when it’s superseded.
>>
>>
>> -1 on unstable. It's way too many words than are needed. Three is a
>> magic number and fits:
>>
>> Preview
>> Beta
>> GA
>>
>>
>> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>>
>> A structured, disciplined approach to graduating something from
>> [Optional] -> [Default] makes sense to me, similar to how we're talking
>> about a structured flow of [Preview] -> [Beta] -> [GA]. Having those clear
>> stages gives us a framework to define what requirements of stage
>> transitions would be which'll ideally lead to us producing higher quality,
>> more predictable, more consistent results for our end users.
>>
>> For instance, requirements from [Optional] -> [Default] could be higher
>> level abstractions like:
>>
>>

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Brandon Williams
I think it would be better to avoid 'alpha' since we do beta releases,
and I agree with Aleksey that we'd be overloading 'alpha' and perhaps
causing confusion.

Kind Regards,
Brandon

On Thu, Dec 12, 2024 at 8:58 AM Benedict  wrote:
>
> I think alpha is fine. It communicates fairly well that there’s no near term 
> expectation they will be production capable.
>
> There is (I think) still an intention to improve them, but they are janky. If 
> we don’t intend to begin fixing the feature within the next year or so we 
> should deprecate it entirely.
>
>
> On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:
>
> But MVs are not alpha or preview, as they are not actively being worked on. 
> They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and 
> less useful.
>
> On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
>
> But we also need an approved non-euphemism for features like MVs (I suggest 
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
> features that work fine in some narrow well-defined circumstances but will 
> blow in your face if you don’t know exactly what you are doing.
>
> Feels like the real answer is:
>
> Endeavor to never get ourselves into this state
> Take immediate action if we discover we're there (fix feature if possible, 
> deprecate and remove if not). Not "leave to fester for years"
>
> I like the introduction of 'alpha' as an alias for 'Preview'; not sure why 
> that wasn't what we immediately came up with collectively given how 
> widespread its usage is. :)
>
> What would demoting MV's to 'alpha' right now look like? We'd warn on their 
> usage w/some different structure and verbiage, and it'd be pretty implicitly 
> clear to people they shouldn't use it in production right?
>
> It seems to me that the 3 categories would be sufficient even to handle our 
> current scenario where we have some things in the system that are a Bad Idea 
> to use in production.
>
> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>
> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
> think three is enough and fits, as we already have some features that don’t 
> fit into either of (preview,beta,ga) - released but broken, released but 
> dangerous, deprecated, removed.
>
> For new features going forward, alpha (preview) -> beta -> GA works well 
> enough.
>
> But we also need an approved non-euphemism for features like MVs (I suggest 
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
> features that work fine in some narrow well-defined circumstances but will 
> blow in your face if you don’t know exactly what you are doing.
>
> These classifications are largely orthogonal.
>
> Alpha(preview)->Beta->GA communicates readiness of a feature under 
> development, with GA being the default final state for most features.
>
> From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
> Serious issues get uncovered (very) late sometimes. It is what it is.
> And we do deprecate and remove functionality when it’s superseded.
>
>
> -1 on unstable. It's way too many words than are needed. Three is a
> magic number and fits:
>
> Preview
> Beta
> GA
>
>
> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>
> A structured, disciplined approach to graduating something from [Optional] -> 
> [Default] makes sense to me, similar to how we're talking about a structured 
> flow of [Preview] -> [Beta] -> [GA]. Having those clear stages gives us a 
> framework to define what requirements of stage transitions would be which'll 
> ideally lead to us producing higher quality, more predictable, more 
> consistent results for our end users.
>
> For instance, requirements from [Optional] -> [Default] could be higher level 
> abstractions like:
>
> Confidence in stability
> Strong evidence to indicate superiority in majority of workloads (by count or 
> importance or size, etc)
>
> These are all things we kind of do implicitly and ad-hoc on the mailing list, 
> and I'm not looking to tie us down to any granular structure or specificity. 
> More thinking it could be useful for someone that's worked on something who 
> wonders "Huh. How do I take this from being optional to the default?" and 
> having an answer better than "reinvent the wheel every time and fling 
> spaghetti at the dev list and pray".
>
> :)
>
>
> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>
> Thanks for bringing up this topic, Josh.
>
> Outside of the major features (ie. MV/SAI/TCM/Accord), one related discussion 
> in this topic is: how can we "promote" small improvements in existing 
> features from optional to default ?
>
> It makes sense to have optimizations launched behind a feature flag initially 
> (beta phase) while the improvement gets real world exposure, but I think we 
> need a better way to promote these optimizations to default behavior on a 
> regular cadence.
>
> Take for example optimized repairs from CASSAND

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Josh McKenzie
> But MVs are not alpha or preview, as they are not actively being worked on. 
> They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and 
> less useful.
I'm asserting they should either be marked 'alpha/Preview' and actively being 
worked on, or be deprecated and removed.

To Paulo's point, Jaydeep's working on base<->view reconciliation which, to me, 
would indicate they're alpha. And until we have a solution for that 
reconciliation in the face of lost data in the base table, no global index or 
MV implementation will be ready for production.
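
To make the two positions in this thread concrete, here is a minimal sketch of how they could be modelled together. It is only an illustration under stated assumptions: the names `Readiness`, `Health`, `FeatureStatus`, and `should_deprecate` are hypothetical and do not exist in Cassandra, and the second axis ('dangerous', 'broken') is Aleksey's suggestion rather than anything agreed on.

```python
from dataclasses import dataclass
from enum import Enum


class Readiness(Enum):
    """Maturity track for a feature under development."""
    PREVIEW = "preview"  # called 'alpha' by some in this thread
    BETA = "beta"
    GA = "ga"


class Health(Enum):
    """Hypothetical second, orthogonal axis for problems found after release."""
    OK = "ok"
    DANGEROUS = "dangerous"
    BROKEN = "broken"
    DEPRECATED = "deprecated"


@dataclass(frozen=True)
class FeatureStatus:
    name: str
    readiness: Readiness
    health: Health = Health.OK
    actively_worked_on: bool = True

    def production_ready(self) -> bool:
        # Only a GA feature with no known health problem counts as production ready.
        return self.readiness is Readiness.GA and self.health is Health.OK

    def should_deprecate(self) -> bool:
        # Rule argued above: not production ready and nobody working on it
        # means deprecate and remove rather than leave it in place.
        return not self.production_ready() and not self.actively_worked_on


# Illustrative values only, not a statement about any real feature.
example = FeatureStatus("some_feature", Readiness.PREVIEW, Health.BROKEN,
                        actively_worked_on=False)
print(example.should_deprecate())
```

Keeping the two axes separate lets a feature stay on the Preview -> Beta -> GA track while still carrying a warning label, which is the distinction the 'alpha' versus 'broken' disagreement turns on.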

On Thu, Dec 12, 2024, at 10:40 AM, Brandon Williams wrote:
> I think it would be better to avoid 'alpha' since we do beta releases,
> and I agree with Aleksey that we'd be overloading 'alpha' and perhaps
> causing confusion.
> 
> Kind Regards,
> Brandon
> 
> On Thu, Dec 12, 2024 at 8:58 AM Benedict  wrote:
> >
> > I think alpha is fine. It communicates fairly well that there’s no near 
> > term expectation they will be production capable.
> >
> > There is (I think) still an intention to improve them, but they are janky. 
> > If we don’t intend to begin fixing the feature within the next year or so 
> > we should deprecate it entirely.
> >
> >
> > On 12 Dec 2024, at 14:46, Aleksey Yeshchenko  wrote:
> >
> > But MVs are not alpha or preview, as they are not actively being worked 
> > on. They are currently broken. Calling them ‘alpha’ makes ‘alpha’ 
> > overloaded and less useful.
> >
> > On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
> >
> > But we also need an approved non-euphemism for features like MVs (I suggest 
> > ‘broken’) and possibly a softer version of it ('dangerous') for our 
> > existing features that work fine in some narrow well-defined circumstances 
> > but will blow in your face if you don’t know exactly what you are doing.
> >
> > Feels like the real answer is:
> >
> > Endeavor to never get ourselves into this state
> > Take immediate action if we discover we're there (fix feature if possible, 
> > deprecate and remove if not). Not "leave to fester for years"
> >
> > I like the introduction of 'alpha' as an alias for 'Preview'; not sure why 
> > that wasn't what we immediately came up with collectively given how 
> > widespread its usage is. :)
> >
> > What would demoting MV's to 'alpha' right now look like? We'd warn on their 
> > usage w/some different structure and verbiage, and it'd be pretty 
> > implicitly clear to people they shouldn't use it in production right?
> >
> > It seems to me that the 3 categories would be sufficient even to handle our 
> > current scenario where we have some things in the system that are a Bad 
> > Idea to use in production.
> >
> > On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
> >
> > I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
> > think three is enough and fits, as we already have some features that don’t 
> > fit into either of (preview,beta,ga) - released but broken, released but 
> > dangerous, deprecated, removed.
> >
> > For new features going forward, alpha (preview) -> beta -> GA works well 
> > enough.
> >
> > But we also need an approved non-euphemism for features like MVs (I suggest 
> > ‘broken’) and possibly a softer version of it ('dangerous') for our 
> > existing features that work fine in some narrow well-defined circumstances 
> > but will blow in your face if you don’t know exactly what you are doing.
> >
> > These classifications are largely orthogonal.
> >
> > Alpha(preview)->Beta->GA communicates readiness of a feature under 
> > development, with GA being the default final state for most features.
> >
> > From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
> > Serious issues get uncovered (very) late sometimes. It is what it is.
> > And we do deprecate and remove functionality when it’s superseded.
> >
> >
> > -1 on unstable. It's way too many words than are needed. Three is a
> > magic number and fits:
> >
> > Preview
> > Beta
> > GA
> >
> >
> > On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
> >
> > A structured, disciplined approach to graduating something from [Optional] 
> > -> [Default] makes sense to me, similar to how we're talking about a 
> > structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages 
> > gives us a framework to define what requirements of stage transitions would 
> > be which'll ideally lead to us producing higher quality, more predictable, 
> > more consistent results for our end users.
> >
> > For instance, requirements from [Optional] -> [Default] could be higher 
> > level abstractions like:
> >
> > Confidence in stability
> > Strong evidence to indicate superiority in majority of workloads (by count 
> > or importance or size, etc)
> >
> > These are all things we kind of do implicitly and ad-hoc on the mailing 
> > list, and I'm not looking to tie us down to any granular structure or 
> > specificity. More thinking it could be useful for someone tha

Re: [DISCUSS] Deprecation of IEndpointSnitch (CASSANDRA-19488)

2024-12-12 Thread guo Maxwell
Hi Sam,
I can help with the validation of AlibabaCloudSnitch.

On Thu, 12 Dec 2024 at 9:20 PM, Sam Tunnicliffe wrote:

> This patch is probably now ready to merge, having been through several
> iterations of review and with green CI. Before that though, I just want to
> send one more reminder about it. We've endeavoured to preserve all existing
> behaviour and to keep configuration 100% backwards compatible. However,
> some areas have had minimal testing in real clusters, specifically the
> various cloud platform configurations:
>
> * Ec2Snitch/Ec2MultiRegionSnitch
> * AzureSnitch
> * AlibabaCloudSnitch
> * GoogleCloudSnitch
> * CloudstackSnitch
>
> Any help in validating these in their native environments would be welcome.
>
> The other consideration is toward custom snitch implementations. The
> intention is that these should continue to work without interruption or
> intervention, unless they're leaning heavily on C* internals in which case
> any changes required ought to be minimal. So it would be great if anyone
> using a custom snitch implementation is able to check it out and help
> verify that.
>
>
> > On 31 Oct 2024, at 16:53, Sam Tunnicliffe  wrote:
> >
> > Since CEP-21, the source of truth for topology info (a node's datacenter
> > & rack) is ClusterMetadata. Each node provides its dc/rack when it
> > registers itself with the cluster prior to joining and this information is
> > effectively immutable (for now). This significantly reduces the scope of
> > IEndpointSnitch's responsibilities and CASSANDRA-19488 proposes a
> > refactoring which breaks out the remaining functionality into a handful of
> > new providers (full details can be found in the JIRA).
> >
> > This is one of the more widely used extension points in Cassandra, so we
> > wanted to bring it to the mailing list in addition to discussing on JIRA.
> >
> > To be clear, no operator intervention should be necessary when
> > upgrading. To ease migration onto the new config and to allow us to
> > deprecate snitches in a controlled way, it will remain fully supported to
> > configure nodes using the endpoint_snitch setting in yaml. A SnitchAdapter
> > acts as a facade in this case, presenting the new interfaces to calling
> > code while delegating to the legacy snitch. Most of the in-tree snitches
> > have been refactored to extract implementations of the new interfaces so
> > that their functionality can be used via the new configuration.
> >
> > Some questions for the list:
> >
> > * We have added 2 new methods to IEndpointSnitch, which have essentially
> > been pulled up from Ec2MultiRegionSnitch and GossipingPropertyFileSnitch to
> > support ReconnectableSnitchHelper. Currently, these are added as default
> > methods on the interface so that out-of-tree snitches remain binary
> > compatible. However, it would be safer to break binary compatibility in
> > this case to ensure that any custom snitches out in the wild must be
> > updated and their behaviour is preserved. So the question is, would there
> > be objections to extending the (now deprecated) IEndpointSnitch interface
> > in this way?
> >
> > * Python dtests and config are currently unchanged (aside from some
> > error message checks) so these are exercising the path whereby the clusters
> > are configured with endpoint_snitch and make use of the compatibility
> > adapter. In-jvm upgrade dtests switch from old to new style configuration
> > on upgrade to 5.1 (though in truth, these don't exercise snitches much at
> > all as a special dtest snitch is used throughout). cassandra-latest.yaml
> > contains the new settings, while cassandra.yaml and the variations in
> > test/conf retain the old style settings. How should we approach updating
> > these configs so that we maintain a balance between test coverage,
> > compatibility during upgrades and encouraging the use of new style config
> > in fresh clusters?
> >
>
>
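
The facade arrangement Sam describes (a SnitchAdapter presenting the new interfaces while delegating to the legacy snitch) can be sketched roughly as follows. This is a minimal illustration in Python, not the CASSANDRA-19488 code: `LegacySnitch`, `LocationProvider`, `StaticSnitch`, and all method names are hypothetical stand-ins, and only the name SnitchAdapter comes from the message above.

```python
from typing import Protocol, Tuple


class LegacySnitch(Protocol):
    """Hypothetical old-style snitch: one object answers every topology question."""

    def get_datacenter(self, endpoint: str) -> str: ...

    def get_rack(self, endpoint: str) -> str: ...


class LocationProvider(Protocol):
    """Hypothetical narrow interface that new calling code depends on."""

    def location(self, endpoint: str) -> Tuple[str, str]: ...


class SnitchAdapter:
    """Presents the new interface while delegating every call to a legacy snitch."""

    def __init__(self, legacy: LegacySnitch) -> None:
        self._legacy = legacy

    def location(self, endpoint: str) -> Tuple[str, str]:
        # No logic of its own: the adapter only reshapes the legacy answers.
        return (self._legacy.get_datacenter(endpoint),
                self._legacy.get_rack(endpoint))


class StaticSnitch:
    """Toy custom snitch, written against the old interface only."""

    def get_datacenter(self, endpoint: str) -> str:
        return "dc1"

    def get_rack(self, endpoint: str) -> str:
        return "rack1"


provider: LocationProvider = SnitchAdapter(StaticSnitch())
print(provider.location("10.0.0.1"))
```

The point of the pattern is that a custom snitch such as the toy `StaticSnitch` needs no changes, which matches the stated goal that existing configurations keep working without operator intervention.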


Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Aleksey Yeshchenko
But MVs are not alpha or preview, as they are not actively being worked on. 
They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and 
less useful.

> On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
> 
>> But we also need an approved non-euphemism for features like MVs (I suggest 
>> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
>> features that work fine in some narrow well-defined circumstances but will 
>> blow in your face if you don’t know exactly what you are doing.
> Feels like the real answer is:
> Endeavor to never get ourselves into this state
> Take immediate action if we discover we're there (fix feature if possible, 
> deprecate and remove if not). Not "leave to fester for years"
> I like the introduction of 'alpha' as an alias for 'Preview'; not sure why 
> that wasn't what we immediately came up with collectively given how 
> widespread its usage is. :)
> 
> What would demoting MV's to 'alpha' right now look like? We'd warn on their 
> usage w/some different structure and verbiage, and it'd be pretty implicitly 
> clear to people they shouldn't use it in production right?
> 
> It seems to me that the 3 categories would be sufficient even to handle our 
> current scenario where we have some things in the system that are a Bad Idea 
> to use in production.
> 
> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t 
>> think three is enough and fits, as we already have some features that don’t 
>> fit into either of (preview,beta,ga) - released but broken, released but 
>> dangerous, deprecated, removed.
>> 
>> For new features going forward, alpha (preview) -> beta -> GA works well 
>> enough.
>> 
>> But we also need an approved non-euphemism for features like MVs (I suggest 
>> ‘broken’) and possibly a softer version of it ('dangerous') for our existing 
>> features that work fine in some narrow well-defined circumstances but will 
>> blow in your face if you don’t know exactly what you are doing.
>> 
>> These classifications are largely orthogonal.
>> 
>> Alpha(preview)->Beta->GA communicates readiness of a feature under 
>> development, with GA being the default final state for most features.
>> 
>> From there a feature can transition into ‘broken’ or ‘dangerous’ territory. 
>> Serious issues get uncovered (very) late sometimes. It is what it is.
>> And we do deprecate and remove functionality when it’s superseded.
>> 
>> 
>>> -1 on unstable. It's way too many words than are needed. Three is a
>>> magic number and fits:
>>> 
>>> Preview
>>> Beta
>>> GA
>> 
>>> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>>> 
>>> A structured, disciplined approach to graduating something from [Optional] 
>>> -> [Default] makes sense to me, similar to how we're talking about a 
>>> structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages 
>>> gives us a framework to define what requirements of stage transitions would 
>>> be which'll ideally lead to us producing higher quality, more predictable, 
>>> more consistent results for our end users.
>>> 
>>> For instance, requirements from [Optional] -> [Default] could be higher 
>>> level abstractions like:
>>> Confidence in stability
>>> Strong evidence to indicate superiority in majority of workloads (by count 
>>> or importance or size, etc)
>>> These are all things we kind of do implicitly and ad-hoc on the mailing 
>>> list, and I'm not looking to tie us down to any granular structure or 
>>> specificity. More thinking it could be useful for someone that's worked on 
>>> something who wonders "Huh. How do I take this from being optional to the 
>>> default?" and having an answer better than "reinvent the wheel every time 
>>> and fling spaghetti at the dev list and pray".
>>> 
>>> :)
>>> 
>>> 
>>> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>>>> Thanks for bringing up this topic, Josh.
>>>>
>>>> Outside of the major features (ie. MV/SAI/TCM/Accord), one related
>>>> discussion in this topic is: how can we "promote" small improvements in
>>>> existing features from optional to default ?
>>>>
>>>> It makes sense to have optimizations launched behind a feature flag
>>>> initially (beta phase) while the improvement gets real world exposure, but
>>>> I think we need a better way to promote these optimizations to default
>>>> behavior on a regular cadence.
>>>>
>>>> Take for example optimized repairs from CASSANDRA-16274. It was launched
>>>> in 4.x as an optional feature gated behind a flag, ie.
>>>> auto_optimise_full_repair_streams: false.
>>>>
>>>> I could be easily missing something, but is there a world where
>>>> non-optimized repairs make sense once this optimization is proven to work
>>>> ? I agree this is fine while the feature is maturing, but at some point we
>>>> need to rip the bandaid and make the optimization default (and clearly
>>>> communicate that). This would allow

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Paulo Motta
> But MVs are not alpha or preview, as they are not actively being worked on.

fwiw I think Jaydeep and Runtian are looking into improving MV status quo
according to
https://lists.apache.org/thread/d3qo3vjxn4116htf175yzcg94s6jq07d

On Thu, 12 Dec 2024 at 09:45 Aleksey Yeshchenko  wrote:

> But MVs are not alpha or preview, as they are not actively being worked
> on. They are currently broken. Calling them ‘alpha’ makes ‘alpha’
> overloaded and less useful.
>
>
> On 12 Dec 2024, at 14:00, Josh McKenzie  wrote:
>
> But we also need an approved non-euphemism for features like MVs (I
> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
> our existing features that work fine in some narrow well-defined
> circumstances but will blow in your face if you don’t know exactly what you
> are doing.
>
> Feels like the real answer is:
>
> 1. Endeavor to never get ourselves into this state
> 2. Take immediate action if we discover we're there (fix feature if
>    possible, deprecate and remove if not). Not "leave to fester for years"
>
> I like the introduction of 'alpha' as an alias for 'Preview'; not sure why
> that wasn't what we immediately came up with collectively given how
> widespread its usage is. :)
>
> What would demoting MV's to 'alpha' right now look like? We'd warn on
> their usage w/some different structure and verbiage, and it'd be pretty
> implicitly clear to people they shouldn't use it in production right?
>
> It seems to me that the 3 categories would be sufficient even to handle
> our current scenario where we have some things in the system that are a Bad
> Idea to use in production.
>
> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>
> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t
> think three is enough and fits, as we already have some features that don’t
> fit into either of (preview,beta,ga) - released but broken, released but
> dangerous, deprecated, removed.
>
> For new features going forward, alpha (preview) -> beta -> GA works well
> enough.
>
> But we also need an approved non-euphemism for features like MVs (I
> suggest ‘broken’) and possibly a softer version of it ('dangerous') for
> our existing features that work fine in some narrow well-defined
> circumstances but will blow in your face if you don’t know exactly what you
> are doing.
>
> These classifications are largely orthogonal.
>
> Alpha(preview)->Beta->GA communicates readiness of a feature under
> development, with GA being the default final state for most features.
>
> From there a feature can transition into ‘broken’ or ‘dangerous’
> territory. Serious issues get uncovered (very) late sometimes. It is what
> it is.
> And we do deprecate and remove functionality when it’s superseded.
>
>
> -1 on unstable. It's way too many words than are needed. Three is a
> magic number and fits:
>
> Preview
> Beta
> GA
>
>
> On 11 Dec 2024, at 18:50, Josh McKenzie  wrote:
>
> A structured, disciplined approach to graduating something from [Optional]
> -> [Default] makes sense to me, similar to how we're talking about a
> structured flow of [Preview] -> [Beta] -> [GA]. Having those clear stages
> gives us a framework to define what requirements of stage transitions would
> be which'll ideally lead to us producing higher quality, more predictable,
> more consistent results for our end users.
>
> For instance, requirements from [Optional] -> [Default] could be higher
> level abstractions like:
>
> - Confidence in stability
> - Strong evidence to indicate superiority in majority of workloads (by
>   count or importance or size, etc)
>
> These are all things we kind of do implicitly and ad-hoc on the mailing
> list, and I'm not looking to tie us down to any granular structure or
> specificity. More thinking it could be useful for someone that's worked on
> something who wonders "Huh. How do I take this from being optional to the
> default?" and having an answer better than "reinvent the wheel every time
> and fling spaghetti at the dev list and pray".
>
> :)
>
>
> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>
> Thanks for bringing up this topic, Josh.
>
> Outside of the major features (ie. MV/SAI/TCM/Accord), one related
> discussion in this topic is: how can we "promote" small improvements in
> existing features from optional to default ?
>
> It makes sense to have optimizations launched behind a feature flag
> initially (beta phase) while the improvement gets real world exposure, but
> I think we need a better way to promote these optimizations to default
> behavior on a regular cadence.
>
> Take for example optimized repairs from CASSANDRA-16274. It was launched
> in 4.x as an optional feature gated behind a flag,
> ie. auto_optimise_full_repair_streams: false.
>
> I could be easily missing something, but is there a world where
> non-optimized repairs make sense once this optimization is proven to work ?
> I agree this is fine while the feature is maturing, bu

Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

2024-12-12 Thread Benedict
I think alpha is fine. It communicates fairly well that there’s no near term
expectation they will be production capable.

There is (I think) still an intention to improve them, but they are janky. If
we don’t intend to begin fixing the feature within the next year or so we
should deprecate it entirely.

On 12 Dec 2024, at 14:46, Aleksey Yeshchenko wrote:

> But MVs are not alpha or preview, as they are not actively being worked on.
> They are currently broken. Calling them ‘alpha’ makes ‘alpha’ overloaded and
> less useful.
>
> On 12 Dec 2024, at 14:00, Josh McKenzie wrote:
>
> But we also need an approved non-euphemism for features like MVs (I suggest
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing
> features that work fine in some narrow well-defined circumstances but will
> blow in your face if you don’t know exactly what you are doing.
>
> Feels like the real answer is:
>
> Endeavor to never get ourselves into this state
> Take immediate action if we discover we're there (fix feature if possible,
> deprecate and remove if not). Not "leave to fester for years"
>
> I like the introduction of 'alpha' as an alias for 'Preview'; not sure why
> that wasn't what we immediately came up with collectively given how
> widespread its usage is. :)
>
> What would demoting MV's to 'alpha' right now look like? We'd warn on their
> usage w/some different structure and verbiage, and it'd be pretty implicitly
> clear to people they shouldn't use it in production right?
>
> It seems to me that the 3 categories would be sufficient even to handle our
> current scenario where we have some things in the system that are a Bad Idea
> to use in production.
>
> On Thu, Dec 12, 2024, at 6:06 AM, Aleksey Yeshchenko wrote:
>
> I don’t like ‘unstable’ either, albeit for a different reason, but I don’t
> think three is enough and fits, as we already have some features that don’t
> fit into either of (preview,beta,ga) - released but broken, released but
> dangerous, deprecated, removed.
>
> For new features going forward, alpha (preview) -> beta -> GA works well
> enough.
>
> But we also need an approved non-euphemism for features like MVs (I suggest
> ‘broken’) and possibly a softer version of it ('dangerous') for our existing
> features that work fine in some narrow well-defined circumstances but will
> blow in your face if you don’t know exactly what you are doing.
>
> These classifications are largely orthogonal.
>
> Alpha(preview)->Beta->GA communicates readiness of a feature under
> development, with GA being the default final state for most features.
>
> From there a feature can transition into ‘broken’ or ‘dangerous’ territory.
> Serious issues get uncovered (very) late sometimes. It is what it is.
> And we do deprecate and remove functionality when it’s superseded.
>
> -1 on unstable. It's way too many words than are needed. Three is a
> magic number and fits:
>
> Preview
> Beta
> GA
>
> On 11 Dec 2024, at 18:50, Josh McKenzie wrote:
>
> A structured, disciplined approach to graduating something from [Optional] ->
> [Default] makes sense to me, similar to how we're talking about a structured
> flow of [Preview] -> [Beta] -> [GA]. Having those clear stages gives us a
> framework to define what requirements of stage transitions would be which'll
> ideally lead to us producing higher quality, more predictable, more
> consistent results for our end users.
>
> For instance, requirements from [Optional] -> [Default] could be higher level
> abstractions like:
>
> Confidence in stability
> Strong evidence to indicate superiority in majority of workloads (by count or
> importance or size, etc)
>
> These are all things we kind of do implicitly and ad-hoc on the mailing list,
> and I'm not looking to tie us down to any granular structure or specificity.
> More thinking it could be useful for someone that's worked on something who
> wonders "Huh. How do I take this from being optional to the default?" and
> having an answer better than "reinvent the wheel every time and fling
> spaghetti at the dev list and pray".
>
> :)
>
> On Wed, Dec 11, 2024, at 1:04 PM, Paulo Motta wrote:
>
> Thanks for bringing up this topic, Josh.
>
> Outside of the major features (ie. MV/SAI/TCM/Accord), one related discussion
> in this topic is: how can we "promote" small improvements in existing
> features from optional to default ?
>
> It makes sense to have optimizations launched behind a feature flag initially
> (beta phase) while the improvement gets real world exposure, but I think we
> need a better way to promote these optimizations to default behavior on a
> regular cadence.
>
> Take for example optimized repairs from CASSANDRA-16274. It was launched in
> 4.x as an optional feature gated behind a flag, ie.
> auto_optimise_full_repair_streams: false.
>
> I could be easily missing something, but is there a world where
> non-optimized repairs make sense once this optimization is proven to work ?
> I agree this is fine while the feature is maturing, but at some point we need
> to rip the bandaid and make the optimization default (and clearly communicate
> that). This would allow cleanup code toil of default behavior that is no
> longer being used, because everyone is ena

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Jeremiah Jordan
> TL;DR - in progress migration off 2.2 to 5.0 is annoying as there were
> different bugs in the past we have to support again.  Out of process
> migration to me feels far more plausible, but feels annoying without
> splitting off our reader/writer… doable… just more annoying…

This is your main blocker for the 5.0/trunk code converting 2.2 sstables
correctly.  You would have to bring back a bunch of code paths special
casing those versions and dealing with bugs in them.  If you are actually
interested in figuring out an offline 2.2 to 5.0 path I would recommend you
do it in two steps: offline 2.2 to 4.1? (or whatever the latest version is
that can sstableupgrade a 2.2 sstable), and then offline the result of that
with 5.0.  But at that point maybe you just do it online in those two steps.
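
The two-step idea generalises to finding a chain of intermediate versions, each able to upgrade sstables written by the previous one. A minimal sketch follows, with loud assumptions: the compatibility table is purely illustrative (the "4.1" entry only mirrors the unverified guess above), and `CAN_UPGRADE_FROM` and `plan_upgrade` are hypothetical names, not part of any Cassandra tool.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# ASSUMPTION: illustrative table only. For each target version, the set of
# versions whose sstables its offline upgrade tooling is assumed to read.
# Check the real compatibility matrix before relying on any entry here.
CAN_UPGRADE_FROM: Dict[str, Set[str]] = {
    "4.1": {"2.2"},  # the "4.1?" guess from the message above, unverified
    "5.0": {"4.1"},
}


def plan_upgrade(source: str, target: str,
                 table: Dict[str, Set[str]] = CAN_UPGRADE_FROM
                 ) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of offline upgrades."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == target:
            return path
        for candidate, readable in table.items():
            # A hop is possible when the candidate can read the current format.
            if current in readable and candidate not in seen:
                seen.add(candidate)
                queue.append(path + [candidate])
    return None  # no chain of supported upgrades exists


print(plan_upgrade("2.2", "5.0"))
```

Each hop in the returned list would correspond to one offline pass over the data with that version's upgrade tooling; as noted above, once two passes are needed it may be just as easy to do the same two steps online.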


On Dec 12, 2024 at 11:34:31 AM, Štefan Miklošovič 
wrote:

> I think it does not make a lot of sense to get away from Ant unless we
> split it into more jars. Splitting it into more jars while moving away from
> Ant at the same time is just too much work. So, what is the point of having
> monolithic cassandra-all in Gradle / Maven? Smoother release? We mastered
> that already. We are releasing, aren't we? Dependencies? That is working
> too. Sure cache and all the "hacks" would go away overnight but otherwise
> ... I think modularising it first so it is easier to reuse and so on is
> more important.
>
> On Thu, Dec 12, 2024 at 6:21 PM David Capwell  wrote:
>
>> but I still find myself very rarely interacting with ant
>>
>>
>> I think that is where most people are as not many actually maintain or
>> modify ant… there are so many things that bug me (lack of cache, making
>> sure new people use the right version (was totally fun to learn 4.1 didn’t
>> build with the default ant that got installed when I was helping out a new
>> hire… we had to downgrade… woohoo….), hand rolled IDE integration, etc…),
>> but I would disagree that ant is a blocker for different jars… we could
>> switch off ant for Makefile, or bash, and we would still be able to produce
>> new jars… It's more of… how many people actually feel comfortable enough to
>> alter our build system to make such a change?  I don’t…
>>
>> Our build system does not impact our ability to offer migration from 2.2
>> to 5.0… so don’t want to keep distracting this thread…
>>
>> TL;DR - in progress migration off 2.2 to 5.0 is annoying as there were
>> different bugs in the past we have to support again.  Out of process
>> migration to me feels far more plausible, but feels annoying without
>> splitting off our reader/writer… doable… just more annoying...
>>
>> On Dec 12, 2024, at 9:04 AM, Alex Petrov  wrote:
>>
>> > I have for a while advocated for a shared lib to also share between
>> Harry, accord, dtests etc
>>
>> Big +1 for a shared lib for our concurrency and test utils. Been
>> intending to start working on this for a while now, but never got to do
>> this so far.
>>
>> On Thu, Dec 12, 2024, at 5:58 PM, Benedict wrote:
>>
>>
>> Why would ant get in the way? We already build multiple jars, and accord
>> will be a submodule. We have far more organisational issues to overcome
>> than ant.
>>
>> I have for a while advocated for a shared lib to also share between
>> Harry, accord, dtests etc
>>
>> I am however not 100% sure about splitting read/write path, at least not
>> as first posited. The idea of maintaining it as an API for dropping in
>> different jars is a whole other world of potential pain I don’t want to
>> countenance. Supporting eg bulk readers or writers or other integrations
>> seems pretty feasible though.
>>
>>
>> On 12 Dec 2024, at 16:53, Paulo Motta  wrote:
>>
>> 
>> >  I think that will not happen until we are out of Ant as doing this
>> multi jar / subproject mumbo jumbo is not too much appealing to ... anybody?
>>
>> This is a contentious/controversial topic, but the more I work with
>> gradle the more I lean towards ant's simplicity. That said, I'd support
>> moving away if it becomes a technical blocker to break up cassandra-all -
>> and if this happen I would vote for maven as replacement. :-D
>>
>> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev <
>> dev@cassandra.apache.org> wrote:
>>
>> These are all good ideas but in practical terms I think that will not
>> happen until we are out of Ant as doing this multi jar / subproject mumbo
>> jumbo is not very appealing to ... anybody?
>>
>> 
>> From: Paulo Motta 
>> Sent: Thursday, December 12, 2024 17:35
>> To: dev@cassandra.apache.org
>> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>>
>>
>> >  +1 on moving the read/write logic into its own jar.
>>
>> +1, not only read-write logic but anything used by both the server and
>> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other
>> interfaces.
>>
>> I think one way to do that would be to split cassandra-a

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Štefan Miklošovič
I think it does not make a lot of sense to move away from Ant unless we
also split the build into more jars. But splitting it into more jars while
moving away from Ant at the same time is just too much work. So what is the
point of having a monolithic cassandra-all in Gradle / Maven? Smoother
releases? We have mastered that already. We are releasing, aren't we?
Dependencies? That is working too. Sure, the cache issue and all the "hacks"
would go away overnight, but otherwise ... I think modularising it first, so
it is easier to reuse and so on, is more important.

On Thu, Dec 12, 2024 at 6:21 PM David Capwell  wrote:

> but I still find myself very rarely interacting with ant
>
>
> I think that is where most people are as not many actually maintain or
> modify ant… there are so many things that bug me (lack of cache, making
> sure new people use the right version (was totally fun to learn 4.1 didn’t
> build with the default ant that got installed when I was helping out a new
> hire… we had to downgrade… woohoo….), hand rolled IDE integration, etc…),
> but I would disagree that ant is a blocker for different jars… we could
> switch off ant for Makefile, or bash, and we would still be able to produce
> new jars… It’s more of… how many people actually feel comfortable enough to
> alter our build system to make such a change?  I don’t…
>
> Our build system does not impact our ability to offer migration from 2.2
> to 5.0… so don’t want to keep distracting this thread…
>
> TL;DR - in progress migration off 2.2 to 5.0 is annoying as there were
> different bugs in the past we have to support again.  Out of process
> migration to me feels far more plausible, but feels annoying without
> splitting off our reader/writer… doable… just more annoying...
>
> On Dec 12, 2024, at 9:04 AM, Alex Petrov  wrote:
>
> > I have for a while advocated for a shared lib to also share between
> Harry, accord, dtests etc
>
> Big +1 for a shared lib for our concurrency and test utils. Been intending
> to start working on this for a while now, but never got to do this so far.
>
> On Thu, Dec 12, 2024, at 5:58 PM, Benedict wrote:
>
>
> Why would ant get in the way? We already build multiple jars, and accord
> will be a submodule. We have far more organisational issues to overcome
> than ant.
>
> I have for a while advocated for a shared lib to also share between Harry,
> accord, dtests etc
>
> I am however not 100% sure about splitting read/write path, at least not
> as first posited. The idea of maintaining it as an API for dropping in
> different jars is a whole other world of potential pain I don’t want to
> countenance. Supporting eg bulk readers or writers or other integrations
> seems pretty feasible though.
>
>
> On 12 Dec 2024, at 16:53, Paulo Motta  wrote:
>
> 
> >  I think that will not happen until we are out of Ant as doing this
> multi jar / subproject mumbo jumbo is not very appealing to ... anybody?
>
> This is a contentious/controversial topic, but the more I work with gradle
> the more I lean towards ant's simplicity. That said, I'd support moving
> away if it becomes a technical blocker to break up cassandra-all - and if
> this happens I would vote for maven as a replacement. :-D
>
> On Thu, Dec 12, 2024 at 11:42 AM Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
>
> These are all good ideas but in practical terms I think that will not
> happen until we are out of Ant as doing this multi jar / subproject mumbo
> jumbo is not very appealing to ... anybody?
>
> 
> From: Paulo Motta 
> Sent: Thursday, December 12, 2024 17:35
> To: dev@cassandra.apache.org
> Subject: Re: Supporting 2.2 -> 5.0 upgrades
>
>
> >  +1 on moving the read/write logic into its own jar.
>
> +1, not only read-write logic but anything used by both the server and
> subprojects (ie. cassandra-sidecar), for example JMX Mbeans and other
> interfaces.
>
> I think one way to do that would be to split cassandra-all into
> cassandra-server and cassandra-common (anything used by both subprojects
> and server), but not sure if this would be feasible or what it would take.
>
> If there's loose agreement this would be a feasible path I'd be happy to
> create a JIRA to investigate what this would take.
>
> On Thu, Dec 12, 2024 at 11:26 AM Doug Rohrer <droh...@apple.com> wrote:
> +1 on moving the read/write logic into its own jar.
>
> Doug
>
> > On Dec 11, 2024, at 7:21 PM, David Capwell <dcapw...@apple.com> wrote:
> >
> > From a disk format point of view the only thing I remember was the disk
> type bug with UDTs.  Bringing that logic back was hard as the type system
> (in 5.0) tries to avoid allowing construction of invalid states, and we
> would need to weaken that in order to enable the migration. Assuming the
> user migrated from 3.x to 4.x then the sstable metadata should have been
> rewritten to fix this bug.
> >
> > One thought (though I know it's a ton of effort)..

Re: Supporting 2.2 -> 5.0 upgrades

2024-12-12 Thread Mick Semb Wever
On Fri, 13 Dec 2024 at 06:06, Jeremiah Jordan 
wrote:

>> TL;DR - in progress migration off 2.2 to 5.0 is annoying as there were
>> different bugs in the past we have to support again.  Out of process
>> migration to me feels far more plausible, but feels annoying without
>> splitting off our reader/writer… doable… just more annoying…
>>
>>
> This is your main blocker for the 5.0/trunk code converting 2.2 sstables
> correctly.  You would have to bring back a bunch of code paths special
> casing those versions and dealing with bugs in them.  If you are actually
> interested in figuring out an offline 2.2 to 5.0 path I would recommend you
> do it in two steps.  Offline 2.2 to 4.1? or whatever the latest thing is
> that can sstableupgrade a 2.2 sstable, and then offline the result of that
> with 5.0.  But at that point maybe you just do it online in those two steps.
>


Stepping through safe upgrade edges is currently a requirement, and I
suspect will often have value.

Something like `nodetool sstableupgrade --step 3.0,4.1,5.0` would be useful.

While it would take longer to do offline upgrades, it allows us to maintain
simpler code, and draw the lineations when we must.
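
To make the "stepping through safe upgrade edges" idea concrete, here is a
minimal sketch of how a tool could work out the step list for a `--step`-style
option. Everything in it is an assumption for illustration: the `SAFE_EDGES`
table simply mirrors the 3.0,4.1,5.0 example above and is not a statement of
which upgrade paths Cassandra actually supports, and `plan_upgrade` is a
hypothetical helper, not an existing API.

```python
from collections import deque

# Hypothetical table of safe upgrade edges: each version maps to the versions
# whose tooling is assumed able to rewrite its sstables directly. The entries
# only mirror the example step list in this thread; they are illustrative.
SAFE_EDGES = {
    "2.2": ["3.0"],
    "3.0": ["4.1"],
    "4.1": ["5.0"],
}


def plan_upgrade(source, target, edges=SAFE_EDGES):
    """Return the shortest list of versions to step through (source excluded).

    Uses a breadth-first search over the edge table, so the first path that
    reaches the target has the fewest intermediate rewrites.
    """
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path[1:]
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    raise ValueError(f"no safe upgrade path from {source} to {target}")


if __name__ == "__main__":
    # Prints the comma-separated step list for a 2.2 -> 5.0 offline upgrade.
    print(",".join(plan_upgrade("2.2", "5.0")))
```

Keeping the edges in a plain table means each release only has to read the
formats of its direct predecessors, and an unsupported jump fails loudly
instead of being attempted.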