Re: [VOTE] Release Apache Cassandra 4.1.1

2023-03-20 Thread Tommy Stendahl via dev
+1 (nb)

-Original Message-
From: Brandon Williams 
Reply-To: dev@cassandra.apache.org
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Release Apache Cassandra 4.1.1
Date: Fri, 17 Mar 2023 13:38:15 -0500


+1


Kind Regards,

Brandon


On Thu, Mar 16, 2023 at 3:11 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:


Proposing the test build of Cassandra 4.1.1 for release.


sha1: 8d91b469afd3fcafef7ef85c10c8acc11703ba2d

Git:



https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.1.1-tentative


Maven Artifacts:



https://repository.apache.org/content/repositories/orgapachecassandra-1284/org/apache/cassandra/cassandra-all/4.1.1/



The Source and Build Artifacts, and the Debian and RPM packages and 
repositories, are available here:



https://dist.apache.org/repos/dist/dev/cassandra/4.1.1/



The vote will be open for 72 hours (longer if needed). Everyone who has tested 
the build is invited to vote. Votes by PMC members are considered binding. A 
vote passes if there are at least three binding +1s and no -1's.
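
As a rough illustration of the passing rule above, the tally could be sketched as follows. This is only a sketch: the `Vote` record, the `vote_passes` helper and the voter names are hypothetical, and it assumes that any -1 (binding or not) blocks the vote, which the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical identifier, not a real voter
    value: int     # +1, 0 or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes: Iterable[Vote], min_binding_plus_ones: int = 3) -> bool:
    """Return True if there are enough binding +1s and no -1 at all."""
    votes = list(votes)
    binding_plus_ones = sum(1 for v in votes if v.binding and v.value == 1)
    # Assumption: a -1 from anyone blocks the vote.
    any_minus_one = any(v.value == -1 for v in votes)
    return binding_plus_ones >= min_binding_plus_ones and not any_minus_one
```

With three binding +1s and one non-binding +1 the function returns True; removing one binding +1 or adding a single -1 makes it return False.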


[1]: CHANGES.txt:



https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.1.1-tentative


[2]: NEWS.txt:



https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.1.1-tentative


Re: [DISCUSS] CEP-26: Unified Compaction Strategy

2023-03-20 Thread Branimir Lambov
It seems I have created some confusion.

This version of UCS (let's call it V2) is ahead of the one in DSE (V1),
with the main difference that it no longer uses a fixed number of shards.
Because of this, V2 behaves similarly to LCS in terms of required extra space,
because the sstables it constructs aim to be close to a target size. V1 UCS
had some special features to deal with the large sstables it created in the
top levels of each shard, which are not as important for V2: when the
target size is small enough, there should be no need for limiting
compactions to the available free space, or for making sure that large
top-level compactions cannot cause sstables to accumulate on L0.

Because of this, such features of the V1 UCS have been omitted in order to
keep the initial commit small enough to fit the C* 5 timeline (they rely on
some sizable refactorings of the compaction interfaces which should come at
a later date).

Regards,
Branimir

On Sat, Mar 18, 2023 at 1:05 AM Jeff Jirsa  wrote:

> I’m without laptop this week but looks like
> CompactionTask#reduceScopeForLimitedSpace
>
> So maybe it just comes for free with UCS
>
>
> On Mar 17, 2023, at 6:21 PM, Jeremy Hanna 
> wrote:
>
> You're right that it doesn't handle it in the sense that it doesn't
> resolve the problem, but it also doesn't do what STCS does.  From what
> I've seen, STCS blindly tries to compact and then the disk will fill up
> triggering the disk failure policy.  With UCS it's much less likely and if
> it does happen, my understanding is that it will skip the compaction.  I
> didn't realize that LCS would try to reduce the scope of the compaction.  I
> can't find in the branch where it handles that.
>
> Branimir, can you point to where it handles the scenario?
>
> Thanks,
>
> Jeremy
>
> On Mar 17, 2023, at 4:52 PM, Jeff Jirsa  wrote:
>
>
>
>
> On Mar 17, 2023, at 1:46 PM, Jeremy Hanna 
> wrote:
>
>
>
>
> One much more graceful element of UCS concerns running out of space.
> Previous compaction strategies would just let the server shut down when it
> ran out of space, forcing system administrators to be paranoid about
> headroom.  UCS instead has a target overhead (default 20%).  First, since
> the ranges are sharded, it is less likely that there will be large sstables
> whose compaction requires as much headroom.  Second, if it detects that a
> compaction would violate the target overhead, it will log that and skip the
> compaction - a much more graceful way of handling it.
>
>
> Skipping doesn’t really handle it though?
>
>
> If you have a newly flushed sstable full of tombstones and it naturally
> somehow triggers you to exceed that target overhead you never free that
> space? Usually LCS would try to reduce the scope of the compaction, and I
> assume UCS will too?
>
>
>
>
>
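
The two behaviours discussed in this thread, skipping a compaction that does not fit in the available space versus reducing its scope, can be illustrated with a toy model. This is not Cassandra's code: `plan_compaction`, its parameters, and the worst-case assumption that the output is as large as the sum of the inputs are all made up for illustration.

```python
from typing import List


def plan_compaction(sstable_sizes: List[int], free_space: int,
                    reduce_scope: bool = True) -> List[int]:
    """Pick which sstables (given by size) to compact; [] means skip.

    Assumes the worst case where the compaction output needs as much
    space as the sum of its inputs.
    """
    candidates = sorted(sstable_sizes)
    while candidates and sum(candidates) > free_space:
        if not reduce_scope:
            # "Skip" behaviour: give up on the whole compaction.
            return []
        # "Reduce scope" behaviour: drop the largest input and retry.
        candidates.pop()
    return candidates
```

For inputs of sizes 10, 20 and 50 with 40 units of free space, the reduce-scope variant still compacts the 10 and 20 sized sstables, while the skip variant does nothing. A single input larger than the free space is skipped in both variants, which is the case raised above for a newly flushed sstable full of tombstones.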


CASSANDRA-18247 committed

2023-03-20 Thread Ekaterina Dimitrova
Hi everyone,

Happy Monday!

CASSANDRA-18247 was just committed. It adds a CircleCI test configuration
for JDK11+JDK17. A full description and usage instructions can be found in
.circleci/readme.md.

Please let me know if you have any questions.

Best regards,
Ekaterina


Cassandra project status, 2023-03-20

2023-03-20 Thread Josh McKenzie
I did say monthly-ish. That goes both earlier and later.

We've had a lot of interesting topics come up on the dev list in the past few 
weeks as well as movement on Accord, Transactional Metadata, and SAI, so let's 
get to it.

The Cassandra Forward event took place on March 14th with a lot of interesting 
talks and attendees (link: https://www.cassandrasummit.org/cassandra-forward). 
You can watch recordings of the different talks on the site as well as sign up 
for the Cassandra Summit that's been rescheduled to December 12-13th. Hope to 
see you there!


[New Contributors Getting Started]
We have a lot of great starter tickets if you're interested in diving in on
the project - you can see the list on the kanban board w/quick filters here:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2454&quickFilter=2652&quickFilter=2162&quickFilter=2160
Anything on this list should be assignee-free and up for grabs so feel free to
take a crack at them.

For assistance getting set up or oriented in the code, join the
#cassandra-dev channel on https://the-asf.slack.com (reply to me on this email
if you need an invite for your account), and feel free to tag the 
@cassandra_mentors alias with questions.


[Dev mailing list]
https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-3-3|dto=2023-3-20:

We have 26 threads this time - a bit more manageable but surprisingly busy for 18
days.

We've had a lot of discussion about downgradability; Branimir originally
brought the topic back up here:
https://lists.apache.org/thread/tcp339k5ph8ql35wxr085to4qgp8tpg7, and that
thread has kept going since the last status update. Attempting a bit of
editorializing, here are the points that were brought up and not contested:
- Try not to break sstable format compatibility with a change if it's
reasonable not to
- Users should be able to opt-in to major format upgrades and not have access 
to new features until such time as they've opted in
- We should have an offline sstabledowngrade tool
- Nodes should be able to write older version sstables if configured to do so 
(how many versions and where that code lives is somewhat unclear still)
- We need simple tests (upgrade tests backwards) to see what works and doesn't 
work to know the scope of the problem
- Jacek created the epic https://issues.apache.org/jira/browse/CASSANDRA-18300 
to track work on downgradability

There's a good bit we discussed on the thread not yet captured in JIRA; 
assuming nobody has significant disagreement with the list above I may create 
tickets for the things we haven't yet captured so we don't lose that context. 
Also - if I missed something you brought up on that thread that you want to see
captured as well, let me know and I'll take care of that.

Another thread that's seen a lot of traffic without yet concluding: "[DISCUSS] 
Next release date" 
(https://lists.apache.org/thread/fncbr50xg1otw8xtpyn0b3ys02bfnwv1). It seems 
like we were headed towards a "set a target release date, back up N weeks based 
on how long we think it will take to validate that, and set that as our branch 
/ freeze date" conclusion. Jeremiah offered October with a potential September 
freeze if we believe ourselves capable of a 4 week validation, and David asked 
some pointed questions about why 4.1 took so long to release and whether we 
have enough testing to trust trunk today. If you have some thoughts on the 
topic, please don't let the thread lie dormant; it's important we come to a 
consensus on this and agree on a target to push for.
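
The "back up N weeks from the target release date" approach is simple date arithmetic. A minimal sketch, where `freeze_date` is a hypothetical helper and the date used in the note below is a placeholder rather than anything proposed on the thread:

```python
from datetime import date, timedelta


def freeze_date(target_release: date, validation_weeks: int) -> date:
    """Branch / freeze date: the target release date minus the validation window."""
    return target_release - timedelta(weeks=validation_weeks)
```

For example, a placeholder target of 16 October 2023 with a 4 week validation window gives a freeze date of 18 September 2023; a longer validation estimate moves the freeze correspondingly earlier.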

Stefan created and reminded us of CASSANDRA-18043, "remove deprecated 
DateTieredCompactionStrategy". It's been deprecated for years now so it's
probably time to go.

Speaking of deprecation, we've been discussing the role of the Hadoop
integration code in the codebase (link: 
https://lists.apache.org/thread/q34zsscctgn6kpwkflx03859y7nv3y5z). The general 
consensus appears to be for deprecation in 4.x and removal in 5.0 given the 
code is unmaintained and very, very old.

Stefan brought up the somewhat problematic case with NetworkTopologyStrategy 
where RF > number of racks, since the strategy can place things in a way where 
you lose QUORUM if you lose a rack (link: 
https://lists.apache.org/thread/dntymkm1b9xjs1bognf3w1lpf1mdrzos). The 
consensus on that thread was that we should make NTS do the right thing going 
forward but also preserve the ability to do things "the old way". See this JIRA 
for more details: https://issues.apache.org/jira/browse/CASSANDRA-16203
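
To make the rack-loss concern concrete, here is a small sketch of the arithmetic. It does not model how NetworkTopologyStrategy actually chooses replicas; `quorum` and `survives_single_rack_loss` are hypothetical helpers that only check a given replicas-per-rack placement within one datacenter.

```python
from typing import Dict


def quorum(rf: int) -> int:
    """Number of replicas needed for QUORUM at replication factor rf."""
    return rf // 2 + 1


def survives_single_rack_loss(replicas_per_rack: Dict[str, int]) -> bool:
    """True if QUORUM is still reachable after losing any one rack."""
    rf = sum(replicas_per_rack.values())
    needed = quorum(rf)
    # Losing a rack removes all replicas placed in it.
    return all(rf - lost >= needed for lost in replicas_per_rack.values())
```

With RF=3 over two racks, one rack necessarily holds two replicas, so losing it leaves one replica against a quorum of two. With RF=5 over three racks the placement matters: a 2/2/1 split keeps quorum after any single rack loss, while a 3/1/1 split does not.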

Bowen Song raised the topic of potentially enhancing how we handle disk errors: 
https://lists.apache.org/thread/gwyz9otgokqvmdrq85nw3ds5nyrhz8t3. Some 
interesting ideas came up on the thread as well as questions about what we 
could potentially do with the current state of the art vs. a future with 
transactional metadata. No conclusions quite yet but the notion of having 
replicas selectively reject t

Re: Cassandra project status, 2023-03-20

2023-03-20 Thread Miklosovic, Stefan
Thank you, Josh, for continuing to write these summaries.


From: Josh McKenzie 
Sent: Monday, March 20, 2023 20:34
To: dev
Subject: Cassandra project status, 2023-03-20


Re: Cassandra project status, 2023-03-20

2023-03-20 Thread Berenguer Blasi

Hi,

I would add that CEP-20 DDM C17940 has made huge progress and is nearing
completion, all praise and glory to Andres. TTL C14227 has also made big
progress: the first round of review is complete and only the sstable
format/feature flag switch is pending so far.


Regards

On 20/3/23 21:30, Miklosovic, Stefan wrote:

Thank you, Josh, for continuing to write these summaries.


From: Josh McKenzie 
Sent: Monday, March 20, 2023 20:34
To: dev
Subject: Cassandra project status, 2023-03-20
