First off - Congrats again to Sumanth Pasupuleti on becoming a committer on the project! Well deserved; looking forward to working with you further.
It looks like ponymail got an upgrade; I didn't even realize that was possible at this point. :) So caveat emptor: the links I put in here to individual email threads are different than in the past but appear to be working.

[New contributors getting started]
There's been some discussion about whether the #cassandra-dev channel with 600 people in it is the best place for new contributors to get involved and publicly ask beginner questions or whether we should start a new channel with a somewhat more limited scope. Please chime in on that dev mailing list thread if you have an opinion: https://lists.apache.org/thread/x8fx9b22nfll3gd40w4o971cyznckxrz

For new contributors, we recommend starting in one of two places: failing tests, or starter tickets we label "lhf" (low hanging fruit).
Query for failing tests: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=496&quickFilter=2252
Query for unassigned starter tickets: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2162&quickFilter=2160

We're up from 18 unassigned test failures to 22 in the past couple of weeks. David Capwell, Berenguer Blasi, and Ekaterina Dimitrova (and others!) have been doing some great work both surfacing failures as well as fixing things - thank you!

For unassigned lhf, we're up from 10 to 11 on 4.0.2 (our next minor release) and up from 13 to 14 on 4.1.0 (our next major release). Feel free to self-select from that list, hit up this email thread or list if you want some guidance on where to get involved, ping in the #cassandra-dev slack channel on the-asf.slack.com server, or email or message me directly if you want any help.

[Dev list discussions in the past 14 days]
https://lists.apache.org/list?dev@cassandra.apache.org:lte=2w:

We have an ongoing discussion about what it means to have a releasable trunk and what steps, if any, it'd take to get there.
Given the scale and complexity of this project and its testing infrastructure, I'm curious to hear what other experiences people have had with applying select CI and CD principles to an ecosystem like this: https://lists.apache.org/thread/kyyo5k3my2nx160mfgy0xkwo8xjh2qpv

As mentioned above, there's an ongoing discussion about how to make the cassandra dev community more welcoming for newcomers: https://lists.apache.org/thread/x8fx9b22nfll3gd40w4o971cyznckxrz

Andres surfaced CEP-3 for guardrails, in which we all professed our continued love for JMX (especially you Patrick). It'd be great to see more operators chime in with their experience running clusters at scale and the types of usage anti-patterns that destabilize clusters, since guardrails would be a great way to expose protection against frequently occurring patterns that scale poorly, among other things (tombstone heavy workloads and thousands of tables anyone?)

CEP-18: Improving Modularity is going to be deprecated in favor of module-specific refactors and optional implementations.

CEP-17: SSTable format API is evolving nicely: https://lists.apache.org/thread/boqb5trkq1q38rmb50p4lsw95hyv053m

And these are just the highlights!

[Tickets in the past 14 days]
On the 4.0.2 front we've closed out 5 tickets compared to 9 in the prior 2 weeks. Looks like permissions, some timeouts during replica failure, website updates, etc.

For 4.1.0 we've closed out 8 issues, down from 14. Some stability in schema pulls, commit log stability during testing, a slew of test fixes, and a new feature to allow denying access to configured partition keys for reads, writes, or range reads based on config (CQL or JMX).

[Tickets that need attention]
Needs Reviewer: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&selectedIssue=CASSANDRA-16547&quickFilter=2259

I've tidied up / created a new quick filter for tickets that are in progress, blocked, or patch available but lacking a reviewer.
This is slightly opinionated of me in that it implies we should have reviewers for things as we work on them rather than once they're further along being written; I have a bias towards early inclusion of a 2nd pair of eyes and a sounding board. If you see anything on this list that you're qualified to review, or you know that area of the code-base and have a few cycles, please take a look and help out.

Workload wise, 14 tickets on 4.0.2 need reviewers and 34 on 4.1.0 by this definition.

I'm going to refrain from linking to stalled tickets (30d inactive) for now; the load of that is high (80 on 4.0.2, 422 on 4.1.0) so we probably should approach this a little differently if we want to tidy up or prune that backlog. It's as simple as a fixversion flag so doesn't really indicate _too_ much to worry about.

[Test Failure Trendlines]
So first off, we have a good number of tests in this project. 43,000 or so now. It's helpful to keep that in mind when we talk about having 5, 10, or even 50 test failures relative to the total corpus. Unfortunately, databases are like compilers in that they're rather unforgiving of even a 0.125% failure rate.

So what's our test failure trend? We have 2 trendlines of interest:

1) The test failures documented as JIRA tickets on the project: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=496&view=reporting&chart=cumulativeFlowDiagram&swimlane=1233&swimlane=1234&column=2195&column=2196&column=2197&days=90

We can see where I got feisty creating test failure tickets when trying to merge the Denylist patch a week ago.
In general, the volume of "open tickets for known test failures" has been growing: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=496&view=reporting&chart=cumulativeFlowDiagram&swimlane=1233&swimlane=1234&column=2195&column=2196&column=2197&days=90

That said, this could be due to a variety of factors: more failures, increased discipline around tracking, or even poor hygiene closing out tickets when we fix the related tests.

2) The metric that I think is a bit cleaner and more informative is our test failure history on our jenkins build server (assuming I can ever get it to load /groan): https://ci-cassandra.apache.org/job/Cassandra-trunk/lastCompletedBuild/testReport/history/

In general we've been pretty clean (meaning single digit failures) since the 4.0 release; as discussed in another thread, the recent spate of failures caused by dtest-api dependency changes is being addressed in CASSANDRA-17050.

Silver lining: that situation has surfaced 1) a need for a discussion and improvement around how we work with dependent projects and release dependencies in Cassandra (all in one IDE as subprojects vs. separate projects, release dependencies, etc) and we can expect to see a DISCUSS thread about that soon, and 2) that there have been broader failures going on with some of the python dtests for a bit here that we need to get to the bottom of.

And that's a wrap folks. I call this one "The Calm Before the Storm" if our CEPs are any indicator. :)

As always, thanks everyone for the time, effort, and collaboration on the project.

~Josh