This is the special "We need to talk" edition. :)

Something interesting changed in the past two weeks - we had our first couple of rotations of a Build Lead (https://cwiki.apache.org/confluence/display/CASSANDRA/Build+Lead).
And why do we need to talk? Well, Brandon and I created a lot of test failure tickets. And by "a lot", I mean 42: https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20reporter%20IN%20(jmckenzie%2C%20brandon.williams)%20AND%20created%20%3E%20-14d

If you take a look at what's going on in Butler, you'll see that for 3.0, 3.11, and 4.0 our test failure rates are either increasing or holding steady, with a current total of 82 test failures between these three versions. If we assume that every one of those failures is duplicated across all three branches (generous of us), that still leaves us with a consistent 27 or so test failures on each branch. This number of test failures effectively leaves us holding our noses and merging changes that don't fix tests on top of an already failing board, slowly worsening an already messy situation.

For what it's worth, if we include trunk, things get murky in a different way. On the plus side we only have 16 failures there today; on the downside that's "for today", and test runs on trunk can't seem to make up their mind, ranging from a low of 10 failures to highs of 49 and 67.

So what can we do about this? Well, if we had only 15 active contributors (undershooting to illustrate the point) and each of them took 2 test failures each week for the next 2 weeks, that would be 60 fixes, enough to drive down most if not all of the failures across 3.0, 3.11, and 4.0.

It's important that we keep a clean test board because when things like security CVEs or data loss defects come along, we need to be able to cut a quick hotfix release without worrying about whether we're introducing new regressions into critical production systems running GA release lines. It's hard to overstate how critical this is to us as a project.

So in short, the outstanding question we as a project haven't tackled yet is: now that we have these failing tests identified and wired up in Butler and JIRA, how are we going to resource fixing them?
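As a rough sanity check on the arithmetic above, here is a minimal sketch. It assumes each fix closes exactly one failure and that the counts quoted in this email hold steady; the variable and function names are purely illustrative.

```python
# Back-of-the-envelope check of the numbers quoted in this email.
# Assumption: one fix closes exactly one test failure.

total_failures = 82   # failures across 3.0, 3.11, and 4.0 (from Butler, as quoted)
branches = 3

# If every failure were duplicated on all three branches,
# each branch would carry roughly this many distinct failures.
per_branch = total_failures // branches

contributors = 15     # deliberately undershooting
fixes_per_week = 2    # per contributor
weeks = 2
capacity = contributors * fixes_per_week * weeks


def remaining(failures: int, fixes: int) -> int:
    """Failures left after applying a number of fixes, never below zero."""
    return max(failures - fixes, 0)


# Worst case: all 82 failures are distinct.
worst_case_left = remaining(total_failures, capacity)
# Best case: failures are fully duplicated, so 27 distinct ones cover all branches.
best_case_left = remaining(per_branch, capacity)

print(f"per-branch failures: {per_branch}")
print(f"fix capacity over {weeks} weeks: {capacity}")
print(f"left over (worst case): {worst_case_left}")
print(f"left over (best case): {best_case_left}")
```

Under these assumptions the real outcome lands somewhere between the two cases, which is where "most if not all" comes from.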

[New contributors]
Did you know fixing failing tests is a great way to get to know the Cassandra codebase? :) I mean this in all seriousness, not as a joke prompted by the above. Tests can be tricky, interesting, and quite educational if you're opening up an area of the codebase you haven't worked on before, and you can always hit up @cassandra_mentors in the #cassandra-dev channel on the ASF slack server here: https://the-asf.slack.com

For convenience, here's a link to a kanban board of the currently identified failing test JIRAs that haven't been assigned yet: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=496&quickFilter=2252
64 of them! A veritable cornucopia of interesting work.

If test failures aren't your bag, we have 14 unassigned tickets that are solid starter tickets on the 4.0.x line, and 14 on the 4.x line you could tackle: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2162&quickFilter=2160

[Dev mailing list conversations]
https://lists.apache.org/list?dev@cassandra.apache.org:lte=2w:
The last two weeks have seen momentum pick up a bit on the dev list. Some highlights to get engaged with:

The trie-memtable thread got some very interesting updates today; Branimir attached some performance numbers from the implementation compared to our current skip list: https://lists.apache.org/thread/fdvf1wmxwnv5jod59jznbnql23nqosty

Ekaterina landed CASSANDRA-15234 to standardise our config and JVM parameters. This is a significant achievement and a ton of work to get across the line - congrats Ekaterina! Email thread here: https://lists.apache.org/thread/qf4ctv1067hz5j0pm6wc75rr44kospk4
One thing to be aware of is the follow-up Ekaterina sent to the list about our test suite misbehaving: https://issues.apache.org/jira/browse/CASSANDRA-17351.

The ever-lurking zombie conversation about ant vs. maven vs. gradle has risen from the dead again: https://lists.apache.org/thread/jksl415lvfmrnh7z7xvy41v3d25twc5w.
We've never really put a bow on this in the past, and traditionally the conversation fizzles out; the outstanding request from several of us today is a clear enumeration of pros vs. cons and value vs. cost for each of the different build systems, so we can either make a decision as a project or agree to put it to rest and not revisit it for a set amount of time. To whoever decides to take up that torch, know that there are hordes of people ready to share their opinions about their favored build system with you (not sure if this is encouraging or not =/).

SharanF opened up an interesting and much-needed (editorializing alert) thread about committers who contribute to the project in ways other than Java code: https://lists.apache.org/thread/mlqqxcmyz60fd8mzn66nslp5nxlnryld. The overloading of the term "committer" to mean both "someone with a commit bit who commits code" and "someone who is committed to the project" is something we've stumbled upon in the past; would love to hear what people think on the topic. (Reference community.apache.org article on committers here: https://community.apache.org/contributors/)

And last but not least, I'd like to call attention to the interesting discussions going on around Storage Attached Indexes (SAI) and whether or not to include OR support in the initial CEP: https://lists.apache.org/thread/50t6p19s4c05wo1s5j510l195t5n6s10

[Development velocity]
We've closed out 6 issues on the 4.0.x line: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2175
Some highlights include some broad fixes to intermittent in-jvm dtest failures, cleaning up a bit in the PasswordObfuscator, and some packaging and documentation. All just the kind of minor polishing changes we love to see on a low ordinal GA release.
We had 14 issues closed out on the 4.x line, with some highlights being significant increases in VIntCoding speed (C-15215), standardizing config and JVM parameters as mentioned above (C-15234), the removal of Windows-specific classes to clean up some vestigial bits in the codebase (bittersweet for me, that one: C-16956), and some general tidying around old python versions and test fixes.

[CI status]
See above. It's rough, but it's nothing we can't fix if we put our minds to it.

And that about covers it for today - thanks everyone for reading and for all your contributions on the project!

~Josh