We have a town hall coming up! The URL for the meetup can be found here: https://www.meetup.com/cassandra-global/events/292858262/. This will be held tomorrow at 12pm EST.
Jon Haddad (https://www.linkedin.com/in/rustyrazorblade/) will be discussing performance tuning on Apache Cassandra, I'll be chatting about what's going on with the project leading up to 5.0, and Lorina will be covering how to get involved contributing to docs on the project. The full agenda can be found here: https://docs.google.com/document/d/14U4IGnKn8r7PPxF8Lc_leTcVfD8oW_p9oHW-BRbL3yY/edit#. Looking forward to seeing you there!

Apache Cassandra 4.0.9 was released back on April 15th - see the release thread here: https://lists.apache.org/thread/ymr90v3l6fokwr885l1fsmfzr04tgpmn

[New Contributors Getting Started]
We've hand curated tickets we consider good to get started with on the project - check the list out here: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2454&quickFilter=2652&quickFilter=2162&quickFilter=2160. We have 30 tickets for the next upcoming major to pick from there, so don't wait; shop now! Come hang out with us in the #cassandra-dev channel on https://the-asf.slack.com (reply to me on this email if you need an invite for your account), and reach out to the @cassandra_mentors alias with questions.

[Dev mailing list]
https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-3-20|dto=2023-4-21: Up against 40 threads in the past month. Let's do this.

We've had a long and glorious discussion about our next release date, when we're going to branch, when we're going to freeze, and what code goes where. https://lists.apache.org/thread/9c5cnn57c7oqw8wzo3zs0dkrm4f17lm3. There's no real resolution as yet on the thread but I think the tail end of it has a fairly clear summation of where we got (thanks to yours truly). With the caveat that some folks are very wary of a date-driven freeze date as our "drop dead date" for cutting the 5.0 branch.

Jonathan Ellis came forward with a very promising prototype to add ANN vector-based search (as well as support for the data type) in Cassandra natively.
There's a lot of interesting stuff going on in the ML space right now, and having the linear scale of Cassandra for a vectorized dataset could provide some really impressive scaling and augmentation characteristics for LLMs post-training and model building. Check out the thread here: https://lists.apache.org/thread/xl8shmknrrxp3w06s7byjlytn260781g

Amit Parwal opened up a thread about using Direct I/O for the CommitLog with some very promising benchmarks. We opened up the following JIRA to track future collaboration on the topic (https://issues.apache.org/jira/browse/CASSANDRA-18464, email thread here: https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg)

German Eichberger had some questions around how we as a project community intend to support branches that fall out of formal bugfix support in the event of CVEs. That can be found here: https://lists.apache.org/thread/owqlclzbq333dz68ryqw8z1md7s3fcmx. The loose consensus there seems to be that it's a very infrequent occurrence, we can cross that bridge when we come to it, and it'd be a valuable thing for a vendor to be able to step in and offer.

The vote for CEP-28, Unified Compaction Strategy passed easily: https://lists.apache.org/thread/k5fg3mn43j701pdskpc1j8r1h9c20qk1. Excited to see this land in the project, Branimir.

Mike Adamson apparently walked directly into a third rail when it comes to how we name things, with the seemingly innocuous question as to whether we'd considered adding an alternative to "keyspace" in the form of "database". https://lists.apache.org/thread/9hf6x577ggf4r4lwss5jx22p8zy210b5. The TL;DR: we probably shouldn't, but we already support the usage of "schema" in its place. TIL (back when the thread hit...)

Stefan Miklosovic reached out looking for more feedback on unifying the system properties and environment variables in the CassandraRelevantProperties and CassandraRelevantEnv classes.
This is definitely one that's going to impact all of us, so if you have either experience in the space and/or strong opinions on things you want heard before the trigger is pulled, now's your chance: https://issues.apache.org/jira/browse/CASSANDRA-17797

Maxim Muzafarov reached out around building a consensus regarding whether settings in vtables should be updatable. https://lists.apache.org/thread/z169kk31lzmvyor7pkwn2h17nor59bfq. This was on the tail end of a long thread on a related topic (https://lists.apache.org/list?dev@cassandra.apache.org:gte=1d:Allow%20UPDATE%20on%20settings), but I don't see that anyone engaged.

Aleksey keeps up with the Sisyphean task of pointing out that most, if not all, of our config fields shouldn't be volatile and we've been carrying along that pattern for reasons largely unknown. I commend you, sir.

And last and arguably least, two emails on the same topic: you have a new PMC chair: me! And you also shouldn't really get too worked up about it, since PMC chairs are liaisons between the PMC and the ASF board and carry the same authority and responsibility for the health and well-being of the project as the rest of the PMC (see Rich Bowen's email here: https://lists.apache.org/list?dev@cassandra.apache.org:gte=1d:Allow%20UPDATE%20on%20settings). My .02: our community is in a healthy place (users, committers, and PMC members) and we all have a pretty clear understanding of what our various roles, responsibilities, and areas of focus and skill are. I appreciate all the well-wishes from everyone when Mick passed the baton, and I take the responsibility of representing our PMC and our community to the broader Apache ecosystem seriously.

[Checking in on CI]
https://butler.cassandra.apache.org/#/
Ok April. What say you?
3.0: 11 -> 13 (very stable trendline)
3.11: 22 -> 9 (also very stable trendlines)
4.0: 8 -> 6 (bouncing all over the place)
4.1: 3 -> 4 (less stable than 4.0; makes sense as it's seeing action)
trunk: 8 -> 10 (comparable to 4.1)

On the whole, if I combine the above w/my 2 weeks as build lead, we're in a pretty good spot. Caught several actual new failures as build lead but the lion's share of failures remain due to the inconsistencies in execution environment.

As a reminder, our CI blockers for a release are:
1) Green CI on circleci or ASF infra
2) No regressions on ASF infra compared to previous branches

So I think we're in a healthy place there, though the plan of record remains to move towards stabilizing ASF infra over the summer.

[What's been closed out]
Here's a custom quick filter to give us an overview in the last 30 days: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2278

There are a lot of tickets in a raw JIRA query so I'm going to try out a new thing here and CSV out of JIRA w/a little massaging:

[Missing Fix Version (naughty naughty)]
NA : ( CASSANDRA-18452 ) BLOG - Announcing Monthly Apache Cassandra Town Halls
NA : ( CASSANDRA-18062 ) On-disk string index with index building and on-disk query path
NA : ( CASSANDRA-18378 ) CEP-15 (Accord) accord.messages.Defer rejects Recurrent retry of Commit
NA : ( CASSANDRA-18418 ) CEP-21 Dereference TableMetadata in simple partition builder
NA : ( CASSANDRA-18411 ) CEP-21 Improve support for start/end tokens in nodetool rebuild
NA : ( CASSANDRA-18416 ) CEP-21 Ensure that global log replication factor is maintained after decommission
NA : ( CASSANDRA-18417 ) CEP-21 Implement multi-dc placement simulator for NTS
NA : ( CASSANDRA-18407 ) CEP-21 Remove paranoid check which fails due to pre-existing mismatch on replica vs query range
NA : ( CASSANDRA-18414 ) CEP-21 Re-enable stdout/sterr redirection at startup
NA : ( CASSANDRA-18419 ) CEP-21 During multi step operations, defer token map update until completion of final step
NA : ( CASSANDRA-18415 ) CEP-21 Fix (re)building MVs
NA : ( CASSANDRA-18404 ) CEP-21 Improve seedlist inspection at startup
NA : ( CASSANDRA-18413 ) CEP-21 Add request failure reason to indicate invalid routing of a read or write
NA : ( CASSANDRA-18405 ) CEP-21 Various fixes to schema related in-jvm dtests
NA : ( CASSANDRA-18409 ) CEP-21 Make StorageService treatment of bootstrapping nodes consistent with previous implementation
NA : ( CASSANDRA-18410 ) CEP-21 Fix nodetool ring and effective ownership
NA : ( CASSANDRA-18406 ) CEP-21 Always use Paxos.v2 for global log reads/writes regardless of cluster configuration
NA : ( CASSANDRA-18412 ) CEP-21 Secondary indexes should not be rebuilt on restart
NA : ( CASSANDRA-18402 ) CEP-21 Add debounce to log replay
NA : ( CASSANDRA-18408 ) CEP-21 Implement retries for log replay on CMS members
NA : ( CASSANDRA-18403 ) CEP-21 Always populate local gossip state at startup
NA : ( CASSANDRA-18203 ) CEP-15: (C*) Improve Burn Tests to include thread scheduling

[5.x + 5.0: (change to 5.0?)]
5.x : ( CASSANDRA-18477 ) Do not require allow filtering when all primary keys are specified in SELECT
5.x : ( CASSANDRA-18377 ) CEP-15 (Accord) AsyncOperation can not fail as it has already reached FINISHED
5.x : ( CASSANDRA-18363 ) Test failure: cqlsh_tests.test_cqlsh.TestCqlsh.test_list_queries
5.x : ( CASSANDRA-17341 ) Merge guardrails and track warnings configurations
5.0 : ( CASSANDRA-18471 ) CEP-15 Accord: NotWitnessed commands can receive an invalidate promise but would return Zero instead
5.0 : ( CASSANDRA-18430 ) When decommissioning should set Severity to limit traffic
5.0 : ( CASSANDRA-17869 ) Add JDK17 option to cassandra-builds (build-scripts and jenkins dsl) and on jenkins agents
5.0 : ( CASSANDRA-17364 ) dependency on commons-io is to 2.6 which has a CVE
5.0 : ( CASSANDRA-18395 ) Rename internal state() method in AbstractFuture to not conflict with Java 19 changes
5.0 : ( CASSANDRA-18437 ) Fix org.apache.cassandra.transport.MessagePayloadTest-.jdk17
5.0 : ( CASSANDRA-18422 ) CEP-15 (Accord) Original and recover coordinators may hit a race condition with PreApply where reads and writes are interleaved, causing one of the coordinators to see the writes from the other
5.0 : ( CASSANDRA-18373 ) Node Draining Should Abort All Current SSTables Imports
5.0 : ( CASSANDRA-18431 ) Cassandra doesn't start on JDK17
5.0 : ( CASSANDRA-18037 ) Use snake case for the names of CQL native functions
5.0 : ( CASSANDRA-18375 ) CEP-15 (Accord) Expected reply message with verb ACCORD_INFORM_OF_TXNID_RSP but got ACCORD_SIMPLE_RSP
5.0 : ( CASSANDRA-18262 ) Switch checkstyle running only with JDK8 to be run with JDK11
5.0 : ( CASSANDRA-18372 ) A link to the pull request can be attached to the JIRA issue as soon as the PR is raised
5.0 : ( CASSANDRA-17199 ) Provide summary of failed SessionInfo's in StreamResultFuture
5.0 : ( CASSANDRA-17940 ) CEP-20: Dynamic Data Masking
5.0 : ( CASSANDRA-18343 ) JDK17 - fix nodetool_test.TestNodetool.test_sjk
5.0 : ( CASSANDRA-18102 ) Add a virtual table to list snapshots
5.0 : ( CASSANDRA-18049 ) Update Chronicle Queue
5.0 : ( CASSANDRA-18323 ) Remove org.apache.cassandra.hadoop code
5.0 : ( CASSANDRA-18247 ) Add CircleCI config files for J11+J17

[4.1 line and up]
4.1.x : ( CASSANDRA-17758 ) URL for KEYS in Debian installation returns a redirect
4.1.2 : ( CASSANDRA-18124 ) Config parameter keystore_password should be nullable
4.1.2 : ( CASSANDRA-18267 ) keepbrief is not called
4.1.2 : ( CASSANDRA-18371 ) Snapshots with dots in their name are not returned in listsnapshots
4.1.2 : ( CASSANDRA-18359 ) NullPointerException on SnapshotLoader.loadSnapshots
4.1.2 : ( CASSANDRA-18304 ) hinted_handoff_enabled=false is not honored
4.1.2 : ( CASSANDRA-18353 ) Cqlsh command "COPY … TO STDOUT" fails with "… object is not callable"
4.1.2 : ( CASSANDRA-18354 ) Remove obsolete 'six' package reintroduced by a merge
4.0.x : ( CASSANDRA-17844 ) DOC - Fix typo in CQL types page
4.0.9 : ( CASSANDRA-18429 ) Upgrade Zstd to 1.5.5
4.0.9 : ( CASSANDRA-18332 ) Backport CASSANDRA-17205 to 4.0 branch (strong ref leak)
4.0.9 : ( CASSANDRA-18131 ) LongBTreeTest times out after btree improvements from CASSANDRA-15510
4.0.9 : ( CASSANDRA-18370 ) BulkLoader tool initializes schema unnecessarily via streaming - 4.0
4.0.9 : ( CASSANDRA-17980 ) Fix flaky Python DTest materialized_views_test.TestMaterializedViews.test_resume_stopped_build
4.0.10 : ( CASSANDRA-18474 ) Incremental repairs fail on mixed IPv4/v6 addresses serializing SyncRequest
4.0.10 : ( CASSANDRA-18443 ) Deadlock updating sstable metadata if disk boundaries need reloading
4.0.10 : ( CASSANDRA-17913 ) Nested selection of reversed collections fails
4.0 : ( CASSANDRA-9362 ) Native protocol v5

[3.0 + 3.11 and up]
3.11.15 : ( CASSANDRA-18448 ) Missing "SSTable Count" metric when using nodetool with "--format" option
3.11.15 : ( CASSANDRA-16906 ) DOC - Formulae in Evaluating DM page does not display correctly
3.11.15 : ( CASSANDRA-18389 ) jackson-core-2.13.2.jar vulnerability: CVE-2022-45688
3.0.29 : ( CASSANDRA-18396 ) Dtests marked with @ported_to_in_jvm can be skipped since 4.1
3.0.29 : ( CASSANDRA-18249 ) Docker image for releases
3.0.29 : ( CASSANDRA-18391 ) consistent timeout: dtest-upgrade.upgrade_tests.cql_tests.cls.test_cql3_non_compound_range_tombstones on trunk
3.0.29 : ( CASSANDRA-18153 ) Memtable being flushed without hostId in version "me" and newer during CommitLogReplay
3.0.29 : ( CASSANDRA-18156 ) Test Failure: repair_tests.deprecated_repair_test.TestDeprecatedRepairNotifications.test_deprecated_repair_error_notification
3.0.29 : ( CASSANDRA-17701 ) Failing test: TestRepair.test_failure_during_validation
: ( CASSANDRA-18256 ) Backport CASSANDRA-17205 (remove strong self-ref in tidier) to all supported lines
: ( CASSANDRA-18368 ) Redhat 40x repo signature is not valid
: ( CASSANDRA-18356 ) Cassandra Debian Repository is not available and will be redirected to landing.jfrog.com

That really is an awful lot of tickets. The kanban board doesn't show near as many so I'm going to dig into that a bit and see where the discrepancy is.

All right folks - see you on slack, the ML, and the town hall tomorrow!

~Josh