Been a bit over a month; let's check in and see how things are looking. We released the following:
- 3.11.15
- 3.0.29
- 4.0.10
- 4.1.2

Thanks to all the release managers who worked on getting these out the door.

[New Contributors Getting Started]

First off, come hang out with us in the #cassandra-dev channel on https://the-asf.slack.com (reply to me on this email if you need an invite for your account), and reach out to the @cassandra_mentors alias with any questions about the code.

We have a list of hand-curated "starter tickets" available here: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162. Anything in the "ToDo" column is a great candidate to pick up if you want to get your feet wet with the project.

Some other useful links:
Getting Started with Development on C*: https://cassandra.apache.org/_/development/gettingstarted.html
Building and IDE integration (worktrees are your friend): https://cassandra.apache.org/_/development/ide.html
Code Style: https://cassandra.apache.org/_/development/code_style.html

[Dev mailing list]

https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-4-25|dto=2023-5-31: 52 threads since last email. I have made a mistake in waiting this long. ;)

Jonathan Ellis' thread on vector search reached a conclusion, a follow-up discussion about APIs took place, and a CEP was proposed, voted upon, and passed! Phew.
Thread: https://lists.apache.org/thread/16lc6d02xsfvlvqgn3ooy53pgfddyglc
Proposal on adding a new type for vector search: https://lists.apache.org/thread/0lj1nk9jbhkf1rlgqcvxqzfyntdjrnk0
Poll on syntax: https://lists.apache.org/thread/lkowo1qkxjb5wc3n8v6ov4f0r538h13c
CEP-30 proposal: https://lists.apache.org/thread/v32tgofo0w47bl7stbb9141obfbg5r0x
CEP-30 vote: https://lists.apache.org/thread/7s581j65wtst6968c86hzncbnzrr09oj

Congratulations to Jonathan and everyone else involved for getting that up and running and driving to consensus that rapidly.

Speaking of CEPs, the patch for CEP-28 (Spark bulk writer / reader) via the sidecar was posted: https://lists.apache.org/thread/7pwvlwkg49qm72xnlf0m322fy4fmvxk3. Doug Rohrer also called a vote for CEP-28 and it passed as well! https://lists.apache.org/thread/7kndoo6rjchrlk41hbl8v7sclkvdzkgt Congrats to Doug and everyone else who collaborated on that effort as well. Quite a month for us as a project!

Maxim Muzafarov keeps fighting the good fight on vtables and updating running configurations: https://lists.apache.org/thread/gdtr3vp375d3nyj6h8xo7owth1s556lz.

Jakub's working on getting the ant target for generate-idea-files to behave with JDK17: https://lists.apache.org/thread/o2fdkyv2skvf9ngy9jhpnhvo92qvr17m. Looks like he has a few reviewers on the ticket, but if you're curious you can find that here: https://issues.apache.org/jira/browse/CASSANDRA-18467

Discussion around CEP-29 (CQL NOT operator) continued: https://lists.apache.org/thread/cl4d7yo9q6ygnqstk8hhgm597ywg69d1 It was then voted upon: https://lists.apache.org/thread/rwxc8y0c8johrhqcpxsdkns85rop0fxg and passed! Congratulations Piotr and crew on that; that's a feature I'm sure a lot of our users will appreciate.

Claude Warren has a PR open working with an SSTableDowngrader tool: https://lists.apache.org/thread/wvb8c5svvyvny0b61ybbw0jvxxflog4p. The PR can be found here: https://github.com/apache/cassandra/pull/2045, and this is in relation to the C* JIRA issue https://issues.apache.org/jira/browse/CASSANDRA-8928.

A new release of the in-jvm dtest API went out: https://lists.apache.org/thread/tsn70ox1th1x2vcsc7kfky9jsv1foq61

Maxim Muzafarov reached out to let everyone know about the migration of properties into the CassandraRelevantProperties class: https://lists.apache.org/thread/3g5g5kmk64m54qlyhpmdvxcw8m2vsytz. I'm very happy SonarLint will stop yelling at me about this class of warnings going forward. :)

With SAI appearing as well as ANN Vector search, the topic of how we handle our CREATE INDEX DDL came up courtesy of Caleb Rackliffe: https://lists.apache.org/thread/4jxq1tghvb10f848q5vkq241w39lyw57. Looks like we've managed to distill things down to something we can wrangle to consensus: https://lists.apache.org/thread/oswfj6rsq298dfffw3yzy12q82ybczn7

Our usage of FixVersion continues to evolve: https://lists.apache.org/thread/5ompnd3l76kpwc831h80o1jd1g87dcgy. This thread came up around what FixVersion we apply to tickets that are sub-tasks of epics for approved CEPs that may or may not land in a major. Since we don't know if they're going to be done by the hard cutoff for 5.0, for instance, 5.0 as a release version would be incorrect. And since 5.X is historically reserved for "5.0-targeting but not yet merged", we end up in a bind there. Benedict definitely brought me around to the approach of having: FIXVERSION = 5.0-target, and upon merge of the parent epic we can update all children tickets to whatever the parent has. No real strong consensus yet on the thread, but moving away from something that requires explaining (5.X) to something that's immediately apparent (5.0-target) seems like an easy win to me. :)

Jonathan Ellis reached out about agrona vs. fastutil, and we have some moderately strong feelings about libraries on the project: https://lists.apache.org/thread/bh4cd8z367rz18cr6gk4nw8z2kd0bl32. I just did some brief googling on agrona vs. fastutil benchmarks and I think I came away more confused than when I started. I'm sympathetic to Aleksey waxing poetic about the beauty of Agrona, but also remain somewhat sad we don't have some kind of documented or more discoverable place / way to see "here's what's available in the code-base; here's what libraries we depend on". Maybe someone will train an open-source LLM on the C* code-base and make it available as-a-service for folks on the project to chat with to find such things in the future...

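Circling back to the FixVersion thread for a moment: the "5.0-target" idea can be sketched in a few lines. This is only an illustration under stated assumptions. The `Ticket` class, the `resolve_fix_versions` helper, and the ticket keys are all made up for the example; none of it is JIRA's API or tooling the project has. It also makes one choice the thread has not settled: only tickets still carrying the placeholder are overwritten on merge.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder label proposed on the thread for in-flight CEP sub-tasks.
PLACEHOLDER = "5.0-target"


@dataclass
class Ticket:
    """Hypothetical stand-in for a JIRA ticket; not a real API."""
    key: str
    fix_version: str = PLACEHOLDER
    children: List["Ticket"] = field(default_factory=list)


def resolve_fix_versions(epic: Ticket, merged_version: str) -> int:
    """When the parent epic merges, stamp it and every descendant that
    still carries the placeholder with the real release version.

    Returns the number of tickets updated. Tickets that already have a
    concrete FixVersion are left untouched (an assumption of this sketch).
    """
    updated = 0
    stack = [epic]
    while stack:
        ticket = stack.pop()
        if ticket.fix_version == PLACEHOLDER:
            ticket.fix_version = merged_version
            updated += 1
        stack.extend(ticket.children)
    return updated


# Hypothetical epic with one in-flight sub-task and one already released.
epic = Ticket("EPIC-1", children=[
    Ticket("SUB-1"),
    Ticket("SUB-2", fix_version="4.1.2"),
])
count = resolve_fix_versions(epic, "5.0")
```

Here the epic and SUB-1 pick up 5.0 once the epic merges, while SUB-2 keeps the concrete version it already had.
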
Derek Chen-Becker proposed that servers be able to send a SUPPORTED message indicating which authenticators are supported, so clients can gracefully select one: https://lists.apache.org/thread/lvolm2r6q9q39v7rrzl08vwt86xb0yof. Much like myself, he is a wizard with names and led with the incredibly bold: "MyAwesomeAuthenticator". ;)

That thread led to Jacek chiming in with CEP-31, Negotiated authentication and authorization: https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-31+%28DRAFT%29+Negotiated+authentication+and+authorization. If you have experience, thoughts, or curiosity in this space, chime in. (Jacek also formally brought it up for discussion in its own thread here: https://lists.apache.org/thread/z83zzp83lxfx11b7mf45vtdzs7dtryzb)

Stefan Miklosovic has been digging around in cassandra-stress and has some questions about whether simplenative is still relevant: https://lists.apache.org/thread/1jsfx8vz769rb7sgqkh6pw57d6gs7wzh. There's a JIRA following up on this here: https://issues.apache.org/jira/browse/CASSANDRA-18529; could lead to some nice cleanup in stress if adopted.

And last but not least, I went ahead and kicked the hornet's nest about where some things in our ecosystem live, whether they should be submodules or not, and how we can get better adoption. This time, the topic was cassandra-harry: https://lists.apache.org/thread/tzmwt6ngc4wykop3j2f2vwg16j062vgt. It seems like we're narrowing down on a general lazy-consensus worthy stance of bringing it in-tree. Abe summed up the state of things here pretty well:

"This topic has come up quite a few times recently - around shared utilities (CEP-10 concurrency primitives, etc), dtest-api, query parser, etc. The project has tried out a few different approaches on composition of separate projects. Hopefully in the near future we find the one that works best and can start this process of splitting out libraries."

I'm personally still bullish on us refactoring out a shared library with general collections and non-C* specific primitives everything in the ecosystem can rely on (drivers, accord, integrations, sidecar, etc) that we can also more easily reference to see what we have available to work with. As our project continues to grow, the discoverability problem grows with it.

[Checking in on CI]

https://butler.cassandra.apache.org/#/

May has already passed us by. Since April 25th...
3.0: 13 -> 7. Some bouncing around.
3.11: 9 -> 11. High water mark just short of 30, but mostly around the 20 mark.
4.0: 6 -> 14. This one's had a tighter band around 8 failures but spiked to ~70 somewhere in there.
4.1: 4 -> 4. Definitely higher amplitude than 4.0, but mostly looks to be averaging 6-ish failures.
trunk: 10 -> 15. This one's had some more interesting spikes but averaging around maybe 13? 12? failures.

We're keeping things together; I expect the changes that come along with https://issues.apache.org/jira/browse/CASSANDRA-18133 (In-tree build scripts) and the general work on Repeatable ci-cassandra.a.o on https://issues.apache.org/jira/browse/CASSANDRA-18137 will have a positive impact on this as we lead up to release season.

[What's been closed out]

QuickFilter remains accurate: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2278

Closed in the last month:
5 [Bug] (CASSANDRA-18436) Unit tests in org.apache.cassandra.cql3.EmptyValuesTest class occasionally failing with JDK17 (djatnieks)
5 [Bug] (CASSANDRA-18270) ssl-factory demo in examples is broken (maulin.vasavada)
5 [Bug] (CASSANDRA-18471) CEP-15 Accord: NotWitnessed commands can receive an invalidate promise but would return Zero instead (dcapwell)
5 [Improvement] (CASSANDRA-18544) Make cassandra-stress able to read all credentials from a file (smiklosovic)
5 [Improvement] (CASSANDRA-15046) Add a "history" command to cqlsh. Perhaps "show history"? (bschoeni)
5 [Improvement] (CASSANDRA-18441) Improvements to SSTable format configuration (jlewandowski)
5 [Improvement] (CASSANDRA-18525) Add keyspace column to clients virtual table (n.v.harikrishna)
5 [Improvement] (CASSANDRA-17797) All system properties and environment variables should be accessed via the new CassandraRelevantProperties and CassandraRelevantEnv classes (mmuzaf)
5 [Improvement] (CASSANDRA-18485) CEP-15: (C*) Enhance in-memory FileSystem to work with mmap and support tests to add custom logic (dcapwell)
5 [Improvement] (CASSANDRA-8720) Provide tools for finding wide row/partition keys (adelapena)
5 [Improvement] (CASSANDRA-16855) Replace minor use of `json-simple` with Jackson (smiklosovic)
5 [Improvement] (CASSANDRA-18449) Integer(int), Long(long), Float(double) were deprecated in JDK9 (brandon.williams)
5 [Improvement] (CASSANDRA-18364) CEP-15: (C*) Accord message processing should avoid being passed on to a Stage and run directly in the messageing handler (dcapwell)
5 [Improvement] (CASSANDRA-18430) When decommissioning should set Severity to limit traffic (dcapwell)
5 [New Feature] (CASSANDRA-18500) Add guardrail for partition size (adelapena)
5 [New Feature] (CASSANDRA-18352) Add Option to Timebox write timestamps (jwest)
5 [Task] (CASSANDRA-18453) Use WithProperties to ensure that system properties are handled (bernardo.botella)
5 [Task] (CASSANDRA-18519) CEP-15: (C*) Add notion of CommandsForRanges and make this durable in C* (dcapwell)

4.1:
4.1.2 [Bug] (CASSANDRA-18047) fix flaky o.a.c.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair (jonmeredith)
4.1.2 [Improvement] (CASSANDRA-18482) Test Failure: HintsDisabledTest.testHintedHandoffDisabled (jonmeredith)

4.0:
4.0.10 [Bug] (CASSANDRA-18507) Partial compaction can resurrect deleted data (toblin)
4.0.10 [Bug] (CASSANDRA-16718) Changing listen_address with prefer_local may lead to issues (brandon.williams)
4.0.10 [Bug] (CASSANDRA-17918) DESCRIBE output does not quote column names using reserved keywords (smiklosovic)
4.0.10 [Bug] (CASSANDRA-18505) NPE when deserializing malformed collections from client (jonmeredith)
4.0.10 [Improvement] (CASSANDRA-18550) Improve nodetool enable{audit,fullquery}log, CVE-2023-30601 (marcuse)
4.0.10 [Improvement] (CASSANDRA-18400) Nodetool info should report on the new networking cache (qannap)
4.0.10 [Improvement] (CASSANDRA-18260) Add details to Error message: Not enough space for compaction (henrik.ingo)
4.0.10 [Improvement] (CASSANDRA-18474) Incremental repairs fail on mixed IPv4/v6 addresses serializing SyncRequest (jonmeredith)

3.0/3.11:
3.0.29 [Bug] (CASSANDRA-18552) Debian packaging source should exclude git subdirectory (mck)
3.0.29 [Bug] (CASSANDRA-18472) Docker images can no longer be built due to virtualenv from pip (brandon.williams)
3.0.29 [Bug] (CASSANDRA-18497) snakeyaml vulnerability: CVE-2023-2251 (brandon.williams)
3.0.29 [Bug] (CASSANDRA-18336) Do not remove SSTables when cause of FSReadError is OutOfMemoryError while using best_effort disk failure policy (smiklosovic)
3.0.29 [Bug] (CASSANDRA-18105) TRUNCATED data come back after a restart or upgrade (smiklosovic)
3.0.29 [Improvement] (CASSANDRA-18401) Investigate preloading ccm repositories in the docker image (brandon.williams)
3.0.29 [Improvement] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters (smiklosovic)
3.0.30 [Bug] (CASSANDRA-17302) Test Failure: dtest-offheap.topology_test.TestTopology.test_decommissioned_node_cant_rejoin (brandon.williams)
3.0.30 [Bug] (CASSANDRA-18025) cassandra-stress: not all contact point are passed down to driver (smiklosovic)
3.11.15 [Bug] (CASSANDRA-17919) Capital P gets confused in the parser for a Duration in places where IDENT are needed (maximc)
3.11.16 [Bug] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement (ivansenic)

Analytics / Sidecar:
1.0.0 [Improvement] (CASSANDRA-18548) [Analytics] Add .asf.yaml file to the Cassandra Analytics repository (frankgh)

To Fix FixVersion:
5.x [Improvement] (CASSANDRA-18398) CEP-25: Trie-indexed SSTable format (blambov)
5.x [New Feature] (CASSANDRA-18344) Store PreAccept, Accept, Commit, and Apply in a durable log before processing by CommandStores (aleksey)

N/A (probably need to fix some and/or use 5.0-target as discussed above):
NA [Improvement] (CASSANDRA-18521) Unify CQLTester#waitForIndex and SAITester#waitForIndexQueryable (adelapena)
NA [Improvement] (CASSANDRA-18537) Add JMX utility class to in-jvm dtest to ease development of new tests using JMX ()
NA [Improvement] (CASSANDRA-18217) Allow CQL queries on multiple indexes without ALLOW FILTERING (adelapena)
NA [New Feature] (CASSANDRA-16222) CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics (frankgh)
NA [Task] (CASSANDRA-18531) BLOG - EOL Announcement for 3.x and 3.0.x (patrick)
[Improvement] (CASSANDRA-10175) cassandra-stress should be tolerant when a remote node shutdown (smiklosovic)
[Improvement] (CASSANDRA-18523) CEP-15: (Accord) Join cluster without full transaction log (benedict)
[Improvement] (CASSANDRA-18175) CEP-15: (Accord) Introduce ExclusiveSyncPoint transactions (benedict)
[Improvement] (CASSANDRA-18524) CEP-15: (Accord) Separate durable and transient listeners (benedict)
[Improvement] (CASSANDRA-18171) CEP-15: (Accord) Faster SimpleProgressLog and BurnTest (benedict)
[Improvement] (CASSANDRA-18174) CEP-15: (Accord/C*) Introduce range transactions (benedict)
[Improvement] (CASSANDRA-18172) CEP-15: (Accord/C*) Refactor Timestamp/TxnId (benedict)
[Improvement] (CASSANDRA-18173) CEP-15: (Accord/C*) Introduce RangeDeps (benedict)

Phew. So yeah, busy month for us all here. Excited to see Trie-based SSTables land (CASSANDRA-18398), 2 CEPs get voted through, and a whole host of bugfixes, improvements, and new features land. 5.0 is shaping up to be an impressive release.

~Josh