Re: [VOTE] Release Apache Cassandra 4.1.0 (take2)
I'm unsure that, without more information, it is very helpful to highlight this in the release notes. We don't even have a strong hypothesis tying this issue to 4.1.0 specifically, and we don't have a general policy of highlighting undiagnosed issues in release notes?

> On 13 Dec 2022, at 00:48, Jon Meredith wrote:
>
> Thanks for the extra time to investigate. Unfortunately no progress on finding the root cause for this issue, just successful bootstraps in our attempts to reproduce. I think highlighting the ticket in the release notes is sufficient and resolving this issue should not hold up the release.
>
> I agree with Jeff that the multiple concurrent bootstraps are unlikely to be the issue - I only mentioned it in the ticket in case I am wrong. Abe or I will update the ticket if we find anything new.
>
>> On Sun, Dec 11, 2022 at 12:33 PM Jeff Jirsa wrote:
>> Concurrent shouldn't matter (they're non-overlapping in the repro). And I'd personally be a bit surprised if table count matters that much.
>>
>> It probably just requires a high core count and enough data that the streams actually interact with the rate limiter
>>
>> On Dec 11, 2022, at 10:32 AM, Mick Semb Wever wrote:
>>>
>>> On Sat, 10 Dec 2022 at 23:09, Abe Ratnofsky wrote: Sorry - responded on the take1 thread: Could we defer the close of this vote until Monday, December 12th after 6pm Pacific Time? Jon Meredith and I have been working through an issue blocking streaming on 4.1 for the last couple of months, and are now testing a promising fix. We're currently working on a write-up, and we'd like to hold the release until the community is able to review our findings.
>>>
>>> Update on behalf of Jon and Abe.
>>>
>>> The issue raised is CASSANDRA-18110.
>>> Host replacements can fail when performed concurrently, or on nodes with a high CPU count and a large number of tables.
>>>
>>> It is still unclear if this is applicable to OSS C*, and if so to what extent users might ever be impacted.
>>> More importantly, there's a simple workaround for anyone that hits the problem.
>>>
>>> Without further information on the table, I'm inclined to continue with 4.1.0 GA (closing the vote in 32 hours), but add a clear message to the release announcement of the issue and workaround. Interested in hearing others' positions, don't be afraid to veto if that's where you're at.
Re: [DISCUSSION] New dependencies for SAI CEP-7
I don't believe we are ready to be prescriptive about how our randomised tests are written.

1) We want as many people to write randomised tests as possible, so do not want to create impediments.
2) We don't, I expect, all agree on what a good randomised test looks like.

I think Mike should include carrot search's utils, and we can see how they look. I certainly do not think Mike should use QuickTheories or CassandraGenerators, as I would not like us to settle on these tools, so it might be a wasted effort. If we want to start discussions about how we might like to coalesce towards a best practice, with tools that support that, we could form a working group - but I think it is not a simple matter right now.

On 14 Dec 2022, at 09:47, Mike Adamson wrote:

Thanks for your detailed response to this. I am definitely not fixed on using carrot for this so am happy to look at a replacement. I wasn't aware of the addition of QuickTheories or CassandraGenerators. A combination of these could easily supply the functionality we need for the SAI testing. The Generators could definitely replace the functionality in SAIRandomizedTest. I will take a look at these and see if we can work without the carrot generators, and will report back in a couple of days on this thread if I can do this easily. As an aside, Caleb and I have already spoken about adding support to Harry for SAI and using this for more large-scale randomized testing of SAI.

On Tue, 13 Dec 2022 at 18:24, Josh McKenzie wrote:

Whatever we decide on, let's make sure we document it so newcomers on the project (or really anyone new to property based testing) can better discover those things. https://cassandra.apache.org/_/development/testing.html

On Tue, Dec 13, 2022, at 1:08 PM, David Capwell wrote:

Speaking to Caleb in Slack, so putting the main comments I have there here… I am not -1 on this new dependency, but am more asking what we should use for random testing moving forward…. ATM we have the following:

1) QuickTheories - I feel like I am the only user at this point…
2) 1-off - many reinvent random testing for a specific class; using Random, ThreadLocalRandom, UUID.randomUUID(), and lang3 classes (such as org.apache.commons.lang3.RandomUtils)
3) Harry - even though the main API is for cluster testing, this is built on top of random generation so could be used for low level random testing (just less fleshed out for this use-case)
4) Simulator - same as Harry, built on top of a random generator and not fleshed out for low level random testing

Another reason I ask this is that I have a fuzz test that I have developed for Accord that generates random, valid CQL statements to make sure we "do the right thing", and I have been struggling with the questions "where do I put this" and "what random do I use?". I built this off QuickTheories as I have a lot of utilities for building all supported Tables and Types, so can bootstrap really quickly, and every other random testing thing we have is less fleshed out… so if we add yet another random testing library, what "should" we be using?
Do we build on top of it to get to the same level QuickTheories is at (see org.apache.cassandra.utils.Generators, org.apache.cassandra.utils.CassandraGenerators, and org.apache.cassandra.utils.AbstractTypeGenerators)?

On Dec 13, 2022, at 9:21 AM, Caleb Rackliffe wrote:

We need random generators no matter what for these tests, so I think what we need to decide is whether to continue to use Carrot or migrate those to QuickTheories, along the lines of what we have now in org.apache.cassandra.utils.Generators. When it comes to a library like this, the thing I would optimize for is how much it already provides (and therefore how much we need to write and maintain ourselves). If you look at something like NumericTypeSortingTest in the 18058 branch, it's pretty compact w/ Carrot's RandomizedTest in use, but I suppose it could also use IntegersDSL from QT... (Not that it matters, but just for reference, we do use com.carrotsearch.hppc already.)

On Tue, Dec 13, 2022 at 10:14 AM Mike Adamson wrote:

Can you talk more about why? There are several ways to do random testing in-tree ATM, so wondering why we need another one

I can see one mechanism for random testing in-tree. That is the Simulator, but that seems primarily involved in the random orchestration of operations. My apologies if I have simplified its significance. Apart from that, I can only see different usages of Random in unit tests. I admit I have not looked beyond this at dtests. The random testing in SAI is more focussed on the behaviour of the low-level index structures and the flow of data to / from these. Using randomly generated values in tests has proved invaluable in highlighting edge conditions in the code. The above library was only added to provide us with a rich set of random generators. I am happy to look at removing this library if its inclusion is contentious.

On Mon, 12 Dec 2022 at 19:41
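For readers unfamiliar with the libraries being compared above, the following is a minimal sketch of a property-based unit test written against QuickTheories. The round-trip property, the encode/decode helpers and the class name are invented for illustration; they are not taken from the SAI branch or from org.apache.cassandra.utils.Generators.

```
import static org.quicktheories.QuickTheory.qt;
import static org.quicktheories.generators.SourceDSL.integers;

import java.nio.ByteBuffer;

import org.junit.Test;

public class ExampleRoundTripTest
{
    // Hypothetical unit under test: a trivial int <-> bytes codec.
    private static byte[] encode(int value)
    {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    private static int decode(byte[] bytes)
    {
        return ByteBuffer.wrap(bytes).getInt();
    }

    // Property: decoding an encoded value yields the original, for any int.
    // QuickTheories generates the inputs and shrinks any failing example.
    @Test
    public void roundTripIsIdentity()
    {
        qt().forAll(integers().all())
            .check(value -> decode(encode(value)) == value);
    }
}
```

A failing property is reported with a shrunk counterexample and the seed needed to reproduce it, which is the main draw of libraries like this over ad-hoc use of Random in each test.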
Re: [VOTE] CEP-25: Trie-indexed SSTable format
+1 > On 19 Dec 2022, at 13:00, Branimir Lambov wrote: > > > Hi everyone, > > I'd like to propose CEP-25 for approval. > > Proposal: > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format > Discussion: https://lists.apache.org/thread/3dpdg6dgm3rqxj96cyhn58b50g415dyh > > The vote will be open for 72 hours. > Votes by committers are considered binding. > A vote passes if there are at least three binding +1s and no binding vetoes. > > Thank you, > Branimir
Re: [DISCUSS] CEP-26: Unified Compaction Strategy
I'm personally very excited by this work. Compaction could do with a spring clean, and this feels like it formalises things much more cleanly, but density tiering in particular is something I've wanted to incorporate for years now, as it should significantly improve STCS behaviour (most importantly reducing read amplification and the amount of disk space required, narrowing the performance delta to LCS in these important dimensions), and it simplifies re-levelling of LCS, making large streams much less painful.

On 21 Dec 2022, at 07:19, Henrik Ingo wrote:

I noticed the CEP doesn't link to this, so it should be worth mentioning that the UCS documentation is available here: https://github.com/datastax/cassandra/blob/ds-trunk/doc/unified_compaction.md

Both of the above seem to do a poor job referencing the literature we've been inspired by. I will link to Mark Callaghan's blog on the subject: http://smalldatum.blogspot.com/2018/07/tiered-or-leveled-compaction-why-not.html?m=1

...and lazily will also borrow from Mark a post that references a bunch of LSM (not just UCS related) academic papers: http://smalldatum.blogspot.com/2018/08/name-that-compaction-algorithm.html?m=1

Finally, it's perhaps worth mentioning that UCS has been in production in our Astra Serverless cloud service since it was launched in March 2021. The version described by the CEP therefore already incorporates some improvements based on observed production behaviour.

Henrik

On Mon, 19 Dec 2022, 15:41 Branimir Lambov, wrote:

Hello everyone,

I would like to open the discussion on our proposal for a unified compaction strategy that aims to solve well-known problems with compaction and improve parallelism to permit higher levels of sustained write throughput.

The proposal is here: https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy

The strategy is based on two main observations:
- that tiered and levelled compaction can be generalized as the same thing if one observes that both form exponentially-growing levels based on the size of sstables (or non-overlapping sstable runs) and trigger a compaction when more than a given number of sstables are present on one level;
- that instead of "size" in the description above we can use "density", i.e. the size of an sstable divided by the width of the token range it covers, which permits sstables to be split at arbitrary points when the output of a compaction is written and still produce a levelled hierarchy.

The latter allows us to shard the compaction space into progressively higher numbers of shards as data moves to the higher levels of the hierarchy, improving parallelism, space requirements and the duration of compactions, and the former allows us to cover the existing strategies, as well as hybrid mixtures that can prove more efficient for some workloads.

Thank you,
Branimir
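To make the density observation concrete, here is a toy sketch of the arithmetic the proposal describes: density is an sstable's on-disk size divided by the fraction of the token space it covers, and levels grow exponentially from a base size by a fanout factor. The class, the fanout of 4 and the 100 MiB base are invented parameters for illustration, not the defaults proposed by the CEP.

```
public final class DensityExample
{
    static final double FANOUT = 4.0;              // illustrative fanout between levels
    static final double BASE_BYTES = 100L << 20;   // illustrative base size (100 MiB)

    /**
     * Density as described in the CEP: on-disk size divided by the fraction of the
     * token space the sstable covers, so splitting a compaction's output at arbitrary
     * token boundaries does not change the level its data lands on.
     */
    static double density(long sizeInBytes, double coveredTokenFraction)
    {
        return sizeInBytes / coveredTokenFraction;
    }

    /**
     * Exponentially growing levels: level 0 holds densities below BASE_BYTES, and
     * level n (n >= 1) holds densities in [BASE_BYTES * FANOUT^(n-1), BASE_BYTES * FANOUT^n).
     */
    static int level(double density)
    {
        if (density < BASE_BYTES)
            return 0;
        return (int) Math.floor(Math.log(density / BASE_BYTES) / Math.log(FANOUT)) + 1;
    }

    public static void main(String[] args)
    {
        // A 50 MiB sstable covering 1/8 of the token space has the same density
        // (400 MiB) as a 400 MiB sstable covering the whole space, so both land
        // on the same level despite their very different sizes.
        System.out.println(level(density(50L << 20, 0.125)));
        System.out.println(level(density(400L << 20, 1.0)));
    }
}
```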
Re: [DISCUSSION] Cassandra's code style and source code analysis
I like 3 or 4. We need to be sure we have a way of deactivating the check with code comments though, as Java 8 has a bug with import order that can rarely break compilation, so we need some mechanism for permitting a different import order. Did we decide any changes to star imports?

> On 22 Dec 2022, at 14:53, Maxim Muzafarov wrote:
>
> Hello everyone, have a great vacation and happy holidays to all!
>
> I've done some small research into how class import-order rules are applied across Apache projects. Some of the projects don't have any restrictions over the imports even if they are using a checkstyle configuration. Others may have a consensus on imports, but it is not reflected in their checkstyle yet (e.g. Kafka). The conclusion here can only be that there is very large variability in class import order, so we have to agree on the order on our own.
>
> You can find the projects, IDEs and frameworks and their corresponding import orders below:
> https://mmuzaf.github.io/blog/Java_Import_Orders.html
>
> Most of the time during development in an IDE the class imports remain collapsed, so from my point of view the following considerations around imports come first:
>
> - a PR review: new imports must be clearly visible;
> - try to minimize the total number of affected files;
> - the import order rule must be implemented in a simple way and well supported by IDEs and their plugins;
>
> In addition to that last point, checkstyle itself also has some limitations. For instance, ImportOrder has a by-design limitation in enforcing an empty line between groups ("java", "javax"), and CustomImportOrder may have only up to 4 custom groups separated by a blank line.
>
> Based on all of the above I can propose the following import orders. All of them are tested on the latest changes from the trunk branch (commit hash: b171b4ba294126e985d0ee629744516f19c8644e).
>
> 1. Total 2 groups, 3072 files to change
>
> ```
> all other imports
> [blank line]
> static all other imports
> ```
>
> 2. Total 3 groups, 2345 files to change
>
> ```
> java.*
> javax.*
> [blank line]
> all other imports
> [blank line]
> static all other imports
> ```
>
> 3. Total 5 groups, 2968 files to change
>
> ```
> org.apache.cassandra.*
> [blank line]
> java.*
> [blank line]
> javax.*
> [blank line]
> all other imports
> [blank line]
> static all other imports
> ```
>
> 4. Total 5 groups, 1792 files to change
>
> ```
> java.*
> javax.*
> [blank line]
> com.*
> net.*
> org.*
> [blank line]
> org.apache.cassandra.*
> [blank line]
> all other imports
> [blank line]
> static all other imports
> ```
>
> 5. Total 2 groups, 3114 files to change
>
> ```
> java.*
> javax.*
> org.apache.cassandra.*
> all other imports
> [blank line]
> static all other imports
> ```
>
> Of course, any suggestions are really appreciated.
> Please, share your thoughts.
>
>> On Thu, 15 Dec 2022 at 17:48, Mick Semb Wever wrote:
>>> Another angle I forgot to mention is that this is quite a big patch and there are quite big pieces of work coming, be it CEP-15, for example. So I am trying to figure out if we are ok to just merge this work first and devs doing CEP-15 will need to rework their imports, or we merge this after them so we will fix their stuff. I do not know what is more preferable.
>>
>> Thank you for bringing this point up Stefan.
>>
>> I would be actively reaching out to all those engaged with current CEPs, asking them about the rebase impact this would cause and whether they are ok with it. The CEPs are our priority, and we have a significant number of them in progress compared to anything we've had for many years.
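Purely as an illustration of option 4 above, this is how the import header of a hypothetical source file would be grouped (java/javax, then other com/net/org third-party packages, then org.apache.cassandra, with statics last). The imports are unused placeholders and assume the usual Cassandra third-party dependencies are on the classpath.

```
import java.util.List;
import javax.annotation.Nullable;

import com.google.common.collect.ImmutableList;
import net.jpountz.lz4.LZ4Factory;
import org.slf4j.Logger;

import org.apache.cassandra.config.DatabaseDescriptor;

import static java.util.Collections.emptyList;

public class ImportOrderExample
{
    // Intentionally empty: only the grouping of the imports above matters here.
}
```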
Merging CEP-15 to trunk
Hi Everyone, I hope you all had a lovely holiday period. Those who have been following along will have seen a steady drip of progress into the cep-15-accord feature branch over the past year. We originally discussed that feature branches would merge periodically into trunk, and we are long overdue. With the release of 4.1, it’s time to rectify that. Barring complaints, I hope to merge the current state to trunk within a couple of weeks. This remains a work in progress, but will permit users to experiment with the alpha version of Accord and provide feedback, as well as phase the changes to trunk.
Intra-project dependencies
Those of us who have developed the in-jvm-dtest-api will know that the project's approach to developing libraries is untenable for more complex projects. Accord makes this a pressing concern, but we would also benefit from separating utilities into their own library for use by the dtest-api and Accord, and the dtest-api itself could then be evolved more easily.

I see basically four options:

1. Continue requiring a release vote for every library change prior to importing it to another project
2. Bring libraries into the C* tree
3. Deploy snapshots for our internal modules, that we import until release (at which point we publish real jars)
4. Use git submodules

I think (4) is the only sensible option. It permits different development branches to easily reference different versions of a library and also to easily co-develop them - from within the same IDE project, even. We might even be able to avoid additional release votes as a matter of course, by compiling the library source as part of the C* release, so that they adopt the C* release vote (or else we may periodically release the library as we do other releases).

(1) is unworkable, as this means a release vote for every patch that affects both a library and a module that imports it. Even for the dtest-api, this has been excruciating and has led to workarounds.

(2) incurs additional development work porting changes between C* versions that share a logical library version, and the potential for the "same" version to drift unintentionally.

(3) makes parallel branch development trickier, as each needs its own unique snapshot release, and release votes become more complicated as we require a chain of releases, one for each dependency.

I might be missing something, does anyone have any other bright ideas for approaching this problem? I'm sure there are plenty of opinions out there.
Re: Intra-project dependencies
I guess option 5 is what we have today in cep-15: have the build file grab the relevant SHA for the library. This way you maintain a precise SHA for builds and scripts don't have to be modified. I believe this is also possible with git submodules, but I'm happy to bake this into our build file instead with a script.

> As the library itself no longer has an explicit version, what I presume you meant by logical version.

I mean that we don't want to duplicate work and risk diverging functionality maintaining what is logically (meant to be) the same code. As a developer, managing all of the branches is already a pain. Libraries naturally have a different development cadence to the main project, and tying the development to C* versions is just an unnecessary ongoing burden (and risk) that we can avoid.

There's also an additional penalty: we reduce the likelihood of outside contributions to the libraries. Accord in particular I hope will attract outside interest if it is maintained as a separate library, as it has broad applicability, and is likely of academic interest. Tying it to the C* version and coupling it more tightly with the C* codebase makes that less likely. We might also see folk interested in our utilities, or our simulator framework, if they were to be maintained separately, which could be valuable.

> On 16 Jan 2023, at 10:49, Mick Semb Wever wrote:
>
>> I think (4) is the only sensible option. It permits different development branches to easily reference different versions of a library and also to easily co-develop them - from within the same IDE project, even.
>
> I've only heard horror stories about submodules. The challenges they bring should be listed and checked.
>
> Some examples:
> - you can no longer just `git clone …` (and we clone automatically in a number of places)
> - same with `git pull …` (easy to be left with out-of-sync submodules)
> - permanence from a git SHA no longer exists
> - our releases get more complicated (our source tarballs are the asf releases)
> - handling patches cover submodules
> - switching branches, and using git worktrees, during dev
>
> I see (4) as a valid option, but concerned with the amount of work required to adapt to it, and whether it will only make it more complicated for the new contributor to the project. For example the first two points are addressed by remembering to do `git clone --recurse-submodules …`. And who would be fixing our build/test/release scripts to accommodate?
>
> Not blockers, just concerns we need to raise and address.
>
>> We might even be able to avoid additional release votes as a matter of course, by compiling the library source as part of the C* release, so that they adopt the C* release vote (or else we may periodically release the library as we do other releases)
>
> Yes. Today we do a combination of first (3) and then (1). Having to make a release of these libraries every time a patch (/feature branch) is completing is a horror story in itself.
>
>> I might be missing something, does anyone have any other bright ideas for approaching this problem? I'm sure there are plenty of opinions out there.
>
> Looking at the problem with these libraries,
> - we don't need releases
> - we don't have a clean version/branch parity to in-tree
> - codebase parity between branches is important for upgrade tests (shared classloaders)
>
> For (2) you mention drift of the "same" version, isn't this only a problem for dtest-api in the way it requires the "same version" of a codebase for compatibility when running upgrade tests? As the library itself no longer has an explicit version, what I presume you meant by logical version.
>
> To begin with, I'm leaning towards (2) because it is a cognitive re-use of our release branches, and the problems around classpath compatibility can be solved with tests. I'm sure I'm not seeing the whole picture though…
Re: Merging CEP-15 to trunk
My goal isn't to ask if others believe we have the right to merge, only to invite feedback if there are any specific concerns. Large pieces of work like this cause headaches and concerns for other contributors, and so it's only polite to provide notice of our intention, since probably many haven't even noticed the feature branch developing.

The relevant standard for merging a feature branch, if we want to rehash that, is that it is feature- and bug-neutral by default, ie that a release could be cut afterwards while maintaining our usual quality standards, and that the feature is disabled by default, yes. It is not however feature-complete or production ready as a feature; that would prevent any incremental merging of feature development.

> On 16 Jan 2023, at 15:57, J. D. Jordan wrote:
>
> I haven't been following the progress of the feature branch, but I would think the requirements for merging it into master would be the same as any other merge.
>
> A subset of those requirements being:
> Is the code to be merged of releasable quality? Is it disabled by a feature flag by default if not?
> Do all the tests pass?
> Has there been review and +1 by two committers?
>
> If the code in the feature branch meets all of the merging criteria of the project then I see no reason to keep it in a feature branch forever.
>
> -Jeremiah
>
>> On Jan 16, 2023, at 3:21 AM, Benedict wrote:
>>
>> Hi Everyone, I hope you all had a lovely holiday period.
>>
>> Those who have been following along will have seen a steady drip of progress into the cep-15-accord feature branch over the past year. We originally discussed that feature branches would merge periodically into trunk, and we are long overdue. With the release of 4.1, it's time to rectify that.
>>
>> Barring complaints, I hope to merge the current state to trunk within a couple of weeks. This remains a work in progress, but will permit users to experiment with the alpha version of Accord and provide feedback, as well as phase the changes to trunk.
Re: Intra-project dependencies
"we can always change this later"... It seems to me that especially now, and probably also after 5.0 is released, we will in any case only have a single version of Cassandra using a single version of Accord. So at least to begin with, it's the least effort to keep it in-tree, to avoid the overhead of git submodules, or having to make releases, etc. The separate constituency of Accord-only developers can be satisfied by keeping Accord in its own directory, which could even be a top-level directory, with a small build system that can build a separate Accord jar file. You could even maintain a separate github repo just for advertising purposes. (Just like github.com/apache/cassandra isn't the official git repo for Cassandra either.)

If both of my assumptions above are true, then from a Cassandra point of view there's not much benefit to having Accord separately, but if 3rd party interest in Accord grows, then it could indeed be split out into its own repository at that point. The main motivation then would be to service those 3rd party developers who aren't so interested in Cassandra. But this split would only be done once it is known that such a community will form.

Thoughts?

henrik

On Mon, Jan 16, 2023 at 2:30 PM Josh McKenzie <jmcken...@apache.org> wrote:

> - permanence from a git SHA no longer exists

With the caveat that I haven't worked w/submodules before and only know about them from a cursory search, it looks like git-submodule status would show us the sha for submodules, and we could have parent projects reference specific shas to pull for submodules to build? https://git-scm.com/docs/git-submodule/#Documentation/git-submodule.txt-status--cached--recursive--ltpathgt82308203

It seems like our use case is one of the primary ones git submodules are designed to address.

On Mon, Jan 16, 2023, at 6:40 AM, Benedict wrote:

I guess option 5 is what we have today in cep-15: have the build file grab the relevant SHA for the library. This way you maintain a precise SHA for builds and scripts don't have to be modified. I believe this is also possible with git submodules, but I'm happy to bake this into our build file instead with a script.

> As the library itself no longer has an explicit version, what I presume you meant by logical version.

I mean that we don't want to duplicate work and risk diverging functionality maintaining what is logically (meant to be) the same code. As a developer, managing all of the branches is already a pain. Libraries naturally have a different development cadence to the main project, and tying the development to C* versions is just an unnecessary ongoing burden (and risk) that we can avoid.

There's also an additional penalty: we reduce the likelihood of outside contributions to the libraries. Accord in particular I hope will attract outside interest if it is maintained as a separate library, as it has broad applicability, and is likely of academic interest. Tying it to the C* version and coupling it more tightly with the C* codebase makes that less likely. We might also see folk interested in our utilities, or our simulator framework, if they were to be maintained separately, which could be valuable.

On 16 Jan 2023, at 10:49, Mick Semb Wever <m...@apache.org> wrote:

> I think (4) is the only sensible option. It permits different development branches to easily reference different versions of a library and also to easily co-develop them - from within the same IDE project, even.

I've only heard horror stories about submodules.
The challenges they bring should be listed and checked.

Some examples:
- you can no longer just `git clone …` (and we clone automatically in a number of places)
- same with `git pull …` (easy to be left with out-of-sync submodules)
- permanence from a git SHA no longer exists
- our releases get more complicated (our source tarballs are the asf releases)
- handling patches cover submodules
- switching branches, and using git worktrees, during dev

I see (4) as a valid option, but concerned with the amount of work required to adapt to it, and whether it will only make it more complicated for the new contributor to the project. For example the first two points are addressed by remembering to do `git clone --recurse-submodules …`. And who would be fixing our build/test/release scripts to accommodate?

Not blockers, just concerns we need to raise and address.

> We might even be able to avoid additional release votes as a matter of course, by compiling the library source as part of the C* release, so that they adopt the C* release vote (or else we may periodically release the library as we do other releases)

Yes. Today we do a combination of first (3) and then (1). Having to make a release of these libraries every time a patch (/feature branch) is completing is a horror story in itself.

> I might be missing something, does anyone have any other bright ideas for approaching this problem? I'm sure there are plenty of opinions out there.

Looking at the problem with
Re: Merging CEP-15 to trunk
That's fair, though for long term contributors probably the risk is relatively low on that front. I guess that's something we can perhaps raise as part of each CEP if we envisage it taking several months of development?

> Did we document this or is it in an email thread somewhere?

It's probably buried in one of the many threads we've had about related topics on releases and development. We've definitely discussed feature branches before, and I recall discussing a goal of merging ~quarterly. But perhaps like most sub topics it didn't get enough visibility, in which case this thread I suppose can serve as a dedicated rehash and we can formalise whatever falls out.

In theory, as Jeremiah says, there's only the normal merge criteria. But that includes nobody saying no to a piece of work or raising concerns, and advertising the opportunity to say no is important for that IMO.

On 16 Jan 2023, at 16:36, J. D. Jordan wrote:

My only concern with merging (given all normal requirements are met) would be if there was a possibility that the feature would never be finished. Given all of the excitement and activity around accord, I do not think that is a concern here. So I see no reason not to merge incremental progress behind a feature flag.

-Jeremiah

On Jan 16, 2023, at 10:30 AM, Josh McKenzie wrote:

> Did we document this or is it in an email thread somewhere?

I don't see it on the confluence wiki, nor does a cursory search of ponymail turn it up.

What was it for something flagged experimental?
1. Same tests pass on the branch as on the root it's merging back to
2. 2 committers' eyes on (author + reviewer or 2 reviewers, etc)
3. Disabled by default w/flag to enable

So really only the 3rd thing is different, right? Probably ought to add an informal step 4, which Benedict is doing here, which is "hit the dev ML w/a DISCUSS thread about the upcoming merge so it's on people's radar and they can coordinate".

On Mon, Jan 16, 2023, at 11:08 AM, Benedict wrote:

My goal isn't to ask if others believe we have the right to merge, only to invite feedback if there are any specific concerns. Large pieces of work like this cause headaches and concerns for other contributors, and so it's only polite to provide notice of our intention, since probably many haven't even noticed the feature branch developing.

The relevant standard for merging a feature branch, if we want to rehash that, is that it is feature- and bug-neutral by default, ie that a release could be cut afterwards while maintaining our usual quality standards, and that the feature is disabled by default, yes. It is not however feature-complete or production ready as a feature; that would prevent any incremental merging of feature development.

> On 16 Jan 2023, at 15:57, J. D. Jordan <jeremiah.jor...@gmail.com> wrote:
>
> I haven't been following the progress of the feature branch, but I would think the requirements for merging it into master would be the same as any other merge.
>
> A subset of those requirements being:
> Is the code to be merged of releasable quality? Is it disabled by a feature flag by default if not?
> Do all the tests pass?
> Has there been review and +1 by two committers?
>
> If the code in the feature branch meets all of the merging criteria of the project then I see no reason to keep it in a feature branch forever.
>
> -Jeremiah
>
>> On Jan 16, 2023, at 3:21 AM, Benedict wrote:
>>
>> Hi Everyone, I hope you all had a lovely holiday period.
>> >> Those who have been following along will have seen a steady drip of progress into the cep-15-accord feature branch over the past year. We originally discussed that feature branches would merge periodically into trunk, and we are long overdue. With the release of 4.1, it’s time to rectify that. >> >> Barring complaints, I hope to merge the current state to trunk within a couple of weeks. This remains a work in progress, but will permit users to experiment with the alpha version of Accord and provide feedback, as well as phase the changes to trunk.
Re: Merging CEP-15 to trunk
Could you file a bug report with more detail about which classes you think are lacking adequate documentation in each project, and what you would like to see? We would welcome your participation.

On 16 Jan 2023, at 17:28, Jacek Lewandowski wrote:

Hi,

It would be great if some documentation got added to the code you want to merge. To me, it would be enough to just quickly characterize, at the class level, what the class is for and what the expectations are. This is especially important for the Accord API classes, because right now it is hard to review whether the implementation in Cassandra conforms to the API requirements. Given it is going to be possible for others to try Accord before the release, it would be good to create some CQL syntax documentation, something like a chapter in https://cassandra.apache.org/doc/latest/cassandra/cql/index.html but for the unreleased Cassandra version, or a blog post, so that the syntax is known to the users and they can quickly get up to speed, hopefully reporting any problems soon.

- - -- --- - -
Jacek Lewandowski

On Mon, 16 Jan 2023 at 17:52, Benedict <bened...@apache.org> wrote:

That's fair, though for long term contributors probably the risk is relatively low on that front. I guess that's something we can perhaps raise as part of each CEP if we envisage it taking several months of development?

> Did we document this or is it in an email thread somewhere?

It's probably buried in one of the many threads we've had about related topics on releases and development. We've definitely discussed feature branches before, and I recall discussing a goal of merging ~quarterly. But perhaps like most sub topics it didn't get enough visibility, in which case this thread I suppose can serve as a dedicated rehash and we can formalise whatever falls out.

In theory, as Jeremiah says, there's only the normal merge criteria. But that includes nobody saying no to a piece of work or raising concerns, and advertising the opportunity to say no is important for that IMO.

On 16 Jan 2023, at 16:36, J. D. Jordan <jeremiah.jor...@gmail.com> wrote:

My only concern with merging (given all normal requirements are met) would be if there was a possibility that the feature would never be finished. Given all of the excitement and activity around accord, I do not think that is a concern here. So I see no reason not to merge incremental progress behind a feature flag.

-Jeremiah

On Jan 16, 2023, at 10:30 AM, Josh McKenzie <jmcken...@apache.org> wrote:

> Did we document this or is it in an email thread somewhere?

I don't see it on the confluence wiki, nor does a cursory search of ponymail turn it up.

What was it for something flagged experimental?
1. Same tests pass on the branch as on the root it's merging back to
2. 2 committers' eyes on (author + reviewer or 2 reviewers, etc)
3. Disabled by default w/flag to enable

So really only the 3rd thing is different, right? Probably ought to add an informal step 4, which Benedict is doing here, which is "hit the dev ML w/a DISCUSS thread about the upcoming merge so it's on people's radar and they can coordinate".

On Mon, Jan 16, 2023, at 11:08 AM, Benedict wrote:

My goal isn't to ask if others believe we have the right to merge, only to invite feedback if there are any specific concerns.
Large pieces of work like this cause headaches and concerns for other contributors, and so it's only polite to provide notice of our intention, since probably many haven't even noticed the feature branch developing.

The relevant standard for merging a feature branch, if we want to rehash that, is that it is feature- and bug-neutral by default, ie that a release could be cut afterwards while maintaining our usual quality standards, and that the feature is disabled by default, yes. It is not however feature-complete or production ready as a feature; that would prevent any incremental merging of feature development.

> On 16 Jan 2023, at 15:57, J. D. Jordan <jeremiah.jor...@gmail.com> wrote:
>
> I haven't been following the progress of the feature branch, but I would think the requirements for merging it into master would be the same as any other merge.
>
> A subset of those requirements being:
> Is the code to be merged of releasable quality? Is it disabled by a feature flag by default if not?
> Do all the tests pass?
> Has there been review and +1 by two committers?
>
> If the code in the feature branch meets all of the merging criteria of the project then I see no reason to keep it in a feature branch forever.
>
> -Jeremiah
>
>> On Jan 16, 2023, at 3:21 AM, Benedict <bened...@apache.org> wrote:
>>
>> Hi Everyone, I hope you all had a lovely holiday period.
>>
>> Those who have been following along will have seen a steady drip of progress into the cep-15-accord feature branch over the past year. We originally discussed that feature branches would merge periodically into trunk, a
Re: Intra-project dependencies
We have a build script that is invoked by ant to grab a specific SHA (or HEAD of a branch). We were previously just grabbing HEAD but this has the problems mentioned elsewhere in the thread, amongst others. I don't think it probably matters much if we use a build script or submodules.

I am driven in part by wanting to maintain the library status and not wanting to discard the work done to maintain this, but no less also by my expectation that tying Accord to the C* version would entail additional maintenance burden (that might in the near term perhaps fall predominantly on me).

I could be wrong in this prediction of course, but it seems to be a one-sided trade. I don't think there's much extra work with separate repositories even in the worst case of a 1:1 mapping, and we can more easily reverse this decision if there's no external interest and we really are just 1:1 for several releases.

That said, clearly we don't want to pursue this approach for every subsystem. So perhaps one of the decisive reasons is indeed the broader utility, but the fact the library is fully decoupled is by itself a strong reason IMO.

I guess an interesting thought exercise to validate this is what other idealised subsystems I might want to apply this approach to. I'll ponder that.

On 16 Jan 2023, at 18:32, Henrik Ingo wrote:

Hi Benedict

At least for my part, again, I'm not (yet) trying to argue for or against a particular alternative. So I think you'll find that if you allow a few more iterations of discussion, we can gravitate to some good consensus. Or failing that, we can at least gravitate around a small number of alternatives and then argue about those :-D

It seems also in your email, the strongest argument for keeping a separate library is your desire or expectation that Accord would attract significant 3rd party interest. And - this is btw also some advice Magnus Carlsen would give - your main argument therefore is, if we expect we need to make a specific move in the future, it's usually best to just do it immediately.

I didn't write it in my previous email, but I did have in mind that one drawback with the proposal of later extracting Accord out of Cassandra into its own repository would be to lose the history of commits. (At least without significant effort to keep/recreate the history.) For example, there could be commits in the Accord history that also edit files in Cassandra. So yes, I agree that if this is a major goal, then keeping Accord development in its own repository is the right choice.

This then leads to the question of whether the link from Cassandra to Accord should be via git sub-modules or via some bash code in the build system. I now remember something that was a major problem for years in the MongoDB CI system, and I believe this is also a problem with our dtests? The nightly CI system would just check out HEAD of each module, and then compile them and run tests. This had the problem that it was impossible to return to a specific failure, say, a week later, and expect to rebuild and retest the same combination, because the system would just check out and build whatever the HEAD was at that date. (The only way to test the actual SHA you had been bisecting or patching was to submit it as a patch to the CI system.
So if a test setup had 5 sub modules, and you were fixing a bug in one of them, you had to "patch" the 4 other ones too, simply because otherwise the CI system wouldn't check out the right position in their history.)

So, whatever method we choose, it's important that our CI system and other tools can know and track the correct and current SHA for each sub-module. Presumably git sub-modules actually are the best answer to this need. How have you dealt with this in Accord so far?

One point: I wouldn't directly compare dtest and Accord though. For a test framework, it's the dtest framework that is consuming a Cassandra version, while for Accord it's Cassandra that depends on a specific Accord version. Because of this, the same solution may or may not be right for both of them.

henrik

On Mon, Jan 16, 2023 at 6:44 PM Benedict <bened...@apache.org> wrote:

How often have we modified Paxos? There are currently no proposals to develop Accord further after the initial release. So I think it is very likely that Accord development will decouple from the Cassandra version, unless there is significant external interest that drives it.

Furthermore, the idea of revisiting this later is problematic. We can't easily decouple Accord if it becomes tightly coupled with Cassandra, which becomes quite likely when the builds are co-dependent. We have spent great effort developing them separately to avoid this. You can't go back later and recover lost interest. How many projects have adopted ZAB, versus Raft?

None of this also addresses the wider need for reform of our approach here, for both the dtest-api and the simulator.

I'm still not clear on the concrete downsides of maintaining a s
Re: Intra-project dependencies
> Benedict, experience based on developing one feature against one branch doesn't face the problems of working, and switching frequently, between branches.

Mick, please take a look at the ongoing development. Over the last week I have been actively developing five separate PRs against each repository at once (ten in total), with not insignificant changes between them. I am quite experienced with actively developing against multiple branches, and with extrapolating this experience to multiple C* versions, and your hypothetical concerns do not invalidate that experience.

> - patches are off submodule SHAs, not the submodule's HEAD,

A SHA would point to the HEAD of a given branch, at the time of merge, just by SHA? I've no idea what you imagine here, but this just ensures that a given SHA of the importing project continues to compile correctly when it is no longer HEAD. It does not mean there's no HEAD that corresponds directly to the SHA of the importing project's HEAD.

> - you need to be making commits to all branches (and forward merging) anyway to update submodule SHAs,

Exactly as you would any library upgrade?

> - if development is active on trunk, and then you need an update on an older branch, you have to accommodate to backporting all those trunk changes (or introduce the same branching in the submodule),

If you do feature development against Accord then you will obviously branch it? You would only make bug fixes to a bug fix branch. I'm not sure what you think is wrong here.

On 16 Jan 2023, at 19:52, Mick Semb Wever wrote:

> - permanence from a git SHA no longer exists
> With the caveat that I haven't worked w/submodules before and only know about them from a cursory search, it looks like git-submodule status would show us the sha for submodules and …

That isn't one SHA, but a collection of SHAs.

I'm thinking about reproducible builds, switching between branches, and git bisecting; this stuff needs to just work. A build that fails fast if a submodule is not on a specific SHA helps, but introduces more problems.

> we could have parent projects reference specific shas to pull for submodules to build? https://git-scm.com/docs/git-submodule/#Documentation/git-submodule.txt-status--cached--recursive--ltpathgt82308203

Yes, we can enforce a 1:1 relationship from parent SHA to submodule SHAs, but then what's the point: you have both the headache of submodules and having to always commit to multiple branches and forward merge.

That is, with fixed parent-to-submodule SHA relationships, these new challenges are introduced:
- patches are off submodule SHAs, not the submodule's HEAD,
- you need to be making commits to all branches (and forward merging) anyway to update submodule SHAs,
- if development is active on trunk, and then you need an update on an older branch, you have to accommodate to backporting all those trunk changes (or introduce the same branching in the submodule),

IMHO submodules are just trading one set of problems for another. And overall life is simpler if we reduce the cognitive burden to just what we have today: forward merging.

Benedict, experience based on developing one feature against one branch doesn't face the problems of working, and switching frequently, between branches.

The problem of wanting an external repository for these libraries to promote external non-cassandra consumers I would solve by exporting the code out of cassandra (not trying to import it). Git history is easy to keep/replicate.
We were talking about doing this with the jamm library, given its primary development is currently with C* but we want it to appear as a standalone library (/github codebase).
Re: Intra-project dependencies
The answer to all your questions is "like any other library" - this is a procedural hack to ease development. There are alternative isomorphic hacks, like compiling source jars from Accord and including them in the C* tree, if it helps your mental model.

> you stated that a goal was to avoid maintaining multiple branches.

No, I stated that a goal was to *decouple* development of Accord from C*. I don't see why you would take that to mean there are no branches of Accord, as that would quite clearly be incompatible with the C* release strategy.

> On 17 Jan 2023, at 07:36, Mick Semb Wever wrote:
>
>>> … extrapolating this experience to multiple C* versions
>
> To include forward-merging, bisecting old history, etc etc. That's a leap of faith that I believe deserves the discussion.
>
>>> - patches are off submodule SHAs, not the submodule's HEAD,
>>
>> A SHA would point to the HEAD of a given branch, at the time of merge, just by SHA? I've no idea what you imagine here, but this just ensures that a given SHA of the importing project continues to compile correctly when it is no longer HEAD. It does not mean there's no HEAD that corresponds directly to the SHA of the importing project's HEAD.
>
> That wasn't my concern. Rather that you need to know in advance when the SHA is not HEAD. You can't commit off a past SHA. Once you find out (and how does this happen?) that the submodule code is not HEAD, what do you then do? What if fast-forwarding the submodule to HEAD's SHA breaks things, do you now have to fix that or introduce branching in the submodule? If the submodule doesn't have releases, is it doing versioning, and if not how are branches distinguished?
>
> Aren't these all fair enquiries to raise?
>
>>> - you need to be making commits to all branches (and forward merging) anyway to update submodule SHAs,
>>
>> Exactly as you would any library upgrade?
>
> Correct. Submodules do not solve/remove the need to commit to multiple branches and forward merge. Furthermore submodules mean at least one additional commit, and possibly twice as many commits.
>
>>> - if development is active on trunk, and then you need an update on an older branch, you have to accommodate to backporting all those trunk changes (or introduce the same branching in the submodule),
>>
>> If you do feature development against Accord then you will obviously branch it? You would only make bug fixes to a bug fix branch. I'm not sure what you think is wrong here.
>
> That's not obvious, you stated that a goal was to avoid maintaining multiple branches. Sure there are benefits to a lazy branching approach, but it contradicts your initial motivations and introduces methodology changes that are worth pointing out. What happens when there are multiple consumers of Accord, and (like the situation we face with jamm) its HEAD is well in front of anything C* is using?
>
> As Henrik states, the underlying problem doesn't change, we're just choosing between trade-offs. My concern is that we're not even doing a very good job of choosing between the trade-offs. Based on past experiences with submodules, which started with great excitement and led to tears and frustration after a few years, I'm only pushing for a more thorough discussion and proposal.
Re: Merging CEP-15 to trunk
> but the pre-commit gateway here is higher than the previous tickets being > worked on Which tickets, and why? > On 17 Jan 2023, at 07:43, Mick Semb Wever wrote: > > > > >> Could you file a bug report with more detail about which classes you think >> are lacking adequate documentation in each project, and what you would like >> to see? > > > I suggest instead that we open a task ticket for the merge. > > I 100% agree with merging work incrementally, under a feature flag, but the > pre-commit gateway here is higher than the previous tickets being worked on. > API changes, pre-commit test results, and high (/entry) level comments, all > deserve any extra eyeballs available. >
Re: Intra-project dependencies
I am certainly not proposing any certainty about outside interest, but I think, as the only full implementation of a leaderless protocol in existence, as well as an open source pluggable distributed transaction protocol, the chance of some interest is not vanishingly small (once it is proven in Cassandra). Whether that will translate to any useful interest is even less certain, but I think it would be valuable if it transpires.

It's also not the only reason to develop this project as a library, and wasn't one of the reasons given when the proposal was initially discussed. The main reasons were ensuring that the transaction system in Cassandra was not tied to Accord, and to allow better testing in isolation.

Regarding the use of snapshots, this isn't impossible Henrik - I floated this as an option. But besides the additional overhead during development, this does not maintain reproducible builds, as the snapshot can change. Perhaps we could introduce permanent snapshots for a given SHA, but at that point it seems to just make more sense to use submodules. Including snapshots inside the lib directory until release would seem to be fine, and then performing parallel release votes.

On 17 Jan 2023, at 21:04, Henrik Ingo wrote:

Hi Derek

Somewhat of a newcomer myself, it seems the answers to your excellent questions are:

* We don't all agree with the premise that Accord will attract substantial outside interest. Even so, my personal opinion is that whether that happens or not, it's not wrong to aspire toward or plan for such a future.
* Yes, just using Accord as a library dependency would be the normal thing to do, but that introduces a need to create Accord releases to match Cassandra releases. Since the ASF mandates a 3 day voting process to release software artifacts, this creates a lot of bureaucratic overhead, which is why this otherwise sane alternative is nobody's favorite. (Cassandra releases cannot or should not depend on snapshot releases of libraries.)
* So we are discussing various alternatives that keep Accord separate, while at the same time recording some link about which exact version of Accord was checked out.

henrik

On Tue, Jan 17, 2023 at 7:23 PM Derek Chen-Becker <de...@chen-becker.org> wrote:

Actually, re-reading the thread, I think I missed the initial point brought up and got lost in the discussion specific to submodules. What is the technical reason for bringing Accord in-tree? While I think submodules are the best way to include source in-tree, I'm not sure this is actually the correct thing to do in this case. Don't we already have mechanisms to deal with snapshot versions of library dependencies in the build? Do we need release votes for snapshots?

Thanks,

Derek

On Tue, Jan 17, 2023 at 9:25 AM Derek Chen-Becker <de...@chen-becker.org> wrote:
>
> I'd like to go back to Benedict's initial point: if we have a new consensus protocol that other projects would potentially be interested in, then by all means it should be its own project. Let's start with that as a basis for discussion, because from my reading it seems like people might be disagreeing with that initial premise.
>
> If we agree that Accord should be independent, I'm +1 for git submodules, primarily because that's a standard way of doing things and I don't think we need yet another bespoke solution to a problem that hundreds, if not thousands of other software projects encounter.
I've > worked with lots of projects using submodules and while they're not a > panacea, they've never been a significant problem to work with. > > It's also a little confusing to see people argue about HEAD in > relation to any of this, since that's just an alias to the latest > commit for a given branch. In every project I've worked with that uses > submodules you would never use HEAD, because the submodule itself > already records the *exact* commit associated with the parent. > > Cheers, > > Derek > > On Tue, Jan 17, 2023 at 2:28 AM Benedict <bened...@apache.org> wrote: > > > > The answer to all your questions is “like any other library” - this is a procedural hack to ease development. There are alternative isomorphic hacks, like compiling source jars from Accord and including them in the C* tree, if it helps your mental model. > > > > > you stated that a goal was to avoid maintaining multiple branches. > > > > No, I stated that a goal was to *decouple* development of Accord from C*. I don’t see why you would take that to mean there are no branches of Accord, as that would quite clearly be incompatible with the C* release strategy. > > > > > > > > On 17 Jan 2023, at 07:36, Mick Semb Wever <m...@apache.org> wrote: > > > > > >> > >> … extrapolating this e
Re: Intra-project dependencies
> You would reference the snapshot dependency by the timestamped snapshot. This makes it a reproducible build.

How confident are we that the repository will not alter or delete them?

> linking in the source code into in-tree is a significant thing to do

Could you explain why? I thought your preferred alternative was merging the source trees permanently

> On 17 Jan 2023, at 21:40, Mick Semb Wever wrote:
>
>> Regarding the use of snapshots, this isn't impossible Henrik - I floated this as an option. But besides the additional overhead during development, this does not maintain reproducible builds, as the snapshot can change.
>
> You would reference the snapshot dependency by the timestamped snapshot. This makes it a reproducible build.
>
> We have done this with dtest-api already, and there's already a comment explaining it:
> https://github.com/apache/cassandra/blob/trunk/.build/build-resolver.xml#L59-L60
>
> It introduces some overhead when bisecting to go from the snapshot's timestamp to the other repo's SHA (this is easily solvable by putting the SHA inside the jarfile).
>
> I don't see the problem of letting trunk use snapshots during the annual development cycle, if we accept the overhead of cutting all library releases before we cut the first alpha/beta.
>
> FTR, I'm sitting on the fence between this and submodules. There are many dev tasks we do, and different approaches have different pain points. The amount of dev happening in the library also matters. I also agree with Derek that linking in the source code into in-tree is a significant thing to do, just to avoid the rigamaroles of dependency management.
>
> Josh, bundling releases gets tricky in that you need to include the library sources, because the cassandra release is essentially being voted on (because it has been built) with non-released dependencies.
Re: Intra-project dependencies
> Linking or merging while it is still also being a separate library and repo.

I am still unclear why you think this is "a significant thing"?

> On 18 Jan 2023, at 10:41, Mick Semb Wever wrote:
>
>>> You would reference the snapshot dependency by the timestamped snapshot. This makes it a reproducible build.
>>
>> How confident are we that the repository will not alter or delete them?
>
> They cannot be altered.
>
> I see artefacts there that are more than a decade old. But we cannot rely on their permanence.
>
> Putting the SHA into the jar's manifest is easy. And this blog post shows how you can also expose this info on the command line:
> https://medium.com/liveramp-engineering/identifying-maven-snapshot-artifacts-by-git-revision-15b860d6228b
>
> Given there's no guaranteed permanence to the snapshots, we would need to have the git sha in the version, so if much older versions can't be downloaded it can still be rebuilt.
>
> This is done like: 1.0.0_${sha1}-SNAPSHOT
>
>>> linking in the source code into in-tree is a significant thing to do
>>
>> Could you explain why? I thought your preferred alternative was merging the source trees permanently
>
> Linking or merging while it is still also being a separate library and repo.
> If we are really not that interested in it as a separate library, and dev change is high, or the code is somewhere less accessible, then in tree makes sense IMHO.
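To illustrate the idea of carrying the SHA inside the snapshot artifact, the following is a minimal sketch that reads a hypothetical "Git-SHA" attribute from a jar's MANIFEST.MF at runtime. The attribute name and class are invented; writing the attribute during the build would be a separate step in the ant script, and this is not an existing Cassandra or dtest-api facility.

```
import java.io.InputStream;
import java.net.URL;
import java.util.jar.Manifest;

public final class ManifestSha
{
    /**
     * Returns the value of a hypothetical "Git-SHA" manifest attribute, or null if absent.
     * Note: on a multi-jar classpath this naive lookup may find another jar's manifest;
     * a robust version would resolve the manifest via the class's CodeSource location.
     */
    public static String gitSha(Class<?> classInJar) throws Exception
    {
        URL url = classInJar.getResource("/META-INF/MANIFEST.MF");
        if (url == null)
            return null;
        try (InputStream in = url.openStream())
        {
            return new Manifest(in).getMainAttributes().getValue("Git-SHA");
        }
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(gitSha(ManifestSha.class));
    }
}
```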
Re: Intra-project dependencies
point of view? - handling patches cover submodules. I don't know what you mean by this, do you mean how do we submit cross-cutting patches? How I do this in the cep-15-accord branch is by updating the pointer to point to my dependency PR, that way the build "does the right thing", I just have to fix this up before merging into Cassandra (have to commit in the "correct" order) - switching branches, and using git worktrees, during dev. What is the concern here? I am using work trees for PR review and cep-15-accord development and have zero issues with this; can you expand more on this? And who would be fixing our build/test/release scripts to accommodate? 100% valid question to ask. I personally am in favor of the proposer doing the work and not depending on specific CI people to do the work for them…. But cool with others helping out… I do feel it's not good to depend on a single CI person to do all this; w/e it is we define. I'm thinking about reproducible builds. Is the concern that checking out a submodule's SHA may not compile, breaking C*? Is there another concern here? Want to fully understand. Switching between branches: this is a pain point that I feel should be handled by git hooks. We have this issue when you try to reuse the same directory for different release branches, and it's super annoying when you go back in time to when jars were in-tree, as you need to clean up after switching back…. I do agree that we should flesh this out as it's going to be the common case, so how we "do the right thing" should be handled. And git bisecting: isn't this just another example of switching branches? If we solve that case then doesn't git bisect come in for free? To include forward-merging. What is the concern here? Rather that you need to know in advance when the SHA is not HEAD. Do you? Or do you really need to know which "branch" it is following? For example, let's say we release 5.0 then 5.1 then 5.2, and there are accord versions for each: 1.0, 1.2, 2.0… do we not really need to know which branch it is following, and only when you are trying to do a cross-cutting change? For example, if I want to fix a bug in 5.1 that modifies accord, I need to know to use the accord-1.2 branch? I think this is a solvable problem with submodules and scripts, but agree we should think about this case as it's going to come up. Correct. Submodules do not solve/remove the need to commit to multiple branches and forward merge. Furthermore submodules mean at least one additional commit, and possibly twice as many commits. We have 4 maintained branches atm, so if there is a bug in accord that needs to be fixed in all 4 we need: 4 commits for C*, 1 to 4 for Accord, depending on release history. If we make sure all branches are using the latest "stable" accord then this is 6 commits (4 for C*, 1 for accord the stable branch, then 1 to merge into trunk). Our current commit process is human controlled, so every commit is a chance for human error. Maybe we should look to improve this? I know I have my own script to avoid human error (which supports jvm/python dtest), maybe it would be best if the project had automation to make sure everyone "does the right thing"? On Jan 18, 2023, at 3:06 AM, Benedict <bened...@apache.org> wrote: Linking or merging while it is still also being a separate library and repo. I am still unclear why you think this is "a significant thing"? On 18 Jan 2023, at 10:41, Mick Semb Wever <m...@apache.org> wrote: You would reference the snapshot dependency by the timestamped snapshot.
This makes it a reproducible build.How confident are we that the repository will not alter or delete them?They cannot be altered.I see artefacts there that are more than a decade old. But we cannot rely on their permanence. Putting the SHA into the jar's manifest is easy. And this blog post shows how you can also expose this info on the command line: https://medium.com/liveramp-engineering/identifying-maven-snapshot-artifacts-by-git-revision-15b860d6228b Given there's no guaranteed permanence to the snapshots, we would need to have the git sha in the version, so if much older versions can't be downloaded it can still be rebuilt.This is done like: 1.0.0_${sha1}-SNAPSHOT linking in the source code into in-tree is a significant thing to doCould you explain why? I thought your preferred alternative was merging the source trees permanentlyLinking or merging while it is still also being a separate library and repo.If we are really not that interested in it as a separate library, and dev change is high, or the code is somewhere less accessible, then in tree makes sense IMHO.
Re: Merging CEP-15 to trunk
These tickets have all met the standard integration requirements, so I'm just unclear what "higher pre-commit gateway" you are referring to. I think the existing epics are probably more natural tickets to reference in the merge, e.g. 17091 and 17092. On 20 Jan 2023, at 11:04, Mick Semb Wever wrote: On Tue, 17 Jan 2023 at 10:29, Benedict <bened...@apache.org> wrote: but the pre-commit gateway here is higher than the previous tickets being worked on. Which tickets, and why? All tickets resolved in the feature branch that you are now bringing from the feature branch into trunk. From a quick scan I see… 17103, 17109, 18041, 18056, 18057, 18087, 17719, 18154, 18142, 18142. All these tickets are resolved but have not been merged to trunk, and they have no fixVersion. Assuming that it won't be a merge as-is (e.g. ninja-fixes will get squashed), I think the clean-up of the feature branch and its merge to trunk warrants a separate task ticket. Such a ticket also helps with the admin (linking to the tickets the merge is bringing in).
Re: Merging CEP-15 to trunk
There is no merge-then-review. The work has been reviewed. This is identical to how reviews work as normal. If it helps your mental model, consider this a convenient atomic merge of many Jira that have independently met the standard project procedural requirements, as that is what it is. Squashing of commits is normal before merging a ticket, and does not typically incur an additional round of review - indeed, it’s not even clear what this would mean, since a squash has no visible effect to review. This is not a question of moving fast, but a question of process. We have out of politeness highlighted the impending merge of a lot of work. This is an invitation to engage on the relevant tickets that already exist to represent the work, not an invitation to create novel adhoc procedural steps. > On 23 Jan 2023, at 22:54, Mick Semb Wever wrote: > > >> The sooner it’s in trunk, the more eyes it will draw, IMO, if you are right >> about most contributors not having paid attention to a feature branch. > > > > We all agree we want the feature branch incrementally merged sooner rather > than later. > IMHO any merge to trunk, and any rebase and squash of ninja-fix commits, > deserves an invite to reviewers. > Any notion of merge-then-review isn't our community precedent. > > I appreciate the desire to not "be left hanging" by creating a merge ticket > that requires a reviewer when no reviewer shows. And the desire to move > quickly on this. > > I don't object if you wish to use this thread as that review process. On the > other hand, if you create the ticket I promise to be a reviewer of it, so as > not to delay. > >
Re: Merging CEP-15 to trunk
No, that is not the normal process. What is it you think you would be reviewing? There are no diffs produced as part of rebasing, and the purpose of review is to ensure code meets our standards, not that the committer is competent at rebasing or squashing. Nor are you familiar with the code as it was originally reviewed, so you would have nothing to compare against. We expect a clean CI run, ordinarily, not an additional round of review. If we were to expect that, it would be by the original reviewer, not a third party, as they are the only ones able to judge the rebase efficiently. I would support investigating tooling to support reviewing rebases. I'm sure such tools and processes exist. But, we don't have them today and it is not a normal part of the review process. If you want to modify, clarify or otherwise stipulate new standards or processes, I suggest a separate thread. > How will the existing tickets make it clear when and where their final merge > happened? By setting the release version and source control fields. > On 24 Jan 2023, at 08:43, Mick Semb Wever wrote: > > >> But it's not merge-then-review, because they've already been reviewed, >> before being merged to the feature branch, by committers (actually PMC >> members)? >> >> You want code that's been written by one PMC member and reviewed by 2 other >> PMC members to be put up for review by some random 4th party? For how long? > > > It is my hope that the work as-is is not being merged. That there is a rebase > and some trivial squashing to do. That deserves a quick check by another. > Ideally this would be one of the existing reviewers (but like any other > review step, no matter how short and trivial it is, that's still an open > process). I see others already doing this when rebasing larger patches before > the final merge. > > Will the branch be rebased and cleaned up? > How will the existing tickets make it clear when and where their final merge > happened? >
Re: Merging CEP-15 to trunk
Perhaps the disconnect is that folk assume a rebase will be difficult and have many conflicts? We have introduced primarily new code with minimal integration points, so I decided to test this. I managed to rebase locally in around five minutes; mostly imports. This is less work than for a rebase of fairly typical ticket of average complexity.Green CI is of course a requirement. There is, however, no good procedural or technical justification for a special review of the rebase.Mick is encouraged to take a look at the code before and after rebase, and will be afforded plenty of time to do so. But I will not gate merge on this adhoc requirement.On 24 Jan 2023, at 15:40, Ekaterina Dimitrova wrote:Hi everyone,I am excited to see this work merged. I noticed the branch is 395 commits behind trunk or not rebased since September last year. I think if Mick or anyone else wants to make a final pass after rebase happens and CI runs - this work can only benefit of that. Squash, rebase and full CI run green is the minimum that, if I read correctly the thread, we all agree on that part. I would say that CI and final check after a long rebase of a patch is a thing we actually do all the time even for small patches when we get back to our backlog of old patches. This doesn’t mean that the previous reviews are dismissed or people not trusted or anything like that.But considering the size and the importance of this work, I can really see only benefit of a final cross-check. As Henrik mentioned me, I am not sure I will have the chance to review this work any time soon (just setting the right expectations up front) but I see at least Mick already mentioning he would do it if there are no other volunteers. Now, whether it will be separate ticket or not, that is a different story. Aren’t the Accord tickets in an epic under which we can document the final rebase, CI runs, etc?On Tue, 24 Jan 2023 at 9:40, Henrik Ingo <henrik.i...@datastax.com> wrote:When was the last time the feature branch was rebased? Assuming it's a while back and the delta is significant, surely it's normal process to first rebase, run tests, and then present the branch for review?To answer your question: The effect of the rebase is then either baked into the original commits (which I personally dislike), or you can also have the rebase-induced changes as their own commits. (Which can get tedious, but has the benefit of making explicit what was only a change due to rebasing.) Depending on which approach you take when rebasing, a reviewer would then review accordingly.henrikOn Tue, Jan 24, 2023 at 11:14 AM Benedict <bened...@apache.org> wrote:No, that is not the normal process. What is it you think you would be reviewing? There are no diffs produced as part of rebasing, and the purpose of review is to ensure code meets out standards, not that the committer is competent at rebasing or squashing. Nor are you familiar with the code as it was originally reviewed, so would have nothing to compare against. We expect a clean CI run, ordinarily, not an additional round of review. If we were to expect that, it would be by the original reviewer, not a third party, as they are the only ones able to judge the rebase efficiently.I would support investigating tooling to support reviewing rebases. I’m sure such tools and processes exist. But, we don’t have them today and it is not a normal part of the review process. 
If you want to modify, clarify or otherwise stipulate new standards or processes, I suggest a separate thread. > How will the existing tickets make it clear when and where their final merge happened? By setting the release version and source control fields. On 24 Jan 2023, at 08:43, Mick Semb Wever <m...@apache.org> wrote: But it's not merge-then-review, because they've already been reviewed, before being merged to the feature branch, by committers (actually PMC members)? You want code that's been written by one PMC member and reviewed by 2 other PMC members to be put up for review by some random 4th party? For how long? It is my hope that the work as-is is not being merged. That there is a rebase and some trivial squashing to do. That deserves a quick check by another. Ideally this would be one of the existing reviewers (but like any other review step, no matter how short and trivial it is, that's still an open process). I see others already doing this when rebasing larger patches before the final merge. Will the branch be rebased and cleaned up? How will the existing tickets make it clear when and where their final merge happened?
Re: Merging CEP-15 to trunk
contributors who didn't actively work on Accord, have assumed that they will be invited to review nowI may have missed it, but I have not seen anyone propose to substantively review the actual work, only the impact of rebasing. Which, honestly, there is plenty of time to do - the impact is essentially nil, and we aren’t planning to merge immediately. I will only not agree to an adhoc procedural change to prevent merge until this happens, as a matter of principle.However, I am very sympathetic to a desire to participate substantively. I think interested parties should have invested as the work progressed, but I don’t think it is unreasonable to ask for a some time prior to merge if this hasn’t happened.So, if you can adequately resource it, we can delay merging a while longer. I want your (constructive) participation. But, I have not seen anything to suggest this is even proposed, let alone realistic.There are currently five full time contributors participating in the Accord project, with cumulatively several person-years of work already accumulated. By the time even another month has passed, you will have another five person-months of work to catch-up on. Resourcing even a review effort to catch up with this is non-trivial, and for it to be a reasonable ask, you must credibly be able to keep up while making useful contributions.After all, if it had been ready to merge to trunk already a year ago, why wasn't it?The Cassandra integration has only existed since late last year, and was not merged earlier to avoid interfering with the effort to release 4.1.One thing that I wanted to ask for is when you push to CI, you or whoever does it, to approve all jobs.Thanks Ekaterina, we will be sure to fully qualify the CI result, and I will make sure we also run your flaky test runner on the newly introduced tests.On 24 Jan 2023, at 21:42, Henrik Ingo wrote:Thanks JoshSince you mentioned the CEP process, I should also mention one sentiment you omitted, but worth stating explicitly:4. The CEP itself should not be renegotiated at this point. However, the reviewers should rather focus on validating that the implementation matches the CEP. (Or if not, that the deviation is of a good reason and the reviewer agrees to approve it.)Although I'm not personally full time working on either producing Cassandra code or reviewing it, I'm going to spend one more email defending your point #1, because I think your proposal would lead to a lot of inefficiencies in the project, and that does happen to be my job to care about: - Even if you could be right, from some point of view, it's nevertheless the case that those contributors who didn't actively work on Accord, have assumed that they will be invited to review now, when the code is about to land in trunk. Not allowing that to happen would make them feel like they weren't given the opportunity and that the process in Cassandra Project Governance was bypassed. We can agree to work differently in the future, but this is the reality now. - Although those who have collaborated on Accord testify that the code is of the highest quality and ready to be merged to trunk, I don't think that can be expected of every feature branch all the time. In fact, I'm pretty sure the opposite must have been the case also for the Accord branch at some point. After all, if it had been ready to merge to trunk already a year ago, why wasn't it? It's kind of the point of using a feature branch that the code in it is NOT ready to be merged yet. 
(For example, the existing code might be of high quality, but the work is incomplete, so it shouldn't be merged to trunk.) - Uncertainty: It's completely ok that some feature branches may be abandoned without ever merging to trunk. Requiring the community (anyone potentially interested, anyways) to review such code would obviously be a waste of precious talent. - Time uncertainty: Also - and this is also true for Accord - it is unknown when the merge will happen if it does. In the case of Accord it is now over a year since the CEP was adopted. If I remember correctly an initial target date for some kind of milestone may have been Summer of 2022? Let's say someone in October 2021 was invested in the quality of Cassandra 4.1 release. Should this person now invest in reviewing Accord or not? It's impossible to know. Again, in hindsight we know that the answer is no, but your suggestion again would require the person to review all active feature branches just in case.As for 2 and 3, I certainly observe an assumption that contributors have expected to review after a rebase. But I don't see this as a significant topic to argue about. If indeed the rebase is as easy as Benedict advertised, then we should just do the rebase because apparently it can be done faster than it took me to write this email :-) (But yes, conversely, it seems then that the rebase is not a big reason to hold off fr
Re: Merging CEP-15 to trunk
ve learned that when I have defended the need (or right, if appealing to the Governance texts...) for contributors to be able to review a feature branch at the time it is merged to trunk - which for Accord is now - that a common reaction to this is that doing a review of Accord now might take months and would stall the Accord project for months if that is allowed.So I just wanted to clarify I don't think that sounds "reasonable", as the word is used in the Governance wiki page. I agree that to engage in such level of review, it would have needed to happen earlier. On the other hand, I can think of many things that a pair of fresh eyes can do at this point in reasonable time, like days or a couple weeks. I spent 6 hours this week glancing over the 28k lines of code that would be added to C* codebase. I was able to form an opinion of the patch, have some fruitful off-list conversations with several people, and as a by-product apparently also caught some commented out code that possibly should be enabled before the merge.henrikOn Wed, Jan 25, 2023 at 5:06 PM Henrik Ingo <henrik.i...@datastax.com> wrote:Thanks BenedictFor brevity I'll respond to your email, although indirectly this is also a continuation of my debate with Josh:At least on my scorecard, one issue was raised regarding the actual code: CASSANDRA-18193 Provide design and API documentation. Since the addition of code comments also significantly impacts the ability of an outsider to understand and review the code, I would then treat it as an unknown to say how much else such a fresh review would uncover.By the way I would say the discussion about git submodules (and all the other alternatives) in a broad sense was also a review'ish comment.That said, yes of course the expectation is that if the code has already been reviewed, and by rather experienced Cassandra developers too, there probably won't be many issues found, and there isn't a need for several weeks of line by line re-review.As for the rebase, I think that somehow started dominating this discussion, but in my view was never the only reason. For me this is primarily to satisfy points 4 and 5 in the project governance, that everyone has had an opportunity to review the code, for whatever reason they may wish to do so.I should say for those of us on the sidelines we certainly expected a rebase catching up 6 months and ~500 commits to have more substantial changes. Hearing that this is not the case is encouraging, as it also suggests the changes to Cassandra code are less invasive than maybe I and others had imagined.henrikOn Wed, Jan 25, 2023 at 1:51 PM Benedict <bened...@apache.org> wrote:contributors who didn't actively work on Accord, have assumed that they will be invited to review nowI may have missed it, but I have not seen anyone propose to substantively review the actual work, only the impact of rebasing. Which, honestly, there is plenty of time to do - the impact is essentially nil, and we aren’t planning to merge immediately. I will only not agree to an adhoc procedural change to prevent merge until this happens, as a matter of principle.However, I am very sympathetic to a desire to participate substantively. I think interested parties should have invested as the work progressed, but I don’t think it is unreasonable to ask for a some time prior to merge if this hasn’t happened.So, if you can adequately resource it, we can delay merging a while longer. I want your (constructive) participation. 
But, I have not seen anything to suggest this is even proposed, let alone realistic.There are currently five full time contributors participating in the Accord project, with cumulatively several person-years of work already accumulated. By the time even another month has passed, you will have another five person-months of work to catch-up on. Resourcing even a review effort to catch up with this is non-trivial, and for it to be a reasonable ask, you must credibly be able to keep up while making useful contributions.After all, if it had been ready to merge to trunk already a year ago, why wasn't it?The Cassandra integration has only existed since late last year, and was not merged earlier to avoid interfering with the effort to release 4.1.One thing that I wanted to ask for is when you push to CI, you or whoever does it, to approve all jobs.Thanks Ekaterina, we will be sure to fully qualify the CI result, and I will make sure we also run your flaky test runner on the newly introduced tests.On 24 Jan 2023, at 21:42, Henrik Ingo <henrik.i...@datastax.com> wrote:Thanks JoshSince you mentioned the CEP process, I should also mention one sentiment you omitted, but worth stating explicitly:4. The CEP itself should not be renegotiated at this point. However, the reviewers should rather focus on validating that the implementation matches the CEP. (Or if not, that the deviation is of a good reason and the reviewer agrees to approve it.)Althoug
Internal Documentation Contribution Guidance
During public and private discussions around CEP-15, it became apparent that we lack any guidance on internal documentation and commenting in the “style guide” - which I also propose we rename to “Contribution Guide” or “Contribution and Style Guide" to better describe the broader role it has taken on. I have put together a starting proposal to kick us off here: https://docs.google.com/document/d/1qt7xmvEAXm_w9ARb2lVutapB_3il2WWEWuq_zYlu_9s There might also be some value in working on some more general guidance on processes like commit, review etc. that also came up. I haven’t tried to address these here, but if anyone wants to work on that I’d be happy to participate.
Re: Internal Documentation Contribution Guidance
Apologies, I think it should be opened up for comments now.On 30 Jan 2023, at 11:29, Ekaterina Dimitrova wrote:Thank you Benedict! Can you, please, give us comment access to the doc?On Mon, 30 Jan 2023 at 6:14, Benedict <bened...@apache.org> wrote:During public and private discussions around CEP-15, it became apparent that we lack any guidance on internal documentation and commenting in the “style guide” - which I also propose we rename to “Contribution Guide” or “Contribution and Style Guide" to better describe the broader role it has taken on.I have put together a starting proposal to kick us off here: https://docs.google.com/document/d/1qt7xmvEAXm_w9ARb2lVutapB_3il2WWEWuq_zYlu_9sThere might also be some value in working on some more general guidance on processes like commit, review etc. that also came up. I haven’t tried to address these here, but if anyone wants to work on that I’d be happy to participate.
Re: Merging CEP-15 to trunk
Review should primarily ask: "is this correct?" and "could this be done differently (clearer, faster, more correct, etc)?" Blocking reviews especially, because why else would a reasonable contributor want to block progress?These questions can of course be asked of implementation details for any CEP. I have said before, a proposal to conduct a blocking review of this kind - if very late in my view - would be valid, though timeline would have to be debated.Reviewers with weaker aspirations have plenty of time available to them - two weeks have already passed, and another couple will likely yet (there isn't a rush). But it is novel to propose that such optional reviews be treated as blocking.On 30 Jan 2023, at 23:04, Henrik Ingo wrote:Ooops, I missed copy pasting this reply into my previous email:On Fri, Jan 27, 2023 at 11:21 PM Benedict <bened...@apache.org> wrote:I'm realizing in retrospect this leaves ambiguityAnother misreading at least of the intent of these clauses, is that they were to ensure that concerns about a design and approach are listened to, and addressed to the satisfaction of interested parties. It was essentially codifying the project’s long term etiquette around pieces of work with either competing proposals or fundamental concerns. It has historically helped to avoid escalation to vetoes, or reverting code after commit. It wasn’t intended that any reason might be invoked, as seems to have been inferred, and perhaps this should be clarified, though I had hoped it would be captured by the word “reasonable". Minor concerns that haven’t been caught by the initial review process can always be addressed in follow-up work, as is very common.Wouldn't you expect such concerns to at least partially now have been covered in the CEP discussion, up front? I would expect at most at this stage someone could validate that the implementation follows the CEP. But I wouldn't expect a debate on competing approaches at this stage.henrik-- Henrik Ingoc. +358 40 569 7354 w. www.datastax.com
Re: Merging CEP-15 to trunk
Do you mean as part of a blocking review, or just in general?I don’t honestly understand the focus on ninja fixes or rebasing, in either context. Why would a project rebase its work until ready to merge? Why would you worry that a team well resourced with experienced contributors (30+yrs between them) don’t know how to work correctly with ninja fixes? These are all minor details, surely?I understand your concern around flaky tests, particularly since it seems other work has let some slip through. I don’t believe this is a blocking review concern, as it represents its own blocking requirement. I believe I have responded positively to this query already though?On 31 Jan 2023, at 01:46, Ekaterina Dimitrova wrote:Benedict, Is it an issue to ask whether flaky tests will be addressed as per our project agreement? Or about ninja fixes and why a branch was not rebased during development? What did I miss? By the way I do not ask these questions to block anyone, even suggested to help with CI…it’s a pity this part was dismissed… I can see that Caleb is handling the things around the merge ticket with high attention to the details as always. I can only thank him! At this point I see this thread mostly as - this is the first feature branch since quite some time, people are unsure about certain things - let’s see how we handle this for the next time to be more efficient and clear. I think you already took some actions in your suggestion earlier today with the document around comments. On Mon, 30 Jan 2023 at 20:16, Benedict <bened...@apache.org> wrote:Review should primarily ask: "is this correct?" and "could this be done differently (clearer, faster, more correct, etc)?" Blocking reviews especially, because why else would a reasonable contributor want to block progress?These questions can of course be asked of implementation details for any CEP. I have said before, a proposal to conduct a blocking review of this kind - if very late in my view - would be valid, though timeline would have to be debated.Reviewers with weaker aspirations have plenty of time available to them - two weeks have already passed, and another couple will likely yet (there isn't a rush). But it is novel to propose that such optional reviews be treated as blocking.On 30 Jan 2023, at 23:04, Henrik Ingo <henrik.i...@datastax.com> wrote:Ooops, I missed copy pasting this reply into my previous email:On Fri, Jan 27, 2023 at 11:21 PM Benedict <bened...@apache.org> wrote:I'm realizing in retrospect this leaves ambiguityAnother misreading at least of the intent of these clauses, is that they were to ensure that concerns about a design and approach are listened to, and addressed to the satisfaction of interested parties. It was essentially codifying the project’s long term etiquette around pieces of work with either competing proposals or fundamental concerns. It has historically helped to avoid escalation to vetoes, or reverting code after commit. It wasn’t intended that any reason might be invoked, as seems to have been inferred, and perhaps this should be clarified, though I had hoped it would be captured by the word “reasonable". Minor concerns that haven’t been caught by the initial review process can always be addressed in follow-up work, as is very common.Wouldn't you expect such concerns to at least partially now have been covered in the CEP discussion, up front? I would expect at most at this stage someone could validate that the implementation follows the CEP. 
But I wouldn't expect a debate on competing approaches at this stage. henrik
Re: [DISCUSS] API modifications and when to raise a thread on the dev ML
If we expose internal mechanics to public APIs, we should probably document what our promises are - ie none, even between patch versions.So we can close out this discussion, let’s assume we’re only discussing any interfaces we want to make promises for. We can have a separate discussion about which those are if there is any disagreement.On 2 Feb 2023, at 10:16, Andrés de la Peña wrote:I guess that depends on the type of change, and what we consider an API.If it's a breaking change, like removing a method or property, I think we would need a DISCUSS API thread prior to making changes. However, if the change is an addition, like adding a new yaml property or a JMX method, I think JIRA suffices.As for what we consider a public API, we usually include extensible interfaces on this. For example, we can agree that the Index interface for secondary indexes is a public API. However, that interface exposes many other interfaces and classes through its methods. For example, that Index interface exposes ColumnFamillyStore, SSTableReader, ColumnMetadata, AbstractType, PartitionUpdate, etc. Changes into those indirectly exposed classes can easily break custom index implementations out there. Should we consider them public APIs too, and require a DISCUSS thread for every change on them? Should that include new methods that wouldn't break compatibility?On Thu, 2 Feb 2023 at 09:29, Benedict Elliott Smith <bened...@apache.org> wrote:Closing the loop on seeking consensus for UX/UI/API changes, I see a few options. Can we rank choice vote please?A - Jira sufficesB - Post a DISCUSS API thread prior to making changesC - Periodically publish a list of API changes for retrospective consideration by the communityPoints raised in the discussion included: lowering the bar for participation and avoiding unnecessary burden to developers.I vote B (only) because I think broader participation is most important for these topics.On 7 Dec 2022, at 15:43, Mick Semb Wever <m...@apache.org> wrote:I think it makes sense to look into improving visibility of API changes, so people can more easily review a summary of API changes versus reading through the whole changelog (perhaps we need a summarized API change log?).Agree Paulo.Observers should be able to see all API changes early. We can do better than telling downstream users/devs "you have to listen to all jira tickets" or "you have to watch the code and pick up changes". Watching CHANGES.txt or NEWS.txt or CEPs doesn't solve the need either. Observing such changes as early as possible can save a significant amount of effort and headache later on, and should be encouraged. If done correctly I can imagine it will help welcome more contributors.I can also see that we can improve at, and have a better shared understanding of, categorising the types of API changes: addition/change/deprecation/removal, signature/output/behavioural, API/SPI. So I can see value here for both observers and for ourselves.
Re: Cassandra 5.0 Documentation Plan
This looks good to me, thanks Lorina > On 1 Feb 2023, at 19:24, Lorina Poland wrote: > > > Hey all - > > I presented a potential Docs Information Architecture recently, and promised > a Doc Plan for the upcoming C* 5.0 release. Please give me feedback, > especially if you feel that the priorities for upcoming features should be > different than what I propose. Also understand that the priority is just to > make absolutely certain that the most impactful features are covered at GA. > Eventually, and hopefully also before GA, all the items listed in the Doc > Plan will be completed. Just depends on the effort devoted by the community, > myself included. > > Here's the link: > https://docs.google.com/document/d/1FAACcAxtV9qLJS05RLt85S_6Gb4C4t4g9DhmfLufjOk/edit?usp=sharing > > Happy reading! > > Lorina >
Re: [DISCUSS] API modifications and when to raise a thread on the dev ML
I think it’s fine to separate the systems from the policy? We are agreeing a policy for systems we want to make guarantees about to our users (regarding maintenance and compatibility)For me, this is (at minimum) CQL and virtual tables. But I don’t think the policy differs based on the contents of the list, and given how long this topic stalled for. Given the primary point of contention seems to be the *policy* and not the list, I think it’s time to express our opinions numerically so we can move the conversation forwards.This isn’t binding, it just reifies the community sentiment.On 2 Feb 2023, at 13:02, Ekaterina Dimitrova wrote:“ So we can close out this discussion, let’s assume we’re only discussing any interfaces we want to make promises for. We can have a separate discussion about which those are if there is any disagreement.”May I suggest we first clear this topic and then move to voting? I would say I see confusion, not that much of a disagreement. Should we raise a discussion for every feature flag for example? In another thread virtual tables were brought in. I saw also other examples where people expressed uncertainty. I personally feel I’ll be able to take a more informed decision and vote if I first see this clarified. I will be happy to put down a document and bring it for discussion if people agree with thatOn Thu, 2 Feb 2023 at 7:33, Aleksey Yeshchenkowrote:Bringing light to new proposed APIs no less important - if not more, for reasons already mentioned in this thread. For it’s not easy to change them later.Voting B.On 2 Feb 2023, at 10:15, Andrés de la Peña wrote:If it's a breaking change, like removing a method or property, I think we would need a DISCUSS API thread prior to making changes. However, if the change is an addition, like adding a new yaml property or a JMX method, I think JIRA suffices.
Re: [DISCUSS] API modifications and when to raise a thread on the dev ML
I think lazy consensus is fine for all of these things. If a DISCUSS thread is crickets, or just positive responses, then definitely it can proceed without further ceremony.I think “with heads-up to the mailing list” is very close to B? Only that it locks out of the conversation anyone without a Jira login.On 2 Feb 2023, at 13:46, Ekaterina Dimitrova wrote:While I do agree with you, I am thinking that if we include many things that we would expect lazy consensus on I would probably have different preference. I definitely don’t mean to stall this though so in that case:I’d say combination of A+C (jira with heads up on the ML if someone is interested into the jira) and regular log on API changes separate from CHANGES.txt or we can just add labels to entries in CHANGES.txt as some other projects. (I guess this is a detail we can agree on later on, how to implement it, if we decide to move into that direction)On Thu, 2 Feb 2023 at 8:12, Benedict <bened...@apache.org> wrote:I think it’s fine to separate the systems from the policy? We are agreeing a policy for systems we want to make guarantees about to our users (regarding maintenance and compatibility)For me, this is (at minimum) CQL and virtual tables. But I don’t think the policy differs based on the contents of the list, and given how long this topic stalled for. Given the primary point of contention seems to be the *policy* and not the list, I think it’s time to express our opinions numerically so we can move the conversation forwards.This isn’t binding, it just reifies the community sentiment.On 2 Feb 2023, at 13:02, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:“ So we can close out this discussion, let’s assume we’re only discussing any interfaces we want to make promises for. We can have a separate discussion about which those are if there is any disagreement.”May I suggest we first clear this topic and then move to voting? I would say I see confusion, not that much of a disagreement. Should we raise a discussion for every feature flag for example? In another thread virtual tables were brought in. I saw also other examples where people expressed uncertainty. I personally feel I’ll be able to take a more informed decision and vote if I first see this clarified. I will be happy to put down a document and bring it for discussion if people agree with thatOn Thu, 2 Feb 2023 at 7:33, Aleksey Yeshchenko <alek...@apple.com> wrote:Bringing light to new proposed APIs no less important - if not more, for reasons already mentioned in this thread. For it’s not easy to change them later.Voting B.On 2 Feb 2023, at 10:15, Andrés de la Peña <adelap...@apache.org> wrote:If it's a breaking change, like removing a method or property, I think we would need a DISCUSS API thread prior to making changes. However, if the change is an addition, like adding a new yaml property or a JMX method, I think JIRA suffices.
Re: [DISCUSS] API modifications and when to raise a thread on the dev ML
I’m inclined to agree with Jeremiah and Patrick, even if it might be a steep jump from what we’ve been used to. CQL needs to be treated with the utmost care and attention to detail. Would we apply the same standard to semantic bug fixes too? Perhaps we should, where there isn’t an urgent timeline at least.Once the dust settles from this vote, depending how it tabulates, we can see what topics like this we might need another round to nail down.On 2 Feb 2023, at 15:56, Jeremiah D Jordan wrote:I think we need a DISCUSS thread at minimum for API changes. And for anything changing CQL syntax, I think a CEP is warranted. Even if it is only a small change to the syntax.On Feb 2, 2023, at 9:32 AM, Patrick McFadin wrote:API changes are near and dear to my world. The scope of changes could be minor or major, so I think B is the right way forward. Not to throw off the momentum, but could this even warrant a separate CEP in some cases? For example, CEP-15 is a huge change, but the CQL syntax will continuously evolve with more use. Being judicious in those changes is good for end users. It's also a good reference to point back to after the fact. PatrickOn Thu, Feb 2, 2023 at 6:01 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:“ Only that it locks out of the conversation anyone without a Jira login”Very valid point I forgot about - since recently people need invitation in order to create account…Then I would say C until we clarify the scope. ThanksOn Thu, 2 Feb 2023 at 8:54, Benedict <bened...@apache.org> wrote:I think lazy consensus is fine for all of these things. If a DISCUSS thread is crickets, or just positive responses, then definitely it can proceed without further ceremony.I think “with heads-up to the mailing list” is very close to B? Only that it locks out of the conversation anyone without a Jira login.On 2 Feb 2023, at 13:46, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:While I do agree with you, I am thinking that if we include many things that we would expect lazy consensus on I would probably have different preference. I definitely don’t mean to stall this though so in that case:I’d say combination of A+C (jira with heads up on the ML if someone is interested into the jira) and regular log on API changes separate from CHANGES.txt or we can just add labels to entries in CHANGES.txt as some other projects. (I guess this is a detail we can agree on later on, how to implement it, if we decide to move into that direction)On Thu, 2 Feb 2023 at 8:12, Benedict <bened...@apache.org> wrote:I think it’s fine to separate the systems from the policy? We are agreeing a policy for systems we want to make guarantees about to our users (regarding maintenance and compatibility)For me, this is (at minimum) CQL and virtual tables. But I don’t think the policy differs based on the contents of the list, and given how long this topic stalled for. Given the primary point of contention seems to be the *policy* and not the list, I think it’s time to express our opinions numerically so we can move the conversation forwards.This isn’t binding, it just reifies the community sentiment.On 2 Feb 2023, at 13:02, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:“ So we can close out this discussion, let’s assume we’re only discussing any interfaces we want to make promises for. We can have a separate discussion about which those are if there is any disagreement.”May I suggest we first clear this topic and then move to voting? I would say I see confusion, not that much of a disagreement. 
Should we raise a discussion for every feature flag for example? In another thread virtual tables were brought in. I saw also other examples where people expressed uncertainty. I personally feel I’ll be able to take a more informed decision and vote if I first see this clarified. I will be happy to put down a document and bring it for discussion if people agree with thatOn Thu, 2 Feb 2023 at 7:33, Aleksey Yeshchenko <alek...@apple.com> wrote:Bringing light to new proposed APIs no less important - if not more, for reasons already mentioned in this thread. For it’s not easy to change them later.Voting B.On 2 Feb 2023, at 10:15, Andrés de la Peña <adelap...@apache.org> wrote:If it's a breaking change, like removing a method or property, I think we would need a DISCUSS API thread prior to making changes. However, if the change is an addition, like adding a new yaml property or a JMX method, I think JIRA suffices.
Re: Implicitly enabling ALLOW FILTERING on virtual tables
Why not introduce a general table option that toggles ALLOW FILTERING behaviour and just flip it for virtual tables we want this behaviour for? Users can do it too, for their own tables for which it's suitable. On 3 Feb 2023, at 20:59, Andrés de la Peña wrote: For those eventual big virtual tables there is the mentioned flag indicating whether the table allows filtering without AF. I guess the question is how can a user know whether a certain virtual table is one of the big ones. That could be specified in the doc for each table, and it could also be included in the table properties, so it's displayed by DESCRIBE TABLE queries. On Fri, 3 Feb 2023 at 20:56, Chris Lohfink wrote: Just to 2nd what Scott says. While everything is in memory now, it may not be in the future, and if we add it implicitly, we are tying ourselves to being in memory only. However, I wouldn't -1 the idea. Another option may be a cqlsh option (i.e. like expand on/off) to always include a flag so it doesn't need to be added or something. Chris On Fri, Feb 3, 2023 at 1:24 PM C. Scott Andreas wrote: There are some ideas that development community members have kicked around that may falsify the assumption that "virtual tables are tiny and will fit in memory." One example is CASSANDRA-14629: Abstract Virtual Table for very large result sets https://issues.apache.org/jira/browse/CASSANDRA-14629 Chris's proposal here is to enable query results from virtual tables to be streamed to the client rather than being fully materialized. There are some neat possibilities suggested in this ticket, such as debug functionality to dump the contents of a raw SSTable via the CQL interface, or the contents of the database's internal caches. One could also imagine a feature like this providing functionality similar to a foreign data wrapper in other databases. I don't think the assumption that "virtual tables will always be small and always fit in memory" is a safe one. I don't think we should implicitly add "ALLOW FILTERING" to all queries against virtual tables because of this, in addition to concern with departing from standard CQL semantics for a type of table deemed special. – Scott On Feb 3, 2023, at 6:52 AM, Maxim Muzafarov wrote: Hello Stefan, Regarding the decision to implicitly enable ALLOW FILTERING for virtual tables, which also makes sense to me, it may be necessary to consider changing the clustering columns in the virtual table metadata to regular columns as well. The reasons are the same as mentioned earlier: the virtual tables hold their data in memory, thus we do not benefit from the advantages of ordered data (e.g. the ClientsTable and its ClusteringColumn(PORT)). Changing the clustering column to a regular column may simplify the virtual table data model, but I'm afraid it may affect users who rely on the table metadata. On Fri, 3 Feb 2023 at 12:32, Andrés de la Peña wrote: I think removing the need for ALLOW FILTERING on virtual tables makes sense and would be quite useful for operators. That guard exists for performance issues that shouldn't occur on virtual tables.
We also have a flag in case some future virtual table implementation has limitations regarding filtering, although it seems it's not the case with any of the existing virtual tables. It is not like we would promote bad habits, because virtual tables are meant to be queried by operators / administrators only. It might even be quite the opposite, since in the current situation users might get used to routinely using ALLOW FILTERING for querying their virtual tables. It has been mentioned on the #cassandra-dev Slack thread where this started (1) that it's kind of an API inconsistency to allow querying by non-primary keys on virtual tables without ALLOW FILTERING, whereas it's required for regular tables. I think that a simple doc update saying that virtual tables, which are not regular tables, support filtering would be enough. Virtual tables are well identified by both the keyspace they belong to and the docs, so users shouldn't have trouble knowing whether a table is virtual. It would be similar to the current exception for ALLOW FILTERING, where one needs to use it unless the table has an index for the queried column. (1) https://the-asf.slack.com/archives/CK23JSY2K/p1675352759267329 On Fri, 3 Feb 2023 at 09:09, Miklosovic, Stefan wrote: Hi list, the content of virtual tables is held in memory (and / or is fetched every time upon request). While doing queries against such a table for a column outside of the primary key, normally, users are required to specify ALLOW FILTERING. This makes total sense for "ordinary tables", so that applications have performant and effective queries, but it kind of loses its applicability for virtual tables when they literally hold just a handful of entries in memory and it just does not matter, does it? What do you think about implicitly allowing filtering for virtual tables so we save
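To make the behaviour under discussion concrete, here is a small driver-side sketch. It uses the DataStax Java driver and the system_views.clients virtual table, which is keyed by (address, port); the username value and the default connection settings are placeholders, and the point is only that filtering on a non-primary-key column is rejected today without ALLOW FILTERING, exactly as it would be for a regular table.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;

// Sketch only: system_views.clients is a per-node, in-memory virtual table,
// yet filtering on the non-primary-key column 'username' still requires
// ALLOW FILTERING under the current rules.
public final class VirtualTableFilteringExample
{
    public static void main(String[] args)
    {
        // Assumes a locally reachable node with default driver configuration.
        try (CqlSession session = CqlSession.builder().build())
        {
            ResultSet rs = session.execute(
                "SELECT address, port, username, driver_name " +
                "FROM system_views.clients WHERE username = 'app_user' ALLOW FILTERING");
            rs.forEach(row -> System.out.println(row.getFormattedContents()));
        }
    }
}

The proposal in this thread is, in effect, to let the query above run without the trailing ALLOW FILTERING when the target is a virtual table.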
Re: [VOTE] CEP-21 Transactional Cluster Metadata
+1On 6 Feb 2023, at 16:17, Brandon Williams wrote:+1On Mon, Feb 6, 2023, 10:15 AM Sam Tunnicliffewrote:Hi everyone,I would like to start a vote on this CEP.Proposal:https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+MetadataDiscussion:https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7The vote will be open for 72 hours.A vote passes if there are at least three binding +1s and no binding vetoes.Thanks,Sam
Re: Downgradability
In a self-organising community, things that aren’t self-policed naturally end up policed in an adhoc manner, and with difficulty. I’m not sure that’s the same as arbitrary enforcement. It seems to me the real issue is nobody noticed this was agreed and/or forgot and didn’t think about it much. But, even without any prior agreement, it’s perfectly reasonable to request that things do not break compatibility if they do not need to, as part of the normal patch integration process.Issues with 3.1->4.0 aren’t particularly relevant as they predate any agreement to do this. But we can and should address the problem of new columns in schema tables, as this happens often between versions. I’m not sure it has in 4.1 though?Regarding downgrade versions, surely this should simply be the same as upgrade versions we support?On 20 Feb 2023, at 20:02, Jeff Jirsa wrote:I'm not even convinced even 8110 addresses this - just writing sstables in old versions won't help if we ever add things like new types or new types of collections without other control abilities. Claude's other email in another thread a few hours ago talks about some of these surprises - "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was added to system/local. This means that 3.1 system can not read the table as it has no definition for it. I tried marking the column for deletion in the metadata and in the serialization header. The later got past the column not found problem, but I suspect that it just means that data columns after broadcast_port shifted and so incorrectly read." - this is a harder problem to solve than just versioning sstables and network protocols. Stepping back a bit, we have downgrade ability listed as a goal, but it's not (as far as I can tell) universally enforced, nor is it clear at which point we will be able to concretely say "this release can be downgraded to X". Until we actually define and agree that this is a real goal with a concrete version where downgrade-ability becomes real, it feels like things are somewhat arbitrarily enforced, which is probably very frustrating for people trying to commit work/tickets.- JeffOn Mon, Feb 20, 2023 at 11:48 AM Dinesh Joshiwrote:I’m a big fan of maintaining backward compatibility. Downgradability implies that we could potentially roll back an upgrade at any time. While I don’t think we need to retain the ability to downgrade in perpetuity it would be a good objective to maintain strict backward compatibility and therefore downgradability until a certain point. This would imply versioning metadata and extending it in such a way that prior version(s) could continue functioning. This can certainly be expensive to implement and might bloat on-disk storage. However, we could always offer an option for the operator to optimize the on-disk structures for the current version then we can rewrite them in the latest version. This optimizes the storage and opens up new functionality. This means new features that can work with old on-disk structures will be available while others that strictly require new versions of the data structures will be unavailable until the operator migrates to the new version. This migration IMO should be irreversible. Beyond this point the operator will lose the ability to downgrade which is ok.DineshOn Feb 20, 2023, at 10:40 AM, Jake Luciani wrote:There has been progress on https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-8928Which is similar to what datastax does for DSE. 
Would this be an acceptable solution?Jake On Mon, Feb 20, 2023 at 11:17 AM guo Maxwell wrote:It seems “An alternative solution is to implement/complete CASSANDRA-8110” can give us more options if it is finished😉Branimir Lambov 于2023年2月20日 周一下午11:03写道:Hi everyone,There has been a discussion lately about changes to the sstable format in the context of being able to abort a cluster upgrade, and the fact that changes to sstables can prevent downgraded nodes from reading any data written during their temporary operation with the new version.Most of the discussion is in CASSANDRA-18134, and is spreading into CASSANDRA-14277 and CASSANDRA-17698, none of which is a good place to discuss the topic seriously.Downgradability is a worthy goal and is listed in the current roadmap. I would like to open a discussion here on how it would be achieved.My understanding of what has been suggested so far translates to:- avoid changes to sstable formats;- if there are changes, implement them in a way that is backwards-compatible, e.g. by duplicating data, so that a new version is presented in a component or portion of a component that legacy nodes will not try to read;- if the latter is not feasible, make sure the changes are only applied if a feature flag has been enabled.To me this approach introduces several risks:- it bloats file and parsing complexity;- it discourages improvement (e.g. CASSAND
Re: Downgradability
FWIW I think 8110 is the right approach, even if it isn’t a panacea. We will have to eventually also tackle system schema changes (probably not hard), and may have to think a little carefully about other things, eg with TTLs the format change is only the contract about what values can be present, so we have to make sure the data validity checks are consistent with the format we write. It isn’t as simple as writing an earlier version in this case (unless we permit truncating the TTL, perhaps) On 20 Feb 2023, at 20:24, Benedict wrote:In a self-organising community, things that aren’t self-policed naturally end up policed in an adhoc manner, and with difficulty. I’m not sure that’s the same as arbitrary enforcement. It seems to me the real issue is nobody noticed this was agreed and/or forgot and didn’t think about it much. But, even without any prior agreement, it’s perfectly reasonable to request that things do not break compatibility if they do not need to, as part of the normal patch integration process.Issues with 3.1->4.0 aren’t particularly relevant as they predate any agreement to do this. But we can and should address the problem of new columns in schema tables, as this happens often between versions. I’m not sure it has in 4.1 though?Regarding downgrade versions, surely this should simply be the same as upgrade versions we support?On 20 Feb 2023, at 20:02, Jeff Jirsa wrote:I'm not even convinced even 8110 addresses this - just writing sstables in old versions won't help if we ever add things like new types or new types of collections without other control abilities. Claude's other email in another thread a few hours ago talks about some of these surprises - "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was added to system/local. This means that 3.1 system can not read the table as it has no definition for it. I tried marking the column for deletion in the metadata and in the serialization header. The later got past the column not found problem, but I suspect that it just means that data columns after broadcast_port shifted and so incorrectly read." - this is a harder problem to solve than just versioning sstables and network protocols. Stepping back a bit, we have downgrade ability listed as a goal, but it's not (as far as I can tell) universally enforced, nor is it clear at which point we will be able to concretely say "this release can be downgraded to X". Until we actually define and agree that this is a real goal with a concrete version where downgrade-ability becomes real, it feels like things are somewhat arbitrarily enforced, which is probably very frustrating for people trying to commit work/tickets.- JeffOn Mon, Feb 20, 2023 at 11:48 AM Dinesh Joshi <djo...@apache.org> wrote:I’m a big fan of maintaining backward compatibility. Downgradability implies that we could potentially roll back an upgrade at any time. While I don’t think we need to retain the ability to downgrade in perpetuity it would be a good objective to maintain strict backward compatibility and therefore downgradability until a certain point. This would imply versioning metadata and extending it in such a way that prior version(s) could continue functioning. This can certainly be expensive to implement and might bloat on-disk storage. However, we could always offer an option for the operator to optimize the on-disk structures for the current version then we can rewrite them in the latest version. This optimizes the storage and opens up new functionality. 
This means new features that can work with old on-disk structures will be available while others that strictly require new versions of the data structures will be unavailable until the operator migrates to the new version. This migration IMO should be irreversible. Beyond this point the operator will lose the ability to downgrade, which is ok. Dinesh

On Feb 20, 2023, at 10:40 AM, Jake Luciani <jak...@gmail.com> wrote: There has been progress on https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-8928, which is similar to what datastax does for DSE. Would this be an acceptable solution? Jake

On Mon, Feb 20, 2023 at 11:17 AM guo Maxwell <cclive1...@gmail.com> wrote: It seems “An alternative solution is to implement/complete CASSANDRA-8110” can give us more options if it is finished 😉

Branimir Lambov <blam...@apache.org> wrote on Mon, 20 Feb 2023 at 23:03: Hi everyone, There has been a discussion lately about changes to the sstable format in the context of being able to abort a cluster upgrade, and the fact that changes to sstables can prevent downgraded nodes from reading any data written during their temporary operation with the new version. Most of the discussion is in CASSANDRA-18134, and is spreading into CASSANDRA-14277 and CASSANDRA-17698, none of which is a good place to discuss the topic seriously. Downgradability is a worthy goal and is listed in the current roadmap.
Re: Downgradability
This isn’t a feature in any normal sense? It’s a commitment, and it requires every contributor to consider it as part of work they produce.I agree we should put in some work to make this easier, and to provide some test guarantees that we don’t break it. But are we overstating the difficulty? We already use the descriptor version for serialising the contents, it’s only the metadata that ignores this today - and downgrade tests should be as simple as making upgrade tests go backwards.But in the meantime none of that seems like an excuse to fail to address this in work that may break it, where it is not particularly costly to do so.On 21 Feb 2023, at 08:57, Branimir Lambov wrote:It appears to me that the first thing we need to start this feature off is a definition of a suite of tests together with a set of rules to keep the suite up to date with new features as they are introduced. The moment that suite is in place, we can start having some confidence that we can enforce downgradability.Something like this will definitely catch incompatibilities in SSTable formats (such as the one in CASSANDRA-17698 that I managed to miss during review), but will also be able to identify incompatible system schema changes among others, and at the same time rightfully ignore non-breaking changes such as modifications to the key cache serialization formats.Is downgradability in scope for 5.0? It is a feature like any other, and I don't see any difficulty adding it (with support for downgrade to 4.x) a little later in the 5.x timeline.Regards,BranimirOn Tue, Feb 21, 2023 at 9:40 AM Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:I'd like to mention CASSANDRA-17056 (CEP-17) here as it aims to introduce multiple sstable formats support. It allows for providing an implementation of SSTableFormat along with SSTableReader and SSTableWriter. That could be extended easily to support different implementations for certain version ranges, like one impl for ma-nz, other for oa+, etc. without having a confusing implementation with a lot of conditional blocks. Old formats in such case could be maintained separately from the main code and easily switched any time. thanks- - -- --- - -Jacek Lewandowskiwt., 21 lut 2023 o 01:46 Yuki Morishita <yu...@apache.org> napisał(a):Hi,What I wanted to address in my comment in CASSANDRA-8110(https://issues.apache.org/jira/browse/CASSANDRA-8110?focusedCommentId=17641705&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17641705) is to focus on better upgrade experience.Upgrading the cluster can be painful for some orgs with mission critical Cassandra cluster, where they cannot tolerate less availability because of the inability to replace the downed node.They also need to plan rolling back to the previous state when something happens along the way.The change I proposed in CASSANDRA-8110 is to achieve the goal of at least enabling SSTable streaming during the upgrade by not upgrading the SSTable version. This can make the cluster to easily rollback to the previous version.Downgrading SSTable is not the primary focus (though Cassandra needs to implement the way to write SSTable in older versions, so it is somewhat related.)I'm preparing the design doc for the change.Also, if I should create a separate ticket from CASSANDRA-8110 for the clarity of the goal of the change, please let me know.On Tue, Feb 21, 2023 at 5:31 AM Benedict <bened...@apache.org> wrote:FWIW I think 8110 is the right approach, even if it isn’t a panacea. 
We will have to eventually also tackle system schema changes (probably not hard), and may have to think a little carefully about other things, eg with TTLs the format change is only the contract about what values can be present, so we have to make sure the data validity checks are consistent with the format we write. It isn’t as simple as writing an earlier version in this case (unless we permit truncating the TTL, perhaps) On 20 Feb 2023, at 20:24, Benedict <bened...@apache.org> wrote:In a self-organising community, things that aren’t self-policed naturally end up policed in an adhoc manner, and with difficulty. I’m not sure that’s the same as arbitrary enforcement. It seems to me the real issue is nobody noticed this was agreed and/or forgot and didn’t think about it much. But, even without any prior agreement, it’s perfectly reasonable to request that things do not break compatibility if they do not need to, as part of the normal patch integration process.Issues with 3.1->4.0 aren’t particularly relevant as they predate any agreement to do this. But we can and should address the problem of new columns in schema tables, as this happens often between versions. I’m not sure it has in 4.1 though?Regarding downgrade versions, surely this should simply be the same as upgrade versions we support?On 20 Feb 2023, at 20:02, Jeff Jirsa <jji...@gmail.com> wrote:I'm not even c
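As a rough illustration of the pluggable-format idea Jacek describes above, here is a minimal sketch in which each format implementation declares the version range it handles and a registry selects one per sstable; all names here are hypothetical and are not the CEP-17 API:

    // Hypothetical sketch: keep legacy formats as separate implementations and
    // select one by the sstable's version string, instead of conditional blocks.
    import java.util.ArrayList;
    import java.util.List;

    interface SSTableFormatHandler
    {
        // e.g. one handler covering "ma".."nz", another covering "oa" and later
        boolean handles(String descriptorVersion);
    }

    final class SSTableFormatRegistry
    {
        private final List<SSTableFormatHandler> handlers = new ArrayList<>();

        void register(SSTableFormatHandler handler)
        {
            handlers.add(handler);
        }

        SSTableFormatHandler forVersion(String descriptorVersion)
        {
            for (SSTableFormatHandler handler : handlers)
                if (handler.handles(descriptorVersion))
                    return handler;
            throw new IllegalArgumentException("No sstable format registered for version " + descriptorVersion);
        }
    }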
Re: Downgradability
s using the standard published Cassandra docker images for 3.1 and 4.0 with data mounted on an external drive. two of the containers (one of each version) did not automatically start Cassandra. My process was then:start and stop Cassandra 4.0 to create the default data filesstart the Cassandra 4.0 container that does not automatically run Cassandra and execute the new downgrade functionality.start the Cassandra 3.1 container and dump the logs. If the system started then I knew that I at least had a proof of concept. So far no-go.On Tue, Feb 21, 2023 at 8:57 AM Branimir Lambov <branimir.lam...@datastax.com> wrote:It appears to me that the first thing we need to start this feature off is a definition of a suite of tests together with a set of rules to keep the suite up to date with new features as they are introduced. The moment that suite is in place, we can start having some confidence that we can enforce downgradability.Something like this will definitely catch incompatibilities in SSTable formats (such as the one in CASSANDRA-17698 that I managed to miss during review), but will also be able to identify incompatible system schema changes among others, and at the same time rightfully ignore non-breaking changes such as modifications to the key cache serialization formats.Is downgradability in scope for 5.0? It is a feature like any other, and I don't see any difficulty adding it (with support for downgrade to 4.x) a little later in the 5.x timeline.Regards,BranimirOn Tue, Feb 21, 2023 at 9:40 AM Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:I'd like to mention CASSANDRA-17056 (CEP-17) here as it aims to introduce multiple sstable formats support. It allows for providing an implementation of SSTableFormat along with SSTableReader and SSTableWriter. That could be extended easily to support different implementations for certain version ranges, like one impl for ma-nz, other for oa+, etc. without having a confusing implementation with a lot of conditional blocks. Old formats in such case could be maintained separately from the main code and easily switched any time. thanks- - -- --- - -Jacek Lewandowskiwt., 21 lut 2023 o 01:46 Yuki Morishita <yu...@apache.org> napisał(a):Hi,What I wanted to address in my comment in CASSANDRA-8110(https://issues.apache.org/jira/browse/CASSANDRA-8110?focusedCommentId=17641705&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17641705) is to focus on better upgrade experience.Upgrading the cluster can be painful for some orgs with mission critical Cassandra cluster, where they cannot tolerate less availability because of the inability to replace the downed node.They also need to plan rolling back to the previous state when something happens along the way.The change I proposed in CASSANDRA-8110 is to achieve the goal of at least enabling SSTable streaming during the upgrade by not upgrading the SSTable version. This can make the cluster to easily rollback to the previous version.Downgrading SSTable is not the primary focus (though Cassandra needs to implement the way to write SSTable in older versions, so it is somewhat related.)I'm preparing the design doc for the change.Also, if I should create a separate ticket from CASSANDRA-8110 for the clarity of the goal of the change, please let me know.On Tue, Feb 21, 2023 at 5:31 AM Benedict <bened...@apache.org> wrote:FWIW I think 8110 is the right approach, even if it isn’t a panacea. 
We will have to eventually also tackle system schema changes (probably not hard), and may have to think a little carefully about other things, eg with TTLs the format change is only the contract about what values can be present, so we have to make sure the data validity checks are consistent with the format we write. It isn’t as simple as writing an earlier version in this case (unless we permit truncating the TTL, perhaps) On 20 Feb 2023, at 20:24, Benedict <bened...@apache.org> wrote:In a self-organising community, things that aren’t self-policed naturally end up policed in an adhoc manner, and with difficulty. I’m not sure that’s the same as arbitrary enforcement. It seems to me the real issue is nobody noticed this was agreed and/or forgot and didn’t think about it much. But, even without any prior agreement, it’s perfectly reasonable to request that things do not break compatibility if they do not need to, as part of the normal patch integration process.Issues with 3.1->4.0 aren’t particularly relevant as they predate any agreement to do this. But we can and should address the problem of new columns in schema tables, as this happens often between versions. I’m not sure it has in 4.1 though?Regarding downgrade versions, surely this should simply be the same as upgrade versions we support?On 20 Feb 2023, at 20:02, Jeff Jirsa <jji...@gmail.com> wrote:I'm not even convinced even 8110 addresses this - just writing sstables
Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration
Could you describe the issues? Config that is globally exposed should ideally be immutable with final members, in which case volatile is only necessary if you’re using the config parameter in a tight loop that you need to witness a new value - which shouldn’t apply to any of our config. There are some weird niches, like updating long values on some (unsupported by us) JVMs that may tear. Technically you also require it for visibility with the JMM. But in practice it is mostly unnecessary. Often what seems to be a volatile issue is really something else.

On 22 Feb 2023, at 13:18, Benjamin Lerer wrote: I have seen issues with some updatable parameters which were missing the volatile keyword.

On Wed, 22 Feb 2023 at 11:36, Aleksey Yeshchenko wrote: FWIW most of those volatile fields, if not in fact all of them, should NOT be volatile at all. Someone started the trend and most folks have been copycatting or doing the same for consistency with the rest of the codebase. Please definitely don’t rely on that.

On 21 Feb 2023, at 21:06, Maxim Muzafarov wrote: 1. Rely on the volatile keyword in front of fields in the Config class; I would say this is the most confusing option for me because it doesn't give us all the guarantees we need, and also:
- We have no explicit control over what exactly we expose to a user. When we modify the JMX API, we're implementing a new method for the MBean, which in turn makes this action an explicit exposure;
- The volatile keyword is not the only way to achieve thread safety, and looks strange from a public API design point of view;
- A good example is the setEnableDropCompactStorage method, which changes the volatile field, but is only visible for testing purposes;
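To make the distinction above concrete, a minimal sketch contrasting an immutable, set-once setting with a runtime-updatable one; the field names are invented for illustration and this is not the real Config class:

    // Hypothetical illustration of the two kinds of settings discussed above.
    public class ConfigSketch
    {
        // Set once at startup and never changed: final, no volatile needed.
        public final String partitioner;

        // Updatable at runtime: volatile ensures a write from one thread (e.g. a
        // JMX or virtual-table update) is visible to readers on other threads.
        public volatile int example_updatable_threshold;

        public ConfigSketch(String partitioner, int exampleUpdatableThreshold)
        {
            this.partitioner = partitioner;
            this.example_updatable_threshold = exampleUpdatableThreshold;
        }
    }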
Re: Downgradability
Ok I will be honest, I was fairly sure we hadn’t yet broken downgrade - but I was wrong. CASSANDRA-18061 introduced a new column to a system table, which is a breaking change. But that’s it, as far as I can tell. I have run a downgrade test successfully after reverting that ticket, using the one line patch below. This makes every in-jvm upgrade test also a downgrade test. I’m sure somebody more familiar with dtests can readily do the same there. While we look to fix 18061 and enable downgrade tests (and get a clean run of the full suite), can we all agree not to introduce new breaking changes?

index e41444fe52..085b25f8af 100644
--- a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
+++ b/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
@@ -104,6 +104,7 @@ public class UpgradeTestBase extends DistributedTestBase
         .addEdge(v40, v41)
         .addEdge(v40, v42)
         .addEdge(v41, v42)
+        .addEdge(v42, v41)
         .build();

On 22 Feb 2023, at 15:08, Jeff Jirsa wrote: When people are serious about this requirement, they’ll build the downgrade equivalents of the upgrade tests and run them automatically, often, so people understand what the real gap is and when something new makes it break
Until those tests exist, I think collectively we should all stop pretending like this is dogma. Best effort is best effort.

On Feb 22, 2023, at 6:57 AM, Branimir Lambov wrote:
> 1. Major SSTable changes should begin with forward-compatibility in a prior release.
This requires "feature" changes, i.e. new non-trivial code for previous patch releases. It also entails porting over any further format modification. Instead of this, in combination with your second point, why not implement backwards write compatibility? The opt-in is then clearer to define (i.e. upgrades start with e.g. a "4.1-compatible" settings set that includes file format compatibility and disabling of new features, new nodes start with "current" settings set). When the upgrade completes and the user is happy with the result, the settings set can be replaced. Doesn't this achieve what you want (and we all agree is a worthy goal) with much less effort for everyone? Supporting backwards-compatible writing is trivial, and we even have a proof-of-concept in the stats metadata serializer. It also simplifies by a serious margin the amount of work and thinking one has to do when a format improvement is implemented -- e.g. the TTL patch can just address this in exactly the way the problem was addressed in earlier versions of the format, by capping to 2038, without any need to specify, obey or test any configuration flags.
>> It’s a commitment, and it requires every contributor to consider it as part of work they produce.
> But it shouldn't be a burden.
Ability to downgrade is a testable problem, so I see this work as a function of the suite of tests the project is willing to agree on supporting.I fully agree with this sentiment, and I feel that the current "try to not introduce breaking changes" approach is adding the burden, but not the benefits -- because the latter cannot be proven, and are most likely already broken.Regards,BranimirOn Wed, Feb 22, 2023 at 1:01 AM Abe Ratnofskywrote:Some interesting existing work on this subject is "Understanding and Detecting Software Upgrade Failures in Distributed Systems" - https://dl.acm.org/doi/10.1145/3477132.3483577, also summarized by Andrey Satarin here: https://asatarin.github.io/talks/2022-09-upgrade-failures-in-distributed-systems/They specifically tested Cassandra upgrades, and have a solid list of defects that they found. They also describe their testing mechanism DUPTester, which includes a component that confirms that the leftover state from one version can start up on the next version. There is a wider scope of upgrade defects highlighted in the paper, beyond SSTable version support.I believe the project would benefit from expanding our test suite similarly, by parametrizing more tests on upgrade version pairs.Also, per Benedict's comment:> It’s a commitment, and it requires every contributor to consider it as part of work they produce.But it shouldn't be a burden. Ability to downgrade is a testable problem, so I see this work as a function of the suite of tests the project is willing to agree on supporting.Specifically - I agree with Scott's proposal to emulate the HDFS upgrade-then-finalize approach. I would also support automatic finalization based on a time threshold or similar, to balance the priorities of safe and straightforward upgrades. Users need to be aware of the range of SSTable formats supported by a given v
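A toy sketch of the parametrisation Abe suggests, generating every version pair in both directions so the same suite exercises upgrades and downgrades; this is illustrative only and is not the in-jvm dtest API:

    // Hypothetical helper: expand an ordered list of versions into upgrade and
    // downgrade pairs that a parameterised test could iterate over.
    import java.util.ArrayList;
    import java.util.List;

    final class VersionPairs
    {
        static List<String[]> allPairs(List<String> orderedVersions)
        {
            List<String[]> pairs = new ArrayList<>();
            for (int i = 0; i < orderedVersions.size(); i++)
            {
                for (int j = i + 1; j < orderedVersions.size(); j++)
                {
                    pairs.add(new String[] { orderedVersions.get(i), orderedVersions.get(j) }); // upgrade
                    pairs.add(new String[] { orderedVersions.get(j), orderedVersions.get(i) }); // downgrade
                }
            }
            return pairs;
        }
    }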
Re: Downgradability
Those tickets mostly do not need to break compatibility, and it is pretty easy for them to avoid doing so without any additional facilities.Only the TTL ticket has an excuse, as it debatably needs to support a higher version under certain non-default config settings. However there are no serialisation changes, so even here this is only a matter of selecting the version we write to the descriptor and no other changes at all.Can somebody explain to me what is so burdensome, that we seem to be spending longer debating it than it would take to implement the necessary changes?On 22 Feb 2023, at 21:23, Jeremiah D Jordan wrote:We have multiple tickets about to merge that introduce new on disk format changes. I see no reason to block those indefinitely while we figure out how to do the on disk format downgrade stuff.-JeremiahOn Feb 22, 2023, at 3:12 PM, Benedict wrote:Ok I will be honest, I was fairly sure we hadn’t yet broken downgrade - but I was wrong. CASSANDRA-18061 introduced a new column to a system table, which is a breaking change. But that’s it, as far as I can tell. I have run a downgrade test successfully after reverting that ticket, using the one line patch below. This makes every in-jvm upgrade test also a downgrade test. I’m sure somebody more familiar with dtests can readily do the same there.While we look to fix 18061 and enable downgrade tests (and get a clean run of the full suite), can we all agree not to introduce new breaking changes?index e41444fe52..085b25f8af 100644--- a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java+++ b/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java@@ -104,6 +104,7 @@ public class UpgradeTestBase extends DistributedTestBase .addEdge(v40, v41) .addEdge(v40, v42) .addEdge(v41, v42)+ .addEdge(v42, v41) .build();On 22 Feb 2023, at 15:08, Jeff Jirsa wrote:When people are serious about this requirement, they’ll build the downgrade equivalents of the upgrade tests and run them automatically, often, so people understand what the real gap is and when something new makes it break Until those tests exist, I think collectively we should all stop pretending like this is dogma. Best effort is best effort. On Feb 22, 2023, at 6:57 AM, Branimir Lambov wrote:> 1. Major SSTable changes should begin with forward-compatibility in a prior release.This requires "feature" changes, i.e. new non-trivial code for previous patch releases. It also entails porting over any further format modification.Instead of this, in combination with your second point, why not implement backwards write compatibility? The opt-in is then clearer to define (i.e. upgrades start with e.g. a "4.1-compatible" settings set that includes file format compatibility and disabling of new features, new nodes start with "current" settings set). When the upgrade completes and the user is happy with the result, the settings set can be replaced.Doesn't this achieve what you want (and we all agree is a worthy goal) with much less effort for everyone? Supporting backwards-compatible writing is trivial, and we even have a proof-of-concept in the stats metadata serializer. It also simplifies by a serious margin the amount of work and thinking one has to do when a format improvement is implemented -- e.g. 
the TTL patch can just address this in exactly the way the problem was addressed in earlier versions of the format, by capping to 2038, without any need to specify, obey or test any configuration flags.>> It’s a commitment, and it requires every contributor to consider it as part of work they produce.> But it shouldn't be a burden. Ability to downgrade is a testable problem, so I see this work as a function of the suite of tests the project is willing to agree on supporting.I fully agree with this sentiment, and I feel that the current "try to not introduce breaking changes" approach is adding the burden, but not the benefits -- because the latter cannot be proven, and are most likely already broken.Regards,BranimirOn Wed, Feb 22, 2023 at 1:01 AM Abe Ratnofsky <a...@aber.io> wrote:Some interesting existing work on this subject is "Understanding and Detecting Software Upgrade Failures in Distributed Systems" - https://dl.acm.org/doi/10.1145/3477132.3483577, also summarized by Andrey Satarin here: https://asatarin.github.io/talks/2022-09-upgrade-failures-in-distributed-systems/They specifically tested Cassandra upgrades, and have a solid list of defects that they found. They also describe their testing mechanism DUPTester, which i
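As a sketch of the "cap to 2038" fallback Branimir mentions above, the clamp itself is a one-liner; the names below are hypothetical and do not reflect the actual TTL patch:

    // Hypothetical sketch: when writing an older sstable version, clamp local
    // expiration times to the largest value a signed 32-bit seconds field holds
    // (2038-01-19T03:14:07Z), rather than requiring a new format feature.
    final class LegacyExpirationCap
    {
        static final long MAX_LEGACY_EXPIRATION_SECONDS = Integer.MAX_VALUE;

        static long capForLegacyFormat(long localExpirationSeconds)
        {
            return Math.min(localExpirationSeconds, MAX_LEGACY_EXPIRATION_SECONDS);
        }
    }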
Re: Downgradability
Forget downgradeability for a moment: we should not be breaking format compatibility without good reason. Bumping a major version isn’t enough of a reason. Can somebody explain to me why this is being fought tooth and nail, when the work involved is absolutely minimal? Regarding tests: what more do you want, than running our upgrade suite backwards?

On 23 Feb 2023, at 09:45, Benjamin Lerer wrote: Can somebody explain to me what is so burdensome, that we seem to be spending longer debating it than it would take to implement the necessary changes? I believe that we all agree on the outcome. Everybody wants downgradability. The issue is on the path to get there. As far as I am concerned, I would like to see a proper solution and, as Jeff suggested, the equivalent of the upgrade tests as gatekeepers. Having everybody trying to enforce it in their own way will only lead to a poor result in my opinion, with some ad hoc code that might not really guarantee real downgradability in the end. We have rushed in the past to get features out and paid the price for it. I simply prefer that we take the time to do things right. Thanks to Scott and you, downgradability got much better visibility, so no matter what approach we pick, I am convinced that we will get there.

On Thu, 23 Feb 2023 at 09:49, Claude Warren, Jr via dev wrote: Broken downgrading can be fixed (I think) by modifying the SerializationHeader.toHeader() method where it currently throws an UnknownColumnException. If we can, instead of throwing the exception, create a dropped column for the unexpected column then I think the code will work. I realise that to do this in the wild is not possible as it is a change to released code, but we could handle it going forward.

On Wed, Feb 22, 2023 at 11:21 PM Henrik Ingo wrote: ... ok apparently shift+enter sends messages now? I was just saying if at least the file format AND system tables - anything written to disk - can be protected with a switch, then it allows for quick downgrade by shutting down the entire cluster and restarting with the downgraded binary. It's a start. To be able to do that live in a distributed system needs to consider much more: gossip, streaming, drivers, and ultimately all features, because we don't want an application developer to use a shiny new thing that a) may not be available on all nodes, or b) may disappear if the cluster has to be downgraded later. henrik

On Thu, Feb 23, 2023 at 1:14 AM Henrik Ingo wrote: Just this once I'm going to be really brief :-) Just wanted to share for reference how MongoDB implemented downgradeability around their 4.4 version: https://www.mongodb.com/docs/manual/release-notes/6.0-downgrade-sharded-cluster/ Jeff you're right. Ultimately this is about more than file formats. However, ideally if at least the

On Mon, Feb 20, 2023 at 10:02 PM Jeff Jirsa wrote: I'm not even convinced even 8110 addresses this - just writing sstables in old versions won't help if we ever add things like new types or new types of collections without other control abilities. Claude's other email in another thread a few hours ago talks about some of these surprises - "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was added to system/local. This means that 3.1 system can not read the table as it has no definition for it. I tried marking the column for deletion in the metadata and in the serialization header. The latter got past the column not found problem, but I suspect that it just means that data columns after broadcast_port shifted and so incorrectly read."
- this is a harder problem to solve than just versioning sstables and network protocols. Stepping back a bit, we have downgrade ability listed as a goal, but it's not (as far as I can tell) universally enforced, nor is it clear at which point we will be able to concretely say "this release can be downgraded to X". Until we actually define and agree that this is a real goal with a concrete version where downgrade-ability becomes real, it feels like things are somewhat arbitrarily enforced, which is probably very frustrating for people trying to commit work/tickets.- JeffOn Mon, Feb 20, 2023 at 11:48 AM Dinesh Joshi wrote:I’m a big fan of maintaining backward compatibility. Downgradability implies that we could potentially roll back an upgrade at any time. While I don’t think we need to retain the ability to downgrade in perpetuity it would be a good objective to maintain strict backward compatibility and therefore downgradability until a certain point. This would imply versioning metadata and extending it in such a way that prior version(s) could continue functioning. This can certainly be expensive to implement and might bloat on-disk storage. However, we could always offer an option for the operator to optimize the on-disk structures for the current version then we can rewrite them in the latest ver
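For illustration, a minimal sketch of the workaround Claude describes above — tolerating a column the local schema doesn't know by treating it as dropped rather than failing the read; the types here are simplified stand-ins, not SerializationHeader itself:

    // Hypothetical sketch: when a serialization header names an unknown column,
    // synthesise a placeholder marked as dropped so its bytes can be skipped and
    // the remaining columns keep their correct positions.
    import java.util.LinkedHashMap;
    import java.util.Map;

    final class HeaderColumnResolver
    {
        static final class ColumnDef
        {
            final String name;
            final String type;
            final boolean dropped;

            ColumnDef(String name, String type, boolean dropped)
            {
                this.name = name;
                this.type = type;
                this.dropped = dropped;
            }
        }

        static Map<String, ColumnDef> resolve(Map<String, String> headerColumns, Map<String, ColumnDef> localSchema)
        {
            Map<String, ColumnDef> resolved = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : headerColumns.entrySet())
            {
                ColumnDef known = localSchema.get(entry.getKey());
                // Unknown column: keep a dropped placeholder instead of throwing.
                resolved.put(entry.getKey(), known != null ? known : new ColumnDef(entry.getKey(), entry.getValue(), true));
            }
            return resolved;
        }
    }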
Re: Downgradability
I don’t think there’s anything about a new format that requires a version bump, but I could be missing something.We have to have a switch to enable tries already don’t we? I’m pretty sure we haven’t talked about switching the default format?On 23 Feb 2023, at 12:12, Henrik Ingo wrote:On Thu, Feb 23, 2023 at 11:57 AM Benedict <bened...@apache.org> wrote:Can somebody explain to me why this is being fought tooth and nail, when the work involved is absolutely minimal?I don't know how each individual has been thinking about this, but it seems to me just looking at all the tasks that at least the introduction of tries is a major format change anyway - since it's the whole point - and therefore people working on other tasks may have assumed the format is changing anyway and therefore something like a switch (is this what is referred to as the C-8110 solution?) will take care of it for everyone.I'm not sure there's consensus that such a switch is a sufficient resolution to this discussion, but if there were such a consensus, the next question would be whether the patches that are otherwise ready now can merge, or whether they will all be blocked waiting for the compatibility solution first. And possibly better testing, etc. Letting them merge would be justified by the desire to have more frequent and smaller increments of work merged into trunk... well, I'm not going to repeat everything from that discussion but the same pro's and con's apply.henrik -- Henrik Ingoc. +358 40 569 7354 w. www.datastax.com
Re: Downgradability
Either way, it feels like this has become much more of a big deal than it needed to.I would prefer the pending patches to avoid breaking compatibility, as I think they can do it easily. But, if we agree to block release until we can double back to fix it with versioned writing (which I agree with Jacek are LHF - I think we literally just need a method that chooses the descriptor version) then let’s not further agonise over this.Alternatively I’d be happy to work with the authors to get this done alongside their work, as I don’t think it would hold anything up. We just need something to pick a descriptor besides latest on write, everything else is basically there for these patches.On 23 Feb 2023, at 16:37, Henrik Ingo wrote:Right. So I'm speculating everyone else who worked on a patch that breaks compatibility has been working under the mindset "I'll just put this behind the same switch". Or something more vague / even less correct, such as assuming that tries would become the default immediately.At least in my mind when we talk about the "switch to enable tries" I do also consider things like "don't break streaming". So I guess whether one considers "a switch" to exist already or not, might be subjective in this case, because people have different assumptions on the definition of done of such a switch.henrikOn Thu, Feb 23, 2023 at 2:53 PM Benedict <bened...@apache.org> wrote:I don’t think there’s anything about a new format that requires a version bump, but I could be missing something.We have to have a switch to enable tries already don’t we? I’m pretty sure we haven’t talked about switching the default format?On 23 Feb 2023, at 12:12, Henrik Ingo <henrik.i...@datastax.com> wrote:On Thu, Feb 23, 2023 at 11:57 AM Benedict <bened...@apache.org> wrote:Can somebody explain to me why this is being fought tooth and nail, when the work involved is absolutely minimal?I don't know how each individual has been thinking about this, but it seems to me just looking at all the tasks that at least the introduction of tries is a major format change anyway - since it's the whole point - and therefore people working on other tasks may have assumed the format is changing anyway and therefore something like a switch (is this what is referred to as the C-8110 solution?) will take care of it for everyone.I'm not sure there's consensus that such a switch is a sufficient resolution to this discussion, but if there were such a consensus, the next question would be whether the patches that are otherwise ready now can merge, or whether they will all be blocked waiting for the compatibility solution first. And possibly better testing, etc. Letting them merge would be justified by the desire to have more frequent and smaller increments of work merged into trunk... well, I'm not going to repeat everything from that discussion but the same pro's and con's apply.henrik -- Henrik Ingoc. +358 40 569 7354 w. www.datastax.com -- Henrik Ingoc. +358 40 569 7354 w. www.datastax.com
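A minimal sketch of the "pick a descriptor besides latest on write" step Benedict refers to above, assuming an operator-visible compatibility setting; the class and method names are hypothetical:

    // Hypothetical sketch: choose the sstable version to write based on whether
    // the operator has finalised the upgrade, so new nodes keep producing files
    // the previous release can read until compatibility mode is switched off.
    final class WriteVersionSelector
    {
        private final String latestVersion;             // e.g. the new format of this release
        private final String previousCompatibleVersion; // e.g. the last 4.x format
        private volatile boolean writeLatest;

        WriteVersionSelector(String latestVersion, String previousCompatibleVersion)
        {
            this.latestVersion = latestVersion;
            this.previousCompatibleVersion = previousCompatibleVersion;
        }

        String versionToWrite()
        {
            return writeLatest ? latestVersion : previousCompatibleVersion;
        }

        // Called once the operator is happy with the upgrade; after this, files
        // may no longer be readable by the older release.
        void finaliseUpgrade()
        {
            writeLatest = true;
        }
    }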
Re: [DISCUSS] Next release date
I agree, we shouldn’t be branching annually, we should be releasing annually - and we shouldn’t assume that will take six months. We should be aiming for 1-2mo and so long as our trajectory towards that is good, I don’t think there’s anything to worry about (and we’ll get our first comparative data point this release)

On 28 Feb 2023, at 16:02, Ekaterina Dimitrova wrote: By “not to have to delay again” I mean GA, not the agreement for when to branch and start feature freeze. Just to be clear as I realized I might be misunderstood :-)

On Tue, 28 Feb 2023 at 10:47, Ekaterina Dimitrova wrote: I agree that November-May falls too short, so +1 on postponing the day on my end. Now I think Mick brought up an excellent point - let’s get into a separate discussion about the root cause of releasing 4.1 only in December and see what we need to change/improve so we do not get into future delays and we do not sacrifice QA, of course. “'Postponing' suggests a one-off move, but I'm presuming this would be a permanent change?” I’d say move it, get to the bottom of how not to have to delay again or at least reduce the window, and commit.

On Tue, 28 Feb 2023 at 9:38, Mick Semb Wever wrote: We forked the 4.0 and 4.1 branches beginning of May. Unfortunately, for 4.1 we were only able to release GA in December, which impacted how much time we could spend focussing on the next release and the progress that we could make. As a consequence, I am wondering if it makes sense for us to branch 5.0 in May or if we should postpone that date. What is your opinion? My initial preference is to stick with the May branch and its initial alpha/beta release. Giving in to the delays doesn't improve the causes of them. We should focus on why it took 6 months to go from the 4.1 first alpha to GA and what happened inside that time window. I'm not convinced summer holidays are to blame. I think a lack of QA/CI and folk dedicating time to get it to GA is the bigger problem. On the QA/CI front I believe we have made significant improvements already. And we saw fewer releases of 4.1 before its GA. I also think reducing the focus and scope of the subsequent release cycle is a cost that creates the correct incentive, so long as we share the burden of the stabilising_to_GA journey. While it might be difficult for folk to commit their time over summer holidays, the three months of May-July should be way more than enough if we are serious about it. My thoughts don't touch on CEPs in flight. But my feeling is this should not be about what we want to "squeeze in" (which only makes the problem worse), rather whether the folk that are offering their time to stabilise to GA have a preference for May-July or their September-November. "Postponing" suggests a one-off move, but I'm presuming this would be a permanent change?
Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration
Another option would be to introduce a second class with the same fields as the first where we simply specify final for immutable fields, and construct it after parsing the Config.We could even generate the non-final version from the one with final fields.Not sure this would be nicer, but it is an alternative.On 1 Mar 2023, at 02:10, Ekaterina Dimitrova wrote:I agree with David that the annotations seem a bit too many but if I have to choose from the three approaches - the annotations one seems most reasonable to me and I didn’t have the chance to consider any others. Volatile seems fragile and unclear as a differentiator. I agreeOn Tue, 28 Feb 2023 at 17:47, Maxim Muzafarov <mmu...@apache.org> wrote:Folks, If there are no objections to the approach described in this thread, I'd like to proceed with this change. The change seems to be valuable for the upcoming release, so any comments are really appreciated. On Wed, 22 Feb 2023 at 21:51, David Capwell <dcapw...@apple.com> wrote: > > I guess back to the point of the thread, we need a way to know what configs are mutable for the settings virtual table, so need some way to denote that the config replica_filtering_protection.cached_rows_fail_threshold is mutable. Given the way that the yaml config works, we can’t rely on the presences of “final” or not, so need some way to mark a config is mutable for that table, does anyone want to offer feedback on what works best for them? > > Out of all proposals given so far “volatile” is the least verbose but also not explicit (as this thread is showing there is debate on if this should be present), new annotations are a little more verbose but would be explicit (no surprises), and getter/setters in different classes (such as DD) is the most verbose and suffers from not being explicit and ambiguity for mapping back to Config. > > Given the above, annotations sounds like the best option, but do we really want our config to look as follows? > > @Replaces(oldName = "native_transport_idle_timeout_in_ms", converter = Converters.MILLIS_DURATION_LONG, deprecated = true) > @Mutable > public DurationSpec.LongMillisecondsBound native_transport_idle_timeout = new DurationSpec.LongMillisecondsBound("0ms”); > @Mutable > public DurationSpec.LongMillisecondsBound transaction_timeout = new DurationSpec.LongMillisecondsBound("30s”); > @Mutable > public double phi_convict_threshold = 8.0; > public String partitioner; // assume immutable by default? > > > > On Feb 22, 2023, at 6:20 AM, Benedict <bened...@apache.org> wrote: > > > > Could you describe the issues? Config that is globally exposed should ideally be immutable with final members, in which case volatile is only necessary if you’re using the config parameter in a tight loop that you need to witness a new value - which shouldn’t apply to any of our config. > > > > There are some weird niches, like updating long values on some (unsupported by us) JVMs that may tear. Technically you also require it for visibility with the JMM. But in practice it is mostly unnecessary. Often what seems to be a volatile issue is really something else. > > > >> On 22 Feb 2023, at 13:18, Benjamin Lerer <b.le...@gmail.com> wrote: > >> > >> I have seen issues with some updatable parameters which were missing the volatile keyword. > >> > >> Le mer. 22 févr. 2023 à 11:36, Aleksey Yeshchenko <alek...@apple.com> a écrit : > >> FWIW most of those volatile fields, if not in fact all of them, should NOT be volatile at all. 
Someone started the trend and most folks have been copycatting or doing the same for consistency with the rest of the codebase. > >> > >> Please definitely don’t rely on that. > >> > >>> On 21 Feb 2023, at 21:06, Maxim Muzafarov <mmu...@apache.org> wrote: > >>> > >>> 1. Rely on the volatile keyword in front of fields in the Config class; > >>> > >>> I would say this is the most confusing option for me because it > >>> doesn't give us all the guarantees we need, and also: > >>> - We have no explicit control over what exactly we expose to a user. > >>> When we modify the JMX API, we're implementing a new method for the > >>> MBean, which in turn makes this action an explicit exposure; > >>> - The volatile keyword is not the only way to achieve thread safety, > >>> and looks strange for the public API design point; > >>> - A good example is the setEnableDropCompactStorage method, which > >>> changes the volatile field, but is only visible for testing purposes; > >> > >> >
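To illustrate the annotation-based option being discussed above, a small sketch of a hypothetical @Mutable marker plus the reflection scan a settings virtual table could use to decide which fields accept UPDATEs; none of these names are the project's actual API:

    // Hypothetical sketch: mark updatable config fields and discover them by
    // reflection rather than relying on the volatile keyword as the signal.
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;
    import java.lang.reflect.Field;
    import java.util.HashSet;
    import java.util.Set;

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Mutable {}

    class ExampleConfig
    {
        @Mutable public volatile double phi_convict_threshold = 8.0;
        public String partitioner; // immutable by default: no annotation
    }

    final class MutableSettings
    {
        // Returns the names of fields the settings virtual table may update.
        static Set<String> discover(Class<?> configClass)
        {
            Set<String> mutable = new HashSet<>();
            for (Field field : configClass.getDeclaredFields())
                if (field.isAnnotationPresent(Mutable.class))
                    mutable.add(field.getName());
            return mutable;
        }
    }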
Re: [DISCUSS] Next release date
It doesn’t look like we agreed to a policy of annual branch dates, only annual releases and that we would schedule this for 4.1 based on 4.0’s branch date. Given this was the reasoning proposed I can see why folk would expect this would happen for the next release. I don’t think there was a strong enough commitment here to be bound by it, if we think different maths would work better. I recall the goal for an annual cadence was to ensure we don’t have lengthy periods between releases like 3.x to 4.0, and to try to reduce the pressure certain contributors might feel to hit a specific release with a given feature. I think it’s better to revisit these underlying reasons and check how they apply than to pick a mechanism and stick to it too closely. The last release was quite recent, so we aren’t at risk of slow releases here. Similarly, there are some features that the *project* would probably benefit from landing prior to release, if this doesn’t push release back too far. > On 1 Mar 2023, at 13:38, Mick Semb Wever wrote: > > >> My thoughts don't touch on CEPs inflight. > > > > For the sake of broadening the discussion, additional questions I think > worthwhile to raise are… > > 1. What third parties, or other initiatives, are invested and/or working > against the May deadline? and what are their views on changing it? > 1a. If we push branching back to September, how confident are we that we'll > get to GA before the December Summit? > 2. What CEPs look like not landing by May that we consider a must-have this > year? > 2a. Is it just tail-end commits in those CEPs that won't make it? Can these > land (with or without a waiver) during the alpha phase? > 2b. If the final components to specified CEPs are not approved/appropriate > to land during alpha, would it be better if the project commits to a one-off > half-year release later in the year?
Re: Degradation of availability when using NTS and RF > number of racks
My view is that this is a pretty serious bug. I wonder if transactional metadata will make it possible to safely fix this for users without rebuilding (only via opt-in, of course). > On 7 Mar 2023, at 15:54, Miklosovic, Stefan wrote: > > Thanks everybody for the feedback. > > I think that emitting a warning upon keyspace creation (and alteration) > should be enough for starters. If somebody can not live without a 100% bullet > proof solution over time we might choose some approach from the offered ones. > As the saying goes there is no silver bullet. If we decide to implement that > new strategy, we would probably emit warnings anyway on NTS but it would be > already done so just the new strategy would be provided. > > > From: Paulo Motta > Sent: Monday, March 6, 2023 17:48 > To: dev@cassandra.apache.org > Subject: Re: Degradation of availability when using NTS and RF > number of > racks > > It's a bit unfortunate that NTS does not maintain the ability to lose a rack > without loss of quorum for RF > #racks > 2, since this can be easily achieved > by evenly placing replicas across all racks. > > Since RackAwareTopologyStrategy is a superset of NetworkTopologyStrategy, > can't we just use the new correct placement logic for newly created keyspaces > instead of having a new strategy? > > The placement logic would be backwards-compatible for RF <= #racks. On > upgrade, we could mark existing keyspaces with RF > #racks with > use_legacy_replica_placement=true to maintain backwards compatibility and log > a warning that the rack loss guarantee is not maintained for keyspaces > created before the fix. Old keyspaces with RF <= #racks would still work with > the new replica placement. The downside is that we would need to keep the old > NTS logic around, or we could eventually deprecate it and require users to > migrate keyspaces using the legacy placement strategy. > > Alternatively we could have RackAwareTopologyStrategy and fail NTS keyspace > creation for RF > #racks and indicate users to use RackAwareTopologyStrategy > to maintain the quorum guarantee on rack loss or set an override flag > "support_quorum_on_rack_loss=false". This feels a bit iffy though since it > could potentially confuse users about when to use each strategy. > > On Mon, Mar 6, 2023 at 5:51 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote: > Hi all, > > some time ago we identified an issue with NetworkTopologyStrategy. The > problem is that when RF > number of racks, it may happen that NTS places > replicas in such a way that when a whole rack is lost, we lose QUORUM and data > are not available anymore if QUORUM CL is used. > > To illustrate this problem, let's have this setup: > > 9 nodes in 1 DC, 3 racks, 3 nodes per rack. RF = 5. Then, NTS could place > replicas like this: 3 replicas in rack1, 1 replica in rack2, 1 replica in > rack3. Hence, when rack1 is lost, we do not have QUORUM. > > It seems to us that there is already some logic around this scenario (1) but > the implementation is not entirely correct: the existing logic does not compute > the replica placement correctly, so the above problem is not addressed. > > We created a draft here (2, 3) which fixes it. > > There is also a test which simulates this scenario. 
When I assign 256 tokens > to each node randomly (by same mean as generatetokens command uses) and I try > to compute natural replicas for 1 billion random tokens and I compute how > many cases there will be when 3 replicas out of 5 are inserted in the same > rack (so by losing it we would lose quorum), for above setup I get around 6%. > > For 12 nodes, 3 racks, 4 nodes per rack, rf = 5, this happens in 10% cases. > > To interpret this number, it basically means that with such topology, RF and > CL, when a random rack fails completely, when doing a random read, there is > 6% chance that data will not be available (or 10%, respectively). > > One caveat here is that NTS is not compatible with this new strategy anymore > because it will place replicas differently. So I guess that fixing this in > NTS will not be possible because of upgrades. I think people would need to > setup completely new keyspace and somehow migrate data if they wish or they > just start from scratch with this strategy. > > Questions: > > 1) do you think this is meaningful to fix and it might end up in trunk? > > 2) should not we just ban this scenario entirely? It might be possible to > check the configuration upon keyspace creation (rf > num of racks) and if we > see this is problematic we would just fail that query? Guardrail maybe? > > 3) people in the ticket mention writing "CEP" for this but I do not see any > reason to do so. It is just a strategy as any other.
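For intuition, a simplified sketch of the even-across-racks placement the proposed strategy aims for — at most ceil(RF / racks) replicas per rack, so losing one rack cannot remove a quorum. This ignores token ordering entirely and is not the actual patch:

    // Hypothetical sketch: assign replicas round-robin across racks so no rack
    // holds more than its fair share. With 3 racks and RF=5 this yields 2/2/1,
    // so any single rack failure still leaves 3 of 5 replicas (a quorum).
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    final class RackEvenPlacement
    {
        static List<String> place(Map<String, List<String>> nodesByRack, int rf)
        {
            List<String> replicas = new ArrayList<>();
            List<String> racks = new ArrayList<>(nodesByRack.keySet());
            for (int round = 0; replicas.size() < rf && round <= rf; round++)
            {
                for (String rack : racks)
                {
                    List<String> nodes = nodesByRack.get(rack);
                    if (round < nodes.size() && replicas.size() < rf)
                        replicas.add(nodes.get(round)); // round-th node in this rack
                }
            }
            return replicas;
        }
    }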
Re: hsqldb test dependency in simulator code
I’m sure we can use a different hash map there. > On 14 Mar 2023, at 11:49, Miklosovic, Stefan > wrote: > > Hi list, > > while removing Hadoop code in trunk, as agreed on ML recently, I did that but > we need to do this (1). By removing all Hadoop dependencies, we also removed > hsqldb library we surprisingly rely on so I am putting it back. > > Honestly, I do not know if people are even aware of the fact we are depending > on hsqldb in simulator package. Can not we rewrite that so we can drop this > dependency entirely? > > Do we want to spend time on rewriting this and then we remove Hadoop > libraries or I just add this library among the test ones? > > Thanks > > (1) https://github.com/apache/cassandra/pull/2212/files#r1133841928
Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk
Accord doesn’t have a hard dependency on CEP-21 fwiw, it just needs linearizable epochs. This could be achieved with a much more modest patch, essentially avoiding almost all of the insertion points of cep-21, just making sure that joining and leaving nodes update some state via Paxos instead of via gossip, and assign an epoch as part of the update. It would be preferable to use cep-21 since it introduces this functionality, and our intention is to use cep-21 for this. But it isn’t a hard dependency.

On 22 Mar 2023, at 20:58, Henrik Ingo wrote: Since Accord depends on transactional meta-data... is there really any alternative to what you propose? Sure, if there is some subset of Accord that could be merged, while work continues on a branch that is based on the CEP-21 branch, that would be great. Merging even a prototype of Accord to trunk probably has marketing value. (Don't laugh, many popular databases have had "atomic transactions, except if anyone executes DDL simultaneously".)

On Tue, Mar 14, 2023 at 8:39 PM Caleb Rackliffe wrote: We've already talked a bit about how and when the current Accord feature branch should merge to trunk. Earlier today, the cep-21-tcm branch was created to house ongoing work on Transactional Metadata. Returning to CASSANDRA-18196 (merging Accord to trunk) after working on some other issues, I want to propose changing direction slightly, and make sure this makes sense to everyone else.
1.) There are a few minor Accord test issues in progress that I'd like to wrap up before doing anything, but those shouldn't take long. (See CASSANDRA-18302 and CASSANDRA-18320.)
2.) Accord logically depends on Transactional Metadata.
3.) The new cep-21-tcm branch is going to have to be rebased to trunk on a regular basis.
So...
4.) I would like to pause merging cep-15-accord to trunk, and instead rebase it on cep-21-tcm until such time as the latter merges to trunk, at which point cep-15-accord can be rebased to trunk again and then merged when ready, nominally taking up the work of CASSANDRA-18196 again.
Any objections to this? -- Henrik Ingo, c. +358 40 569 7354, w. www.datastax.com
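As a toy model of the "linearizable epochs without full CEP-21" idea above — membership changes bumping an epoch through a compare-and-swap (e.g. backed by Paxos/LWT) rather than gossip — here is a sketch with a deliberately abstract register interface that is not Cassandra's actual Paxos code:

    // Hypothetical sketch: every join/leave advances a cluster-wide epoch via an
    // atomic, linearizable compare-and-swap instead of propagating via gossip.
    interface LinearizableRegister
    {
        long read();
        boolean compareAndSet(long expected, long proposed);
    }

    final class EpochService
    {
        private final LinearizableRegister register;

        EpochService(LinearizableRegister register)
        {
            this.register = register;
        }

        // Returns the epoch assigned to this membership change.
        long advanceForMembershipChange()
        {
            while (true)
            {
                long current = register.read();
                if (register.compareAndSet(current, current + 1))
                    return current + 1;
                // Lost the race to another membership change; retry on the newer epoch.
            }
        }
    }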
Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk
I agree, decoupling seems better to me. The fewer inter-effort dependencies, the less foot-treading.As Henrik mentioned, we don’t need a feature complete cep-15 to merge, as its integration is quite lightweight. We can leave some stuff to provide later, and depending on conditions at release time make a decision as to whether we release with cep-21, or with a minimal epoch facility (which would IMO not being a huge endeavour), or with it disabled altogether.On 24 Mar 2023, at 19:13, Josh McKenzie wrote:FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk while accepting the possible delivery risk.What does the chain of rebases against trunk look like here? cep-21-tcm rebase, then cep-15 on cep-21-tcm, then cep-7 on cep-21-tcm, then a race on whichever of 15 or 7 merge after 21 goes into trunk? Or 7 based on 15, the other way around...I'm not actively working on any of these branches so take my perspective with that appropriate grain of salt, but the coupling of these things seems to have it's own kind of breed of integration pain to upkeep over time depending on how frequently we're rebasing against trunk.the question we want to answer is whether or not we build a throwaway patch for linearizable epochsDo we have an informed opinion on how long we think this would take? Seems like that'd help clarify whether or not there's contributors with the bandwidth and desire to even do that or whether everyone depending on cep-21 is our option.On Fri, Mar 24, 2023, at 1:30 PM, Caleb Rackliffe wrote:I actually did a dry run rebase of cep-15-accord on top of cep-21-tcm here: https://github.com/apache/cassandra/pull/2227It wasn't too terrible, and I was actually able to get the main CQL-based Accord tests working as long as I disabled automatic forwarding of CAS and SERIAL read operations to Accord. The bigger issue was test stability in cep-21-tcm. I'm sure that will mature quickly here, and I created a few issues to attach to the Transactional Metadata epic.In the meantime, I rebased cep-15-accord on trunk at commit 3eb605b4db0fa6b1ab67b85724a9cfbf00aae7de. The option to finish the remaining bits of CASSANDRA-18196 and merge w/o TCM is still available, but it sounds like the question we want to answer is whether or not we build a throwaway patch for linearizable epochs in lieu of TCM?FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk while accepting the possible delivery risk.On Fri, Mar 24, 2023 at 9:32 AM Josh McKenzie <jmcken...@apache.org> wrote:making sure that joining and leaving nodes update some state via Paxos instead of via gossipWhat kind of a time delivery risk does coupling CEP-15 with CEP-21 introduce (i.e. unk-unk on CEP-21 leading to delay cascades to CEP-15)? Seems like having a table we CAS state for on epochs wouldn't be too challenging, but I'm not familiar w/the details so I could be completely off here.Being able to deliver both of these things on their own timetable seems like a pretty valuable thing assuming the lift required would be modest.On Fri, Mar 24, 2023, at 6:15 AM, Benedict wrote:Accord doesn’t have a hard dependency on CEP-21 fwiw, it just needs linearizable epochs. 
This could be achieved with a much more modest patch, essentially avoiding almost all of the insertion points of cep-21, just making sure that joining and leaving nodes update some state via Paxos instead of via gossip, and assign an epoch as part of the update.It would be preferable to use cep-21 since it introduces this functionality, and our intention is to use cep-21 for this. But it isn’t a hard dependency.On 22 Mar 2023, at 20:58, Henrik Ingo <henrik.i...@datastax.com> wrote:Since Accord depends on transactional meta-data... is there really any alternative than what you propose?Sure, if there is some subset of Accord that could be merged, while work continues on a branched that is based on CEP-21 branch, that would be great. Merging even a prototype of Accord to trunk probably has marketing value. (Don't laugh, many popular databases have had "atomic transactions, except if anyone executes DDL simultaneously".)On Tue, Mar 14, 2023 at 8:39 PM Caleb Rackliffe <calebrackli...@gmail.com> wrote:We've already talked a bit about how and when the current Accord feature branch should merge to trunk. Earlier today, the cep-21-tcm branch was created to house ongoing work on Transactional Metadata.Returning to CASSANDRA-18196 (merging Accord to trunk) after working on some other issues, I want to propose changing direction slightly, and make sure this makes sense to everyone else.1.) There are a few minor Accord test issues in progress that I'd like to wrap up before doing anything, but those shouldn't take long. (See CASSANDRA-18302 and CASSANDRA-18320.)2.) Accord logically depends on Transactional Metadata.3.) The new cep-21-tcm branch is going to have to be rebased
Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk
It’s not even clear such an effort would need to be different from that used by cep-21. The point is that there’s not much point litigating this now when we can keep our options open and better decouple the projects, since I don’t think we lose much this way.On 24 Mar 2023, at 19:58, David Capwell wrote:Assuming we do not release with it, then yes, as we wouldn’t need to maintain. My point for this case was that I don’t feel the time cost is worth it, I am not -1 if someone wants to add, was more saying our time is better off else where.We currently don’t touch Transactional Metadata, we have custom logic (which relies on client to tell every instance in the cluster an update happened) that we are using right now in tests and deployed clusters. So once we can integrate with Transactional Metadata, we stop relying on clients to tell us about topology changes… a custom/thrown away epoch generator would make this current process more user friendly, but would need to spend time to make sure stable when deployed on a clusterOn Mar 24, 2023, at 12:43 PM, Josh McKenzie wrote:If this is in a release, we then need to maintain that feature, so would be against it.Isn't the argument that cep-21 provides this so we could just remove the temporary impl and point to the new facility for this generation?On Fri, Mar 24, 2023, at 3:22 PM, David Capwell wrote:the question we want to answer is whether or not we build a throwaway patch for linearizable epochsIf this is in a release, we then need to maintain that feature, so would be against it.If this is for testing, then I would argue the current world is “fine”… current world is hard to use and brittle (users need to tell accord that the cluster changed), but if accord is rebasing on txn metadata then this won’t be that way long (currently blocked from doing that due to txn metadata not passing all tests yet).On Mar 24, 2023, at 12:12 PM, Josh McKenzie wrote:FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk while accepting the possible delivery risk.What does the chain of rebases against trunk look like here? cep-21-tcm rebase, then cep-15 on cep-21-tcm, then cep-7 on cep-21-tcm, then a race on whichever of 15 or 7 merge after 21 goes into trunk? Or 7 based on 15, the other way around...I'm not actively working on any of these branches so take my perspective with that appropriate grain of salt, but the coupling of these things seems to have it's own kind of breed of integration pain to upkeep over time depending on how frequently we're rebasing against trunk.the question we want to answer is whether or not we build a throwaway patch for linearizable epochsDo we have an informed opinion on how long we think this would take? Seems like that'd help clarify whether or not there's contributors with the bandwidth and desire to even do that or whether everyone depending on cep-21 is our option.On Fri, Mar 24, 2023, at 1:30 PM, Caleb Rackliffe wrote:I actually did a dry run rebase of cep-15-accord on top of cep-21-tcm here: https://github.com/apache/cassandra/pull/2227It wasn't too terrible, and I was actually able to get the main CQL-based Accord tests working as long as I disabled automatic forwarding of CAS and SERIAL read operations to Accord. The bigger issue was test stability in cep-21-tcm. I'm sure that will mature quickly here, and I created a few issues to attach to the Transactional Metadata epic.In the meantime, I rebased cep-15-accord on trunk at commit 3eb605b4db0fa6b1ab67b85724a9cfbf00aae7de. 
The option to finish the remaining bits of CASSANDRA-18196 and merge w/o TCM is still available, but it sounds like the question we want to answer is whether or not we build a throwaway patch for linearizable epochs in lieu of TCM?FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk while accepting the possible delivery risk.On Fri, Mar 24, 2023 at 9:32 AM Josh McKenzie <jmcken...@apache.org> wrote:making sure that joining and leaving nodes update some state via Paxos instead of via gossipWhat kind of a time delivery risk does coupling CEP-15 with CEP-21 introduce (i.e. unk-unk on CEP-21 leading to delay cascades to CEP-15)? Seems like having a table we CAS state for on epochs wouldn't be too challenging, but I'm not familiar w/the details so I could be completely off here.Being able to deliver both of these things on their own timetable seems like a pretty valuable thing assuming the lift required would be modest.On Fri, Mar 24, 2023, at 6:15 AM, Benedict wrote:Accord doesn’t have a hard dependency on CEP-21 fwiw, it just needs linearizable epochs. This could be achieved with a much more modest patch, essentially avoiding almost all of the insertion points of cep-21, just making sure that joining and leaving nodes update some state via Paxos instead of via gossip, and assign an epoch as part of the update.It would be preferable to use cep-21 since it int
Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk
I don’t know that TCM is a greater source of uncertainty. There is a degree of uncertainty about that :)I just think it’s better not to compound uncertainties, at least while it is not costly to avoid it.On 27 Mar 2023, at 08:27, Henrik Ingo wrote:Seems like this thread is the more appropriate one for Accord/TCM discussionIMO the priority here should be:1. Release CEP-15 as part of 5.0, this year, with or without CEP-21.2. Minimize work arising from porting between branches. (e.g. first onto CEP-21, then to trunk, or vice versa. But also then between 5.0 and trunk)3. Minimize work arising from temporary solutions that support goal #1If CEP-21 is the major source of uncertainty right now, then we should merge CEP-15 to trunk independently. If CEP-15 depending on CEP-21 just means an additional month (or two) delay, then that's fine and there's no need to do additional work just to get some preview release out earlier.henrikOn Sat, Mar 25, 2023 at 4:17 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote:I agree there’s little point in litigating right now, given test stability (or lack thereof) in cep-21-tcm. Eventually, though, I’m more or less aligned w/ David in the sense that getting ourselves planted on top of TCM as soon as possible is a good idea.On Mar 24, 2023, at 3:04 PM, Benedict <bened...@apache.org> wrote:It’s not even clear such an effort would need to be different from that used by cep-21. The point is that there’s not much point litigating this now when we can keep our options open and better decouple the projects, since I don’t think we lose much this way.On 24 Mar 2023, at 19:58, David Capwell <dcapw...@apple.com> wrote:Assuming we do not release with it, then yes, as we wouldn’t need to maintain. My point for this case was that I don’t feel the time cost is worth it, I am not -1 if someone wants to add, was more saying our time is better off else where.We currently don’t touch Transactional Metadata, we have custom logic (which relies on client to tell every instance in the cluster an update happened) that we are using right now in tests and deployed clusters. So once we can integrate with Transactional Metadata, we stop relying on clients to tell us about topology changes… a custom/thrown away epoch generator would make this current process more user friendly, but would need to spend time to make sure stable when deployed on a clusterOn Mar 24, 2023, at 12:43 PM, Josh McKenzie <jmcken...@apache.org> wrote:If this is in a release, we then need to maintain that feature, so would be against it.Isn't the argument that cep-21 provides this so we could just remove the temporary impl and point to the new facility for this generation?On Fri, Mar 24, 2023, at 3:22 PM, David Capwell wrote:the question we want to answer is whether or not we build a throwaway patch for linearizable epochsIf this is in a release, we then need to maintain that feature, so would be against it.If this is for testing, then I would argue the current world is “fine”… current world is hard to use and brittle (users need to tell accord that the cluster changed), but if accord is rebasing on txn metadata then this won’t be that way long (currently blocked from doing that due to txn metadata not passing all tests yet).On Mar 24, 2023, at 12:12 PM, Josh McKenzie <jmcken...@apache.org> wrote:FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk while accepting the possible delivery risk.What does the chain of rebases against trunk look like here? 
cep-21-tcm rebase, then cep-15 on cep-21-tcm, then cep-7 on cep-21-tcm, then a race on whichever of 15 or 7 merge after 21 goes into trunk? Or 7 based on 15, the other way around...I'm not actively working on any of these branches so take my perspective with that appropriate grain of salt, but the coupling of these things seems to have it's own kind of breed of integration pain to upkeep over time depending on how frequently we're rebasing against trunk.the question we want to answer is whether or not we build a throwaway patch for linearizable epochsDo we have an informed opinion on how long we think this would take? Seems like that'd help clarify whether or not there's contributors with the bandwidth and desire to even do that or whether everyone depending on cep-21 is our option.On Fri, Mar 24, 2023, at 1:30 PM, Caleb Rackliffe wrote:I actually did a dry run rebase of cep-15-accord on top of cep-21-tcm here: https://github.com/apache/cassandra/pull/2227It wasn't too terrible, and I was actually able to get the main CQL-based Accord tests working as long as I disabled automatic forwarding of CAS and SERIAL read operations to Accord. The bigger issue was test stability in cep-21-tcm. I'm sure that will mature quickly here, and I created a few issues to attach to the Transactional Metadata epic.In the meantime, I rebased cep-15-accord on trunk at commit 3eb605b4db0fa6b1ab6
Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics
Fwiw I’m sceptical of the performance angle long term. You can do a lot more to control QoS when you understand what each query is doing, and what your SLOs are. You can also more efficiently apportion your resources (not leaving any lying fallow to ensure it’s free later) But, we’re a long way from that. My personal view of the sidecar is to offer these sorts of facilities more rapidly than we might in Cassandra proper, but that we might eventually (when mature enough and Cassandra is ready for it) bring them in process. Certainly, managing consistency (repair etc) and serving bulk operations should *long term* live in Cassandra IMO. But that isn’t the state of the world today, so I support a separate process. Though, I am nervous about the issues Jeremiah raises - we need to ensure we are not tightly coupling things and creating new problems. Managing other processes reliably and promptly seeing sstable changes and memtable flushes isn’t something that would be pretty, and we should probably offer weak guarantees about what’s visible when - ideally the sidecar would rely on file system watch notifications, or perhaps at most some fsync like functionality for flushing memtables. > On 28 Mar 2023, at 16:09, Joseph Lynch wrote: > > >> >> If we want to bring groups/containers/etc into the default deployment >> mechanisms of C*, great. I am all for dividing it up into micro services >> given we solve all the problems I listed in the complexity section. >> >> I am actually all for dividing C* up into multiple micro services, but the >> project needs to buy in to containers as the default mechanism for running >> it for that to be viable in my mind. > > I was under the impression that with CEP-1 the project did buy into > the direction of moving the workloads that are non-latency sensitive > out of the main process? At the time of the discussion folks mentioned > repair, bulk workloads, backup, restore, compaction etc ... as all > possible things we would like to extract over time to the sidecar. > > I don't think we want to go full on micro services, with like 12 > processes all handling one thing, but 2 seems like a good step? One > for latency sensitive requests (reads/writes - the current process), > and one for non latency sensitive requests (control plane, bulk work, > etc ... - the sidecar). > > -Joey
Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics
I disagree with the first claim, as the process has all the information it chooses to utilise about which resources it’s using and what it’s using those resources for.The inability to isolate GC domains is something we cannot address, but also probably not a problem if we were doing everything with memory management as well as we could be.But, not worth derailing this thread for. Today we do very little well on this front within the process, and a separate process is well justified given the state of play.On 28 Mar 2023, at 16:38, Derek Chen-Becker wrote:On Tue, Mar 28, 2023 at 9:03 AM Joseph Lynch wrote:... I think we might be underselling how valuable JVM isolation is, especially for analytics queries that are going to pass the entire dataset through heap somewhat constantly. Big +1 here. The JVM simply does not have significant granularity of control for resource utilization, but this is explicitly a feature of separate processes. Add in being able to separate GC domains and you can avoid a lot of noisy neighbor in-VM behavior for the disparate workloads.Cheers,Derek -- Derek Chen-Becker | GPG key available at https://keybase.io/dchenbecker and https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | Fingerprint: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC
Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE
KEYSPACE is fine. If we want to introduce a standard nomenclature like DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no benefit. I think it would be fine to introduce some arbitrary unrelated concept for assigning tables with similar behaviours some configuration that is orthogonal to replication, but that should be a different discussion about how we evolve config. > On 6 Apr 2023, at 09:40, Mick Semb Wever wrote: > > >> Something like "TABLESPACE" or 'TABLEGROUP" would theoretically better >> satisfy point 1 and 2 above but subjectively I kind of recoil at both >> equally. So there's that. > > > > TABLEGROUP would work for me. Immediately intuitive. > > brain-storming… > > A keyspace today defines replication strategy, rf, and durable_writes. If > they also had the table options that could be defined as defaults for all > tables in that group, and one tablegroup could be a child and inherit > settings from another tablegroup, you could logically group tables in ways > that both benefit your application platform's taxonomy and the spread of > keyspace/table settings. DATABASE, NAMESPACE, whatever, can be aliases to it > too, if you like. > > >
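For illustration only, here is a rough sketch of what Mick's brainstormed grouping-and-inheritance idea could look like in CQL. The TABLEGROUP keyword, the INHERITS clause, and the option names below are hypothetical, invented for this sketch; none of it is proposed or implemented syntax:

-- Hypothetical: a group carrying replication plus default table options
CREATE TABLEGROUP analytics
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
  AND default_compaction = {'class': 'UnifiedCompactionStrategy'};

-- Hypothetical: a child group inheriting from its parent, overriding one default
CREATE TABLEGROUP analytics_hot INHERITS analytics
  WITH default_compaction = {'class': 'LeveledCompactionStrategy'};

-- Tables would then be created inside a group rather than a keyspace
CREATE TABLE analytics_hot.events (id uuid PRIMARY KEY, payload blob);

Aliases such as DATABASE or NAMESPACE could, as suggested above, simply map onto the same construct.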
Re: [DISCUSS] Next release date
Finally, I expect most Europeans to be on vacation 33% of that time. Non-Europeans may want to try it too!The more northerly Europeans maybe :)On 18 Apr 2023, at 01:24, Henrik Ingo wrote:Trying to collect a few loose ends from across this thread> I'm receptive to another definition of "stabilize", I think the stabilization period implies more than just CI, which is mostly a function of unit tests working correctly. For example, at Datastax we have run a "large scale" test with >100 nodes, over several weeks, both for 4.0 and 4.1. For obvious reasons such tests can't run in nightly CI builds.Also it is not unusual that during the testing phase developers or specialized QA engineers can develop new tests (which are possibly added to CI) to improve coverage, especially targeting new features in the release. For example the fixes to Paxos v2 were found by such work before 4.1.Finally, maybe it's a special case relevant only for this release, but as a significant part of the Datastax team has been focused on porting these large existing features from DSE and getting them merged before the original May date, we also have tens of bug fixes waiting to be upstreamed too. (It used to be an even 100, but I'm unsure what the count is today.)In fact! If you are worried about how to occupy yourself between a May "soft freeze" and September'ish hard freeze, you are welcome to chug on that backlog. The bug fixes are already public and ASL licensed, in the 4.0 based branch here.> 3a. If we allow merge of CEP-15 / CEP-21 after branch, we risk invalidating stabilization and risk our 2023 GA dateI think this is the assumption that I personally disagree with. If this is true, why do we even bother running any CI before the CEP-21 merge? It will all be invalidated anyway, right?In my experience, it is beneficial to test as early as possible, and at different checkpoints during development. If we wouldn't do it, and we find some issue in late November, then the window to search for the commit that introduced the regression is all the way back to the 4.1 GA. If on the other hand the same test was already run during the soft freeze, then we know that we may focus our search onto CEP-15 and CEP-21.> get comfortable with cutting feature previews or snapshot alphas like we agreed to for earlier access to new stuffSnapshots are in fact a valid compromise proposal: A snapshot would provide a constant version / point in time to focus testing on, but on the other hand would allow trunk (or the 5.0 branch, in other proposals) to remain open to new commits. Somewhat "invalidating" the testing work, but presumably the branch will be relatively calm anyway. Which leads me to 2 important questions:WHO would be actively merging things into 5.0 during June-August? By my count at that point I expect most contributors to either furiously work on Accord and TCM, or work on stabilization (tests, fixes).Also, if someone did contribute new feature code during this time, they might find it hard to get priority for reviews, if others are focused on the above tasks.Finally, I expect most Europeans to be on vacation 33% of that time. Non-Europeans may want to try it too!WHAT do we expect to get merged during June-August?Compared to the tens of thousands of lines of code being merged by Accord, SAI, UCS and Tries... 
I imagine even the worst case during a non-freeze in June-August would be just a tiny percentage of the large CEPs.In this thread I only see Paulo announcing an intent to commit against trunk during a soft freeze, and even he agrees with a 5.0 branch freeze.This last question is basically a form of saying I hope we aren't discussing a problem that doesn't even exist?henrik -- Henrik Ingo | c. +358 40 569 7354 | w. www.datastax.com
Re: Adding vector search to SAI with hierarchical navigable small world graph index
We probably at least need to bike shed naming as we already have FLOAT, DOUBLE, and LIST - which are similar/overlapping types, so we should be consistent.If we introduce FLOAT32 we probably need that to be an alias of FLOAT and introduce FLOAT64 to alias DOUBLE for consistency.DENSE seems to just be an array? So very similar to a frozen list, but with a fixed size?On 26 Apr 2023, at 06:53, Mick Semb Wever wrote:I was so happy when I saw this, I know many users are going to be thrilled about it.On Wed, 26 Apr 2023 at 05:15, Patrick McFadin wrote:Not sure if this is what you are saying, Josh, but I believe this needs to be its own CEP. It's a change in CQL syntax and changes how clusters operate. The change needs to be documented and voted on. Jonathan, you know how to find me if you want me to help write it. :) I'd be fine with just a DISCUSS thread to agree to the CQL change, since it (`DENSE FLOAT32`) appears to be minimal, and the overall patch builds on SAI. As Henrik mentioned there are other SAI extensions being added too without CEPs. Can you elaborate on how you see this changing how the cluster operates?This will be easier to decide once we have a patch to look at, but that depends on a CEP-7 base (e.g. no feature branch exists). If we do want a CEP we need to allow a few weeks to get it through, but that can happen in parallel and maybe drafting up something now will be valuable anyway for an eventual CEP that proposes the more complete features (e.g. cosine_similarity(…)).
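To make the naming concern concrete, a small sketch of how the proposed spellings would sit next to today's types. The FLOAT32/FLOAT64 aliases and the dimension notation on DENSE FLOAT32 are assumptions made for illustration, not agreed syntax:

-- Existing CQL types
CREATE TABLE items_today (
    id uuid PRIMARY KEY,
    score float,                      -- 32-bit IEEE 754
    precise_score double,             -- 64-bit IEEE 754
    embedding frozen<list<float>>     -- variable length, frozen
);

-- Hypothetical aliases and proposed dense type (illustrative only)
CREATE TABLE items_proposed (
    id uuid PRIMARY KEY,
    score float32,                    -- assumed alias of FLOAT
    precise_score float64,            -- assumed alias of DOUBLE
    embedding DENSE FLOAT32[384]      -- fixed size, effectively a frozen list of length 384
);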
Re: [DISCUSS] New data type for vector search
That’s a bounded ring buffer, not a fixed length array.This definitely isn’t a tuple because the types are all the same, which is pretty crucial for matrix operations. Matrix libraries generally work on arrays of known dimensionality, or sparse representations.Whether we draw any semantic link between the frozen list and whatever we do here, it is fundamentally a frozen list with a restriction on its size. What we’re defining here are “statically” sized arrays, whereas a frozen list is essentially a dynamically sized array.I do not think vector is a good name because vector is used in some other popular languages to mean a (dynamic) list, which is confusing when we also have a list concept.I’m fine with just using the FLOAT[N] syntax, and drawing no direct link with list. Though it is a bit strange that this particular type declaration looks so different to other collection types.On 27 Apr 2023, at 16:48, Jeff Jirsa wrote:On Thu, Apr 27, 2023 at 7:39 AM Jonathan Elliswrote:It's been a while, so I may be missing something, but do we already have fixed-size lists? If not, I don't see why we'd try to make this fit into a List-shaped problem.We do not. The proposal got closed as wont-fix https://issues.apache.org/jira/browse/CASSANDRA-9110
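The frozen-list comparison can be shown with CQL that already works today; the commented-out FLOAT[3] column is the hypothetical statically sized form under discussion, not implemented syntax:

-- Works today: a dynamically sized, frozen array of floats
CREATE TABLE readings (
    id uuid PRIMARY KEY,
    vals frozen<list<float>>
);
INSERT INTO readings (id, vals) VALUES (uuid(), [0.1, 0.2, 0.3]);   -- any length is accepted

-- Hypothetical: the same shape with a fixed length enforced on write
-- CREATE TABLE readings_fixed (
--     id uuid PRIMARY KEY,
--     vals FLOAT[3]   -- writes of any other length would be rejected
-- );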
Re: [DISCUSS] New data type for vector search
This feature may be targeting ML users but it isn’t part of some “ML plug-in”; it’s a general purpose type available to all users that happens to permit the use of ANN. So it needs to make sense in a general context, not just to ML users.I also doubt users will struggle with understanding an array or similar type, but vector isn’t a hill I’m going to die on.Frozen, though, should be a requirement for this type and thereby implied by whatever name or syntax we conjure up. Implementing a fixed-size non-frozen type is much more complex, perhaps even nonsensical.Otherwise I broadly agree with David, and maybe lean towards stipulating nullability explicitly.On 28 Apr 2023, at 01:49, Josh McKenzie wrote:From a machine learning perspective, vectors are a well-known concept that are effectively immutable fixed-length n-dimensional values that are then later used either as part of a model or in conjunction with a model after the fact.While we could have this be non-frozen and not call it a vector, I'd be inclined to still make the argument for a layer of syntactic sugar on top that met ML users where they were with concepts they understood rather than forcing them through the cognitive lift of figuring out the Cassandra specific contortions to replicate something that's ubiquitous in their space. We did the same "Cassandra-first" approach with our JSON support and that didn't do us any favors in terms of adoption and usage as far as I know.So is the goal here to provide something specific and idiomatic for the ML community or is the goal to make a primitive that's C*-centric that then another layer can write to? I personally argue for the former; I don't see this specific data type going away any time soon.On Thu, Apr 27, 2023, at 12:39 PM, David Capwell wrote:but as you point out it has the problem of allowing nulls.If nulls are not allowed for the elements, then either we need a) a new type, or b) add some way to say elements may not be null…. As much as I do like b, I am leaning towards a new type for this use case.So, to flesh out the type requirements I have seen so far: 1) represents a fixed size array of element type* on write path we will need to validate this2) element may not be null* on write path we will need to validate this3) “frozen” (is this really a requirement for the type or is this just simpler for the ANN work? I feel that this shouldn’t be a requirement)4) works for all types (my requirement; original proposal is float only, but could logically expand to primitive types)Anything else?The key thing about a vector is that unlike lists or tuples you really don't care about individual elements, you care about doing vector and matrix multiplications with the thing as a unit. That may be true for this use case, but “should” this be true for the type itself? I feel like no… if a user wants the Nth element of a vector why would we block them? I am not saying the first patch, or even 5.0 adds support for index access, I am just trying to push back saying that the type should not block this.(Maybe this is making the case for VECTOR FLOAT[N] rather than FLOAT VECTOR[N].)Now that nulls are not allowed, I have mixed feelings about FLOAT[N], I prefer this syntax but that limitation may not be desired for all use cases… we could always add LIST and ARRAY later to address that case.In terms of syntax I have seen, here is my ordered preference:1) TYPE[size] - have mixed feelings due to non-null, but still prefer it2) QUALIFIER TYPE[size] - QUALIFIER is just a Term we use to denote this semantic…. 
Could even be NON NULL TYPE[size]On Apr 27, 2023, at 9:00 AM, Benedict wrote:That’s a bounded ring buffer, not a fixed length array.This definitely isn’t a tuple because the types are all the same, which is pretty crucial for matrix operations. Matrix libraries generally work on arrays of known dimensionality, or sparse representations.Whether we draw any semantic link between the frozen list and whatever we do here, it is fundamentally a frozen list with a restriction on its size. What we’re defining here are “statically” sized arrays, whereas a frozen list is essentially a dynamically sized array.I do not think vector is a good name because vector is used in some other popular languages to mean a (dynamic) list, which is confusing when we also have a list concept.I’m fine with just using the FLOAT[N] syntax, and drawing no direct link with list. Though it is a bit strange that this particular type declaration looks so different to other collection types.On 27 Apr 2023, at 16:48, Jeff Jirsa wrote:On Thu, Apr 27, 2023 at 7:39 AM Jonathan Ellis <jbel...@gmail.com> wrote:It's been a while, so I may be missing something, but do we already have fixed-size lists? If not, I don't see why we'd try to make this fit into a List-shaped problem.We do not. The proposal got closed as wont-fix https://issues.apache.org/jira/browse/CASSANDRA-9110
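For readability comparison only, here is how the syntax candidates David lists might appear in a table definition; all three forms are hypothetical and the element counts are arbitrary:

-- Option 1: TYPE[size]
CREATE TABLE t1 (id uuid PRIMARY KEY, v FLOAT[1536]);

-- Option 2: a qualifier term carrying the non-null / fixed-size semantics
CREATE TABLE t2 (id uuid PRIMARY KEY, v NON NULL FLOAT[1536]);

-- The same shape applied to a non-float element type, per the "works for all types" requirement
CREATE TABLE t3 (id uuid PRIMARY KEY, tags TEXT[4]);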
Re: [DISCUSS] New data type for vector search
But you’re proposing introducing a general purpose type - this isn’t an ML plug-in, it’s modifying the core language in a manner that makes targeting your workload easier. Which is fine, but that means you have to consider its impact on the general language, not just your target use case.On 28 Apr 2023, at 16:29, Jonathan Ellis wrote:That's exactly right.In particular it makes no sense at all from an ML perspective to have vector types of anything other than numerics. And as I mentioned in the POC thread (but I did not mention here), float is overwhelmingly the most frequently used vector type, to the point that Pinecone (by far the most popular vector search engine) ONLY supports that type.Lucene and Elastic also add support for vectors of bytes (8-bit ints), which are useful for optimizing models that you have already built with floats, but we have no reasonable path towards supporting indexing and searches against any other vector type.So in order of what makes sense to me:1. Add a vector type for just floats; consider adding bytes later if demand materializes. This gives us 99% of the value and limits the scope so we can deliver quickly.2. Add a vector type for floats or bytes. This gives us another 1% of value in exchange for an extra 20% or so of effort.3. Add a vector type for all numeric primitives, but you can only index floats and bytes. I think this is confusing to users and a bad idea.4. Add a vector type that composes with all Cassandra types. I can't see a reason to do this, nobody wants it, and we killed the most similar proposal in the past as wontfix.On Thu, Apr 27, 2023 at 7:49 PM Josh McKenzie <jmcken...@apache.org> wrote:From a machine learning perspective, vectors are a well-known concept that are effectively immutable fixed-length n-dimensional values that are then later used either as part of a model or in conjunction with a model after the fact.While we could have this be non-frozen and not call it a vector, I'd be inclined to still make the argument for a layer of syntactic sugar on top that met ML users where they were with concepts they understood rather than forcing them through the cognitive lift of figuring out the Cassandra specific contortions to replicate something that's ubiquitous in their space. We did the same "Cassandra-first" approach with our JSON support and that didn't do us any favors in terms of adoption and usage as far as I know.So is the goal here to provide something specific and idiomatic for the ML community or is the goal to make a primitive that's C*-centric that then another layer can write to? I personally argue for the former; I don't see this specific data type going away any time soon.On Thu, Apr 27, 2023, at 12:39 PM, David Capwell wrote:but as you point out it has the problem of allowing nulls.If nulls are not allowed for the elements, then either we need a) a new type, or b) add some way to say elements may not be null…. As much as I do like b, I am leaning towards new type for this use case.So, to flesh out the type requirements I have seen so far1) represents a fixed size array of element type* on write path we will need to validate this2) element may not be null* on write path we will need to validate this3) “frozen” (is this really a requirement for the type or is this just simpler for the ANN work? 
I feel that this shouldn’t be a requirement)4) works for all types (my requirement; original proposal is float only, but could logically expand to primitive types)Anything else?The key thing about a vector is that unlike lists or tuples you really don't care about individual elements, you care about doing vector and matrix multiplications with the thing as a unit. That maybe true for this use case, but “should” this be true for the type itself? I feel like no… if a user wants the Nth element of a vector why would we block them? I am not saying the first patch, or even 5.0 adds support for index access, I am just trying to push back saying that the type should not block this.(Maybe this is making the case for VECTOR FLOAT[N] rather than FLOAT VECTOR[N].)Now that nulls are not allowed, I have mixed feelings about FLOAT[N], I prefer this syntax but that limitation may not be desired for all use cases… we could always add LIST and ARRAY later to address that case.In terms of syntax I have seen, here is my ordered preference:1) TYPE[size] - have mixed feelings due to non-null, but still prefer it2) QUALIFIER TYPE[size] - QUALIFIER is just a Term we use to denote this semantic…. Could even be NON NULL TYPE[size]On Apr 27, 2023, at 9:00 AM, Benedict <bened...@apache.org> wrote:That’s a bounded ring buffer, not a fixed length array.This definitely isn’t a tuple because the types are all the same, which is pretty crucial for matrix operations. Matrix libraries generally work on arrays of known dimensionality, or sparse representations.Whether we draw any semantic link between the fro
Re: [DISCUSS] New data type for vector search
pgvector is a plug-in. If you were proposing a plug-in you could ignore these considerations.On 28 Apr 2023, at 16:58, Jonathan Ellis wrote:I'm proposing a vector data type for ML use cases. It's not the same thing as an array or a list and it's not supposed to be.While it's true that it would be possible to build a vector type on top of an array type, it's not necessary to do it that way, and given the lack of interest in an array type for its own sake I don't see why we would want to make that a requirement.It's relevant that pgvector, which among the systems offering vector search is based on the most similar system to Cassandra in terms of its query language, adds a vector data type that only supports floats *even though postgresql already has an array data type* because the semantics are different. Random access doesn't make sense, string and collection and other datatypes don't make sense, typical ordered indexes don't make sense, etc. It's just a different beast from arrays, for a different use case.On Fri, Apr 28, 2023 at 10:40 AM Benedict <bened...@apache.org> wrote:But you’re proposing introducing a general purpose type - this isn’t an ML plug-in, it’s modifying the core language in a manner that makes targeting your workload easier. Which is fine, but that means you have to consider its impact on the general language, not just your target use case.On 28 Apr 2023, at 16:29, Jonathan Ellis <jbel...@gmail.com> wrote:That's exactly right.In particular it makes no sense at all from an ML perspective to have vector types of anything other than numerics. And as I mentioned in the POC thread (but I did not mention here), float is overwhelmingly the most frequently used vector type, to the point that Pinecone (by far the most popular vector search engine) ONLY supports that type.Lucene and Elastic also add support for vectors of bytes (8-bit ints), which are useful for optimizing models that you have already built with floats, but we have no reasonable path towards supporting indexing and searches against any other vector type.So in order of what makes sense to me:1. Add a vector type for just floats; consider adding bytes later if demand materializes. This gives us 99% of the value and limits the scope so we can deliver quickly.2. Add a vector type for floats or bytes. This gives us another 1% of value in exchange for an extra 20% or so of effort.3. Add a vector type for all numeric primitives, but you can only index floats and bytes. I think this is confusing to users and a bad idea.4. Add a vector type that composes with all Cassandra types. I can't see a reason to do this, nobody wants it, and we killed the most similar proposal in the past as wontfix.On Thu, Apr 27, 2023 at 7:49 PM Josh McKenzie <jmcken...@apache.org> wrote:From a machine learning perspective, vectors are a well-known concept that are effectively immutable fixed-length n-dimensional values that are then later used either as part of a model or in conjunction with a model after the fact.While we could have this be non-frozen and not call it a vector, I'd be inclined to still make the argument for a layer of syntactic sugar on top that met ML users where they were with concepts they understood rather than forcing them through the cognitive lift of figuring out the Cassandra specific contortions to replicate something that's ubiquitous in their space. 
We did the same "Cassandra-first" approach with our JSON support and that didn't do us any favors in terms of adoption and usage as far as I know.So is the goal here to provide something specific and idiomatic for the ML community or is the goal to make a primitive that's C*-centric that then another layer can write to? I personally argue for the former; I don't see this specific data type going away any time soon.On Thu, Apr 27, 2023, at 12:39 PM, David Capwell wrote:but as you point out it has the problem of allowing nulls.If nulls are not allowed for the elements, then either we need a) a new type, or b) add some way to say elements may not be null…. As much as I do like b, I am leaning towards new type for this use case.So, to flesh out the type requirements I have seen so far1) represents a fixed size array of element type* on write path we will need to validate this2) element may not be null* on write path we will need to validate this3) “frozen” (is this really a requirement for the type or is this just simpler for the ANN work? I feel that this shouldn’t be a requirement)4) works for all types (my requirement; original proposal is float only, but could logically expand to primitive types)Anything else?The key thing about a vector is that unlike lists or tuples you really don't care about individual elements, you care about doing vector and matrix multiplications with the thing as a unit. That maybe true for this use case, but “should” this be true
Re: [DISCUSS] New data type for vector search
I and others have claimed that an array concept will work, since it is isomorphic with a vector. I have seen the following counterclaims:1. Vectors don’t need to support index lookups2. Vectors don’t need to support ordered indexes3. Vectors don’t need to support other types besides floatNone of these say that vectors are not arrays. At most these say “ANN indexes should only support float types” which is different, and not something I would dispute.If the claim is "there is no concept of arrays that is compatible with vector search" then let’s focus on that, because that is probably the initial source of the disconnect.On 28 Apr 2023, at 18:13, Henrik Ingo wrote:Benedict, I don't quite see why that matters? The argument is merely that this kind of vector, for this use case, a) is different from arrays, and b) arrays apparently don't serve the use case well enough (or at all).Now, if from the above it follows a discussion that a vector type cannot be a first class Cassandra type... that is of course a possible argument. But suggesting that Jonathan should work on implementing general purpose arrays seems to fall outside the scope of this discussion, since the result of such work wouldn't even fill the need Jonathan is targeting for here. I could also ask Jonathan to work on a JSONB data type, and it similarly would not be an interesting proposal to Jonathan, as it wouldn't fill the need for the specific use case he is targeting.But back to the main question... Why wouldn't a "vector for floats" type be general purpose enough that it should be delegated to some plugin? Machine Learning is a broad field in itself, with dozens of algorithms you could choose to use to build an AI model. And AI can be used in pretty much every industry vertical. If anything, I would claim DECIMAL is much more an industry specific special case type than these ML vectors would be. Back to Jonathan:>So in order of what makes sense to me:> 1. Add a vector type for just floats; consider adding bytes later if demand materializes. This gives us 99% of the value and limits the scope so we can deliver quickly.> 2. Add a vector type for floats or bytes. This gives us another 1% of value in exchange for an extra 20% or so of effort.Is it possible to implement 1 in a way that makes 2 possible in a future version?henrikhenrikOn Fri, Apr 28, 2023 at 7:33 PM Benedict <bened...@apache.org> wrote:pgvector is a plug-in. If you were proposing a plug-in you could ignore these considerations.On 28 Apr 2023, at 16:58, Jonathan Ellis <jbel...@gmail.com> wrote:I'm proposing a vector data type for ML use cases. It's not the same thing as an array or a list and it's not supposed to be.While it's true that it would be possible to build a vector type on top of an array type, it's not necessary to do it that way, and given the lack of interest in an array type for its own sake I don't see why we would want to make that a requirement.It's relevant that pgvector, which among the systems offering vector search is based on the most similar system to Cassandra in terms of its query language, adds a vector data type that only supports floats *even though postgresql already has an array data type* because the semantics are different. Random access doesn't make sense, string and collection and other datatypes don't make sense, typical ordered indexes don't make sense, etc. 
It's just a different beast from arrays, for a different use case.On Fri, Apr 28, 2023 at 10:40 AM Benedict <bened...@apache.org> wrote:But you’re proposing introducing a general purpose type - this isn’t an ML plug-in, it’s modifying the core language in a manner that makes targeting your workload easier. Which is fine, but that means you have to consider its impact on the general language, not just your target use case.On 28 Apr 2023, at 16:29, Jonathan Ellis <jbel...@gmail.com> wrote:That's exactly right.In particular it makes no sense at all from an ML perspective to have vector types of anything other than numerics. And as I mentioned in the POC thread (but I did not mention here), float is overwhelmingly the most frequently used vector type, to the point that Pinecone (by far the most popular vector search engine) ONLY supports that type.Lucene and Elastic also add support for vectors of bytes (8-bit ints), which are useful for optimizing models that you have already built with floats, but we have no reasonable path towards supporting indexing and searches against any other vector type.So in order of what makes sense to me:1. Add a vector type for just floats; consider adding bytes later if demand materializes. This gives us 99% of the value and limits the scope so we can deliver quickly.2. Add a vector type for floats or bytes. This gives us another 1% of value in exchange for an extra 20% or so of effort.3. Add a vector type for all numeric
Re: [DISCUSS] New data type for vector search
Has anybody yet claimed it would be hard? Several folk seem ready to jump to the conclusion that this would be onerous, but as somebody with a good understanding of the storage layer I can assert with reasonable confidence that it would not be. As previously stated, the implementation largely already exists for frozen lists. If we are going to let difficulty of implementation inform our CQL evolution, my view is that the bar for additional difficulty should be high, as CQL changes need to be well considered as they are not easily revisited - bad decisions survive indefinitely. The alternative as David points out is a plug-in system. So, maybe let’s wait until somebody makes a specific and serious claim of how challenging it would be, with justification, before we jump to compromising our language evolution based on it. I’m not even sure yet that this is really a consideration by anyone involved. > On 1 May 2023, at 18:41, Mick Semb Wever wrote: > > >> >> > But suggesting that Jonathan should work on implementing general purpose >> > arrays seems to fall outside the scope of this discussion, since the >> > result of such work wouldn't even fill the need Jonathan is targeting for >> > here. >> >> Every comment I have made so far I have argued that the v1 work doesn’t need >> to do some things, but that the limitations proposed so far are not real >> requirements; there is a big difference between what “could be allowed” and >> what is implemented day one… I am pushing back on what “could be allowed”, >> so far every justification has been that it slows down the ANN work… >> >> Simple examples of this already exists in C* (every example could be >> enhanced logically, we just have yet to put in the work) >> >> * updating a element of a list is only allowed for multi-cell >> * appending to a list is only allowed for multi-cell >> * etc. >> >> By saying that the type "shall not support", you actively block future work >> and future possibilities... > > > > I am coming around strongly to the `VECTOR FLOAT[n]` option. > > This gives Jonathan the simplest path right now with ths ANN work, while also > ensuring the CQL API gets the best future potential. > > With `VECTOR FLOAT[n]` the 'vector' is the ml sugar that means non-null and > frozen, and that allows both today and in the future, as desired, for its > implementation to be entirely different to `FLOAT[n]`. This addresses a > number of people's concerns that we meet ML's idioms head on. > > IMHO it feels like it will fit into the ideal future CQL , where all > `primitive[N]` are implemented, and where we have VECTOR FLOAT[n] (and maybe > VECTOR BYTE[n]). This will also permit in the future `FROZEN` > if we wanted nulls in frozen arrays. > > I think it is totally reasonable that the ANN patch (and Jonathan) is not > asked to implement on top of, or towards, other array (or other) new data > types. > > I also think it is correct that we think about the evolution of CQL's API, > and how it might exist in the future when we have both ml vectors and general > use arrays.
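A sketch of the equivalence Mick describes, with VECTOR FLOAT[n] as ML-facing sugar whose semantics (frozen, fixed length, non-null elements) could also be spelled in a plumbing form. Both spellings are proposals from this thread rather than shipped syntax, and the exact plumbing spelling is approximated here:

-- Porcelain form aimed at ML users (hypothetical)
CREATE TABLE embeddings_a (id uuid PRIMARY KEY, v VECTOR FLOAT[1536]);

-- Roughly equivalent plumbing form (hypothetical), with the constraints made explicit
CREATE TABLE embeddings_b (id uuid PRIMARY KEY, v NON-NULL FROZEN<FLOAT[1536]>);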
Re: [DISCUSS] New data type for vector search
I have explained repeatedly why I am opposed to ML-specific data types. If we want to make an ML-specific data type, it should be in an ML plug-in. We should not pollute the general purpose language with hastily-considered features that target specific bandwagons - at best partially - no matter how exciting the bandwagon.I think a simple and easy case can be made for fixed length array types that do not seem to create random bits of cruft in the language that dangle by themselves should this play not pan out. This is an easy way for this effort to make progress without negatively impacting the language.That is, unless we want to start supporting totally random types for every use case at the top level language layer. I don’t think this is a good idea, personally, and I’m quite confident we would now be regretting this approach had it been taken for earlier bandwagons.Nor do I think anyone’s priors about how successful this effort will be should matter. As a matter of principle, we should simply never deliver a specialist functionality as a high level CQL language feature without at least baking it for several years as a plug-in.On 1 May 2023, at 21:03, Mick Semb Wever wrote:Yes! What you (David) and Benedict write beautifully supports `VECTOR FLOAT[n]` imho.You are definitely bringing up valid implementation details, and that can be dealt with during patch review. This thread is about the CQL API addition. No matter which way the technical review goes with the implementation details, `VECTOR FLOAT[n]` does not limit it, and gives us the most ML idiomatic approach and the best long-term CQL API. It's a win-win situation – no matter how you look at it imho it is the best solution api wise. Unless the suggestion is that an ideal implementation can give us a better CQL API – but I don't see what that could be. Maybe the suggestion is we deny the possibility of using the VECTOR keyword and bring us back to something like `NON-NULL FROZEN`. This is odd to me because `VECTOR` here can be just an alias for `NON-NULL FROZEN` while meeting the patch's audience and their idioms. I have no problems with introducing such an alias to meet the ML crowd.Another way I think of this is `VECTOR FLOAT[n]` is the porcelain ML cql api, `NON-NULL FROZEN` and `FROZEN` and `FLOAT[n]` are the general-use plumbing cql apis. This would allow implementation details to be moved out of this thread and to the review phase.On Mon, 1 May 2023 at 20:57, David Capwell <dcapw...@apple.com> wrote:> I think it is totally reasonable that the ANN patch (and Jonathan) is not asked to implement on top of, or towards, other array (or other) new data types. This impacts serialization, if you do not think about this day 1 you then can’t add later on without having to worry about migration and versioning… Honestly I wanted to better understand the cost to be generic and the impact to ANN, so I took https://github.com/jbellis/cassandra/blob/vsearch/src/java/org/apache/cassandra/db/marshal/VectorType.java and made it handle every requirement I have listed so far (size, null, all types)… the current patch has several bugs at the type level that would need to be fixed, so had to fix those as well…. Total time to do this was 10 minutes… and this includes adding a method "public float[] composeAsFloats(ByteBuffer bytes)” which made the change to existing logic small (change VectorType.Serializer.instance.deserialize(buffer) to type.composeAsFloats(buffer))…. Did this have any impact to the final ByteBuffer? 
Nope, it had identical layout for the FloatType case, but works for all types…. I didn’t change the fact we store the size (felt this could be removed, but then we could never support expanding the vector in the future…) So, given the fact it takes a few minutes to implement all these requirements, I do find it very reasonable to push back and say we should make sure the new type is not leaking details from a special ANN index…. We have spent more time debating this than it takes to support… we also have fuzz testing on trunk so just updating org.apache.cassandra.utils.AbstractTypeGenerators to know about this new type means we get type coverage as well… I have zero issues helping to review this patch and make sure the testing is on-par with existing types (this is a strong requirement for me) > On May 1, 2023, at 10:40 AM, Mick Semb Wever <m...@apache.org> wrote: > > > > But suggesting that Jonathan should work on implementing general purpose arrays seems to fall outside the scope of this discussion, since the result of such work wouldn't even fill the need Jonathan is targeting for here. > > Every comment I have made so far I have argued that the v1 work doesn’t need to do some things, but that the limitations proposed so far are not real requirements; there is a big difference between what “could be allowed” and what is implemented day one… I am pushing back on what
Re: [DISCUSS] New data type for vector search
A data type plug-in is actually really easy today, I think? But, developing further hooks should probably be thought through as they’re necessary. I think in this case it would be simpler to deliver a general purpose type, which is why I’m trying to propose types that would be acceptable.I also think we’re pretty close to agreement, really?But if not, let’s flesh out potential plug-in requirements.On 1 May 2023, at 21:58, Josh McKenzie wrote:If we want to make an ML-specific data type, it should be in an ML plug-in.How can we encourage a healthier plug-in ecosystem? As far as I know it's been pretty anemic historically:cassandra: https://cassandra.apache.org/doc/latest/cassandra/plugins/index.htmlpostgres: https://www.postgresql.org/docs/current/contrib.htmlI'm really interested to hear if there's more in the ecosystem I'm not aware of or if there's been strides made in this regard; users in the ecosystem being able to write durable extensions to Cassandra that they can then distribute and gain momentum could potentially be a great incubator for new features or functionality in the ecosystem.If our support for extensions remains as bare as I believe it to be, I wouldn't recommend anyone go that route.On Mon, May 1, 2023, at 4:17 PM, Benedict wrote:I have explained repeatedly why I am opposed to ML-specific data types. If we want to make an ML-specific data type, it should be in an ML plug-in. We should not pollute the general purpose language with hastily-considered features that target specific bandwagons - at best partially - no matter how exciting the bandwagon.I think a simple and easy case can be made for fixed length array types that do not seem to create random bits of cruft in the language that dangle by themselves should this play not pan out. This is an easy way for this effort to make progress without negatively impacting the language.That is, unless we want to start supporting totally random types for every use case at the top level language layer. I don’t think this is a good idea, personally, and I’m quite confident we would now be regretting this approach had it been taken for earlier bandwagons.Nor do I think anyone’s priors about how successful this effort will be should matter. As a matter of principle, we should simply never deliver a specialist functionality as a high level CQL language feature without at least baking it for several years as a plug-in.On 1 May 2023, at 21:03, Mick Semb Wever wrote:Yes! What you (David) and Benedict write beautifully supports `VECTOR FLOAT[n]` imho.You are definitely bringing up valid implementation details, and that can be dealt with during patch review. This thread is about the CQL API addition. No matter which way the technical review goes with the implementation details, `VECTOR FLOAT[n]` does not limit it, and gives us the most ML idiomatic approach and the best long-term CQL API. It's a win-win situation – no matter how you look at it imho it is the best solution api wise. Unless the suggestion is that an ideal implementation can give us a better CQL API – but I don't see what that could be. Maybe the suggestion is we deny the possibility of using the VECTOR keyword and bring us back to something like `NON-NULL FROZEN`. This is odd to me because `VECTOR` here can be just an alias for `NON-NULL FROZEN` while meeting the patch's audience and their idioms. 
I have no problems with introducing such an alias to meet the ML crowd.Another way I think of this is `VECTOR FLOAT[n]` is the porcelain ML cql api, `NON-NULL FROZEN` and `FROZEN` and `FLOAT[n]` are the general-use plumbing cql apis. This would allow implementation details to be moved out of this thread and to the review phase.On Mon, 1 May 2023 at 20:57, David Capwell <dcapw...@apple.com> wrote:> I think it is totally reasonable that the ANN patch (and Jonathan) is not asked to implement on top of, or towards, other array (or other) new data types. This impacts serialization, if you do not think about this day 1 you then can’t add later on without having to worry about migration and versioning… Honestly I wanted to better understand the cost to be generic and the impact to ANN, so I took https://github.com/jbellis/cassandra/blob/vsearch/src/java/org/apache/cassandra/db/marshal/VectorType.java and made it handle every requirement I have listed so far (size, null, all types)… the current patch has several bugs at the type level that would need to be fixed, so had to fix those as well…. Total time to do this was 10 minutes… and this includes adding a method "public float[] composeAsFloats(ByteBuffer bytes)” which made the change to existing logic small (change VectorType.Serializer.instance.deserialize(buffer) to type.composeAsFloats(buffer))…. Did this have any impact to the final ByteBuffer? Nope, it had identical layout for the FloatType case, but works for all types…. I didn’t change the fact we store the size (
Re: [DISCUSS] New data type for vector search
If we agree we’re delivering some general purpose array type, that supports all types as elements (ie, is logicaly equivalent to a frozen list of fixed length, however it is actually implemented), I think we are in technical agreement and it’s just a matter of presentation.At which point I think we should simply collect the possible syntax options and put them to a poll. I’m not keen on vector for previously stated reasons, but it’s probably not worth litigating further and we should let the silent majority adjudicate.On 2 May 2023, at 12:43, Jonathan Ellis wrote:To make sure I understand correctly -- are you saying that you're fine with a vector type, but you want to see it implemented as a special case of arrays, or that you are not fine with a vector type because you would prefer to only add arrays and that should be "good enough" for ML?On Mon, May 1, 2023 at 4:27 PM Benedict <bened...@apache.org> wrote:A data type plug-in is actually really easy today, I think? But, developing further hooks should probably be thought through as they’re necessary. I think in this case it would be simpler to deliver a general purpose type, which is why I’m trying to propose types that would be acceptable.I also think we’re pretty close to agreement, really?But if not, let’s flesh out potential plug-in requirements.On 1 May 2023, at 21:58, Josh McKenzie <jmcken...@apache.org> wrote:If we want to make an ML-specific data type, it should be in an ML plug-in.How can we encourage a healthier plug-in ecosystem? As far as I know it's been pretty anemic historically:cassandra: https://cassandra.apache.org/doc/latest/cassandra/plugins/index.htmlpostgres: https://www.postgresql.org/docs/current/contrib.htmlI'm really interested to hear if there's more in the ecosystem I'm not aware of or if there's been strides made in this regard; users in the ecosystem being able to write durable extensions to Cassandra that they can then distribute and gain momentum could potentially be a great incubator for new features or functionality in the ecosystem.If our support for extensions remains as bare as I believe it to be, I wouldn't recommend anyone go that route.On Mon, May 1, 2023, at 4:17 PM, Benedict wrote:I have explained repeatedly why I am opposed to ML-specific data types. If we want to make an ML-specific data type, it should be in an ML plug-in. We should not pollute the general purpose language with hastily-considered features that target specific bandwagons - at best partially - no matter how exciting the bandwagon.I think a simple and easy case can be made for fixed length array types that do not seem to create random bits of cruft in the language that dangle by themselves should this play not pan out. This is an easy way for this effort to make progress without negatively impacting the language.That is, unless we want to start supporting totally random types for every use case at the top level language layer. I don’t think this is a good idea, personally, and I’m quite confident we would now be regretting this approach had it been taken for earlier bandwagons.Nor do I think anyone’s priors about how successful this effort will be should matter. As a matter of principle, we should simply never deliver a specialist functionality as a high level CQL language feature without at least baking it for several years as a plug-in.On 1 May 2023, at 21:03, Mick Semb Wever <m...@apache.org> wrote:Yes! 
What you (David) and Benedict write beautifully supports `VECTOR FLOAT[n]` imho.You are definitely bringing up valid implementation details, and that can be dealt with during patch review. This thread is about the CQL API addition. No matter which way the technical review goes with the implementation details, `VECTOR FLOAT[n]` does not limit it, and gives us the most ML idiomatic approach and the best long-term CQL API. It's a win-win situation – no matter how you look at it imho it is the best solution api wise. Unless the suggestion is that an ideal implementation can give us a better CQL API – but I don't see what that could be. Maybe the suggestion is we deny the possibility of using the VECTOR keyword and bring us back to something like `NON-NULL FROZEN`. This is odd to me because `VECTOR` here can be just an alias for `NON-NULL FROZEN` while meeting the patch's audience and their idioms. I have no problems with introducing such an alias to meet the ML crowd.Another way I think of this is `VECTOR FLOAT[n]` is the porcelain ML cql api, `NON-NULL FROZEN` and `FROZEN` and `FLOAT[n]` are the general-use plumbing cql apis. This would allow implementation details to be moved out of this thread and to the review phase.On Mon, 1 May 2023 at 20:57, David Capwell <dcapw...@apple.com> wrote:> I think it is totally reasonable that the ANN patch (and Jonathan) is not asked to implement on top of, or towards, other array (or other) new data types. This
Re: [POLL] Vector type for ML
This is not the poll I thought we would be conducting, and I don’t really support its framing. There are two parallel questions: what the functionality should be and how they should be exposed. This poll compresses the optionality poorly.Whether or not we support a “vector” concept (or something isomorphic with it), the first question this poll wants to answer is:A) Should we introduce a new CQL collection type that is unique to ML and *only* supports float32B) Should we introduce a type that is general purpose, and supports all Cassandra types, so that this may be used to support ML (and perhaps other) workloadsC) Should we not introduce new types to CQL at allFor this question, I vote B only.Once this question is answered it makes sense to answer how it will be exposed semantically/syntactically. On 2 May 2023, at 16:43, Jonathan Ellis wrote:My preference: A > B > C. Vectors are distinct enough from arrays that we should not make adding the latter a prerequisite for adding the former.On Tue, May 2, 2023 at 10:13 AM Jonathan Elliswrote:Should we add a vector type to Cassandra designed to meet the needs of machine learning use cases, specifically feature and embedding vectors for training, inference, and vector search? ML vectors are fixed-dimension (fixed-length) sequences of numeric types, with no nulls allowed, and with no need for random access. The ML industry overwhelmingly uses float32 vectors, to the point that the industry-leading special-purpose vector database ONLY supports that data type.This poll is to gauge consensus subsequent to the recent discussion thread at https://lists.apache.org/thread/0lj1nk9jbhkf1rlgqcvxqzfyntdjrnk0.Please rank the discussed options from most preferred option to least, e.g., A > B > C (A is my preference, followed by B, followed by C) or C > B = A (C is my preference, followed by B or A approximately equally.)(A) I am in favor of adding a vector type for floats; I do not believe we need to tie it to any particular implementation details.(B) I am okay with adding a vector type but I believe we must add array types that compose with all Cassandra types first, and make vectors a special case of arrays-without-null-elements.(C) I am not in favor of adding a built-in vector type.-- Jonathan Ellisco-founder, http://www.datastax.com@spyced -- Jonathan Ellisco-founder, http://www.datastax.com@spyced
Re: [POLL] Vector type for ML
Could folk voting against a general purpose type (that could well be called a vector) briefly explain their reasoning?We established in the other thread that it’s technically trivial, meaning folk must think it is strictly superior to only support float rather than eg all numeric types (note: for the type, not the ANN). I am surprised, and the blurbs accompanying votes so far don’t seem to touch on this, mostly just endorsing the idea of a vector.On 2 May 2023, at 20:20, Patrick McFadin wrote:A > B > C on both polls. Having talked to several users in the community that are highly excited about this change, this gets to what developers want to do at Cassandra scale: store embeddings and retrieve them. On Tue, May 2, 2023 at 11:47 AM Andrés de la Peñawrote:A > B > CI don't think that ML is such a niche application that it can't have its own CQL data type. Also, vectors are mathematical elements that have more applications that ML.On Tue, 2 May 2023 at 19:15, Mick Semb Wever wrote:On Tue, 2 May 2023 at 17:14, Jonathan Ellis wrote:Should we add a vector type to Cassandra designed to meet the needs of machine learning use cases, specifically feature and embedding vectors for training, inference, and vector search? ML vectors are fixed-dimension (fixed-length) sequences of numeric types, with no nulls allowed, and with no need for random access. The ML industry overwhelmingly uses float32 vectors, to the point that the industry-leading special-purpose vector database ONLY supports that data type.This poll is to gauge consensus subsequent to the recent discussion thread at https://lists.apache.org/thread/0lj1nk9jbhkf1rlgqcvxqzfyntdjrnk0.Please rank the discussed options from most preferred option to least, e.g., A > B > C (A is my preference, followed by B, followed by C) or C > B = A (C is my preference, followed by B or A approximately equally.)(A) I am in favor of adding a vector type for floats; I do not believe we need to tie it to any particular implementation details.(B) I am okay with adding a vector type but I believe we must add array types that compose with all Cassandra types first, and make vectors a special case of arrays-without-null-elements.(C) I am not in favor of adding a built-in vector type.A > B > CB is stated as "must add array types…". I think this is a bit loaded. If B was the (A + the implementation needs to be a non-null frozen float32 array, serialisation forward compatible with other frozen arrays later implemented) I would put this before (A). Especially because it's been shown already this is easy to implement.
Re: [POLL] Vector type for ML
But it’s so trivial it was already implemented by David in the span of ten minutes? If anything, we’re slowing progress down by refusing to do the extra types, as we’re busy arguing about it rather than delivering a feature?FWIW, my interpretation of the votes today is that we SHOULD NOT (ever) support types beyond float. Not that we should start with float.So, this whole debate is a mess, I think. But hey ho.On 2 May 2023, at 20:57, Patrick McFadin wrote:I'll speak up on that one. If you look at my ranked voting, that is where my head is. I get accused of scope creep (a lot) and looking at the initial proposal Jonathan put on the ML it was mostly "Developers are adopting vector search at a furious pace and I think I have a simple way of adding support to keep Cassandra relevant for these use cases" Instead of just focusing on this use case, I feel the arguments have bike shedded into scope creep which means it will take forever to get into the project.My preference is to see one thing validated with an MVP and get it into the hands of developers sooner so we can continue to iterate based on actual usage. It doesn't say your points are wrong or your opinions are broken, I'm voting for what I think will be awesome for users sooner. PatrickOn Tue, May 2, 2023 at 12:29 PM Benedict <bened...@apache.org> wrote:Could folk voting against a general purpose type (that could well be called a vector) briefly explain their reasoning?We established in the other thread that it’s technically trivial, meaning folk must think it is strictly superior to only support float rather than eg all numeric types (note: for the type, not the ANN). I am surprised, and the blurbs accompanying votes so far don’t seem to touch on this, mostly just endorsing the idea of a vector.On 2 May 2023, at 20:20, Patrick McFadin <pmcfa...@gmail.com> wrote:A > B > C on both polls. Having talked to several users in the community that are highly excited about this change, this gets to what developers want to do at Cassandra scale: store embeddings and retrieve them. On Tue, May 2, 2023 at 11:47 AM Andrés de la Peña <adelap...@apache.org> wrote:A > B > CI don't think that ML is such a niche application that it can't have its own CQL data type. Also, vectors are mathematical elements that have more applications that ML.On Tue, 2 May 2023 at 19:15, Mick Semb Wever <m...@apache.org> wrote:On Tue, 2 May 2023 at 17:14, Jonathan Ellis <jbel...@gmail.com> wrote:Should we add a vector type to Cassandra designed to meet the needs of machine learning use cases, specifically feature and embedding vectors for training, inference, and vector search? ML vectors are fixed-dimension (fixed-length) sequences of numeric types, with no nulls allowed, and with no need for random access. 
Re: [POLL] Vector type for ML
Hurrah for initial agreement.

For syntax, I think one option was just FLOAT[N]. In VECTOR FLOAT[N], VECTOR is redundant - FLOAT[N] is fully descriptive by itself. I don’t think VECTOR should be used to simply imply non-null, as this would be very unintuitive. More logical would be NONNULL, if this is the only condition being applied. Alternatively for arrays we could default to NONNULL and later introduce NULLABLE if we want to permit nulls. If the word vector is to be used, it makes more sense to make it look like a list, so VECTOR as here the word VECTOR is clearly not redundant.

So, I vote:
1) (NON NULL) FLOAT[N]
2) FLOAT[N] (non null by default)
3) VECTOR

> On 4 May 2023, at 08:52, Mick Semb Wever wrote:
>
>>> Did we agree on a CQL syntax?
>> I don’t believe there has been a poll on CQL syntax… my understanding reading all the threads is that there are ~4-5 options and none are -1ed, so believe we are waiting for majority rule on this?
>
> Re-reading that thread, IIUC the valid choices remaining are…
>
> 1. VECTOR FLOAT[n]
> 2. FLOAT VECTOR[n]
> 3. VECTOR
> 4. VECTOR[n]
> 5. ARRAY
> 6. NON-NULL FROZEN
>
> Yes I'm putting my preference (1) first ;) because (banging on) if the future of CQL will have FLOAT[n] and FROZEN, where the VECTOR keyword is: for general CQL users, just meaning "non-null and frozen", these gel best together.
>
> Options (5) and (6) are for those that feel we can and should provide this type without introducing the vector keyword.
Re: [POLL] Vector type for ML
I would expect that the type of index would be specified anyway?I don’t think it’s good API design to have the field define the index you create - only to shape what is permitted.A HNSW index is very specific and should be asked for specifically, not implicitly, IMO.On 4 May 2023, at 11:47, Mike Adamson wrote:For syntax, I think one option was just FLOAT[N]. In VECTOR FLOAT[N], VECTOR is redundant - FLOAT[N] is fully descriptive by itself. I don’t think VECTOR should be used to simply imply non-null, as this would be very unintuitive. More logical would be NONNULL, if this is the only condition being applied. Alternatively for arrays we could default to NONNULL and later introduce NULLABLE if we want to permit nulls.I have a small issue relating to not having a specific VECTOR tag on the data type. The driver behind adding this datatype is the hnsw index that is being added to consume this data. If we have a generic array datatype, what is the expectation going to be for users who create an index on it? The hnsw index will support only floats initially so we would have to reject any non-float arrays if an attempt was made to create an hnsw index on it. While there is no problem with doing this, there would be a problem if, in the future, we allow indexing in arrays in the same way that we index collections. In this case we would then need to have the user select what type of index they want at creation time.Can I add another proposal that we allow a VECTOR or DENSE (this is a well known term in the ML space) keyword that could be used when the array is going to be used for ML workloads. This would be optional and would function similarly to FROZEN in that it would limit the functionality of the array to ML usage. On Thu, 4 May 2023 at 09:45, Benedict <bened...@apache.org> wrote:Hurrah for initial agreement.For syntax, I think one option was just FLOAT[N]. In VECTOR FLOAT[N], VECTOR is redundant - FLOAT[N] is fully descriptive by itself. I don’t think VECTOR should be used to simply imply non-null, as this would be very unintuitive. More logical would be NONNULL, if this is the only condition being applied. Alternatively for arrays we could default to NONNULL and later introduce NULLABLE if we want to permit nulls.If the word vector is to be used it makes more sense to make it look like a list, so VECTOR as here the word VECTOR is clearly not redundant.So, I vote:1) (NON NULL) FLOAT[N]2) FLOAT[N] (Non null by default)3) VECTOROn 4 May 2023, at 08:52, Mick Semb Wever <m...@apache.org> wrote:Did we agree on a CQL syntax?I don’t believe there has been a pool on CQL syntax… my understanding reading all the threads is that there are ~4-5 options and non are -1ed, so believe we are waiting for majority rule on this?Re-reading that thread, IIUC the valid choices remaining are…1. VECTOR FLOAT[n]2. FLOAT VECTOR[n]3. VECTOR4. VECTOR[n]5. ARRAY6. NON-NULL FROZENYes I'm putting my preference (1) first ;) because (banging on) if the future of CQL will have FLOAT[n] and FROZEN, where the VECTOR keyword is: for general cql users; just meaning "non-null and frozen", these gel best together.Options (5) and (6) are for those that feel we can and should provide this type without introducing the vector keyword. -- Mike AdamsonEngineering+1 650 389 6000 | datastax.comFind DataStax Online:
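For context on the indexing concern above, attaching an SAI index today uses the CREATE CUSTOM INDEX form quoted later in this thread; the keyspace, table, and column names below are illustrative assumptions only.

CREATE CUSTOM INDEX embedding_idx ON ks.embeddings(embedding)
USING 'StorageAttachedIndex';

-- If a generic array column were indexable this way, the index would have to reject
-- non-float element types until an ANN/HNSW backend supports them; the proposed
-- VECTOR/DENSE qualifier is meant to remove that ambiguity at type-declaration time.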
Re: CEP-30: Approximate Nearest Neighbor(ANN) Vector Search via Storage-Attached Indexes
HNSW can in principle be made into a distributed index. But that would be quite a different paradigm to SAI.On 9 May 2023, at 19:30, Patrick McFadin wrote:Under the goals section, there is this line:Scatter/gather across replicas, combining topK from each to get global topK.But what I'm hearing is, exactly how will that happen? Maybe this is an SAI question too. How is that verified in SAI?On Tue, May 9, 2023 at 11:07 AM David Capwellwrote:Approach section doesn’t go over how this will handle cross replica search, this would be good to flesh out… given results have a real ranking, the current 2i logic may yield incorrect results… so would think we need num_ranges / rf queries in the best case, with some new capability to sort the results? If my assumption is correct, then how errors are handled should also be fleshed out… Example: 1k cluster without vnode and RF=3, so 333 queries fanned out to match, then coordinator needs to sort… if 1 of the queries fails and can’t fall back to peers… does the query fail (I assume so)?On May 8, 2023, at 7:20 PM, Jonathan Ellis wrote:Hi all,Following the recent discussion threads, I would like to propose CEP-30 to add Approximate Nearest Neighbor (ANN) Vector Search via Storage-Attached Indexes (SAI) to Apache Cassandra.The primary goal of this proposal is to implement ANN vector search capabilities, making Cassandra more useful to AI developers and organizations managing large datasets that can benefit from fast similarity search.The implementation will leverage Lucene's Hierarchical Navigable Small World (HNSW) library and introduce a new CQL data type for vector embeddings, a new SAI index for ANN search functionality, and a new CQL operator for performing ANN search queries.We are targeting the 5.0 release for this feature, in conjunction with the release of SAI. The proposed changes will maintain compatibility with existing Cassandra functionality and compose well with the already-approved SAI features.Please find the full CEP document here: https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes-- Jonathan Ellisco-founder, http://www.datastax.com@spyced
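The CEP only states that a new CQL operator for ANN queries will be added, without fixing its spelling. Purely to illustrate the scatter/gather topK question raised here, a query of roughly this shape is what a coordinator would fan out to replicas and merge; the operator name, table, and literal below are hypothetical.

SELECT id, content
FROM ks.documents
ORDER BY embedding ANN OF [0.12, 0.33, 0.27]  -- hypothetical operator spelling
LIMIT 10;                                     -- the global topK that must be merged across replicas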
Re: [DISCUSS] The future of CREATE INDEX
I’m not convinced by the changing defaults argument here. The characteristics of the two index types are very different, and users with scripts that make indexes today shouldn’t have their behaviour change.We could introduce new syntax that properly appreciates there’s no default index, perhaps CREATE LOCAL [type] INDEX? To also make clear that these indexes involve a partition key or scatter gatherOn 10 May 2023, at 06:26, guo Maxwell wrote:+1 , as we must Improve the image of your own default indexing ability.and As for CREATE CUSTOM INDEX , should we just left as it is and we can disable the ability for create SAI through CREATE CUSTOM INDEX in some version after 5.0? for as I know there may be users using this as a plugin-index interface, like https://github.com/Stratio/cassandra-lucene-index (though these project may be inactive, But if someone wants to do something similar in the future, we don't have to stop).Jonathan Ellis于2023年5月10日周三 10:01写道:+1 for this, especially in the long term. CREATE INDEX should do the right thing for most people without requiring extra ceremony.On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan wrote:If the consensus is that SAI is the right default index, then we should just change CREATE INDEX to be SAI, and legacy 2i to be a CUSTOM INDEX.On May 9, 2023, at 4:44 PM, Caleb Rackliffe wrote:Earlier today, Mick started a thread on the future of our index creation DDL on Slack:https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019At the moment, there are two ways to create a secondary index.1.) CREATE INDEX [IF NOT EXISTS] [name] ON ()This creates an optionally named legacy 2i on the provided table and column. ex. CREATE INDEX my_index ON kd.tbl(my_text_col)2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON () USING [WITH OPTIONS = ]This creates a secondary index on the provided table and column using the specified 2i implementation class and (optional) parameters. ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING 'StorageAttachedIndex'(Note that the work on SAI added aliasing, so `StorageAttachedIndex` is shorthand for the fully-qualified class name, which is also valid.)So what is there to discuss?The concern Mick raised is..."...just folk continuing to use CREATE INDEX because they think CREATE CUSTOM INDEX is advanced (or just don't know of it), and we leave users doing 2i (when they think they are, and/or we definitely want them to be, using SAI)"To paraphrase, we want people to use SAI once it's available where possible, and the default behavior of CREATE INDEX could be at odds w/ that.The proposal we seem to have landed on is something like the following:For 5.0:1.) Disable by default the creation of new legacy 2i via CREATE INDEX.2.) Leave CREATE CUSTOM INDEX...USING... available by default.(Note: How this would interact w/ the existing secondary_indexes_enabled YAML options isn't clear yet.)Post-5.0:1.) Deprecate and eventually remove SASI when SAI hits full feature parity w/ it.2.) Replace both CREATE INDEX and CREATE CUSTOM INDEX w/ something of a hybrid between the two. For example, CREATE INDEX...USING...WITH. This would both be flexible enough to accommodate index implementation selection and prescriptive enough to force the user to make a decision (and wouldn't change the legacy behavior of the existing CREATE INDEX). In this world, creating a legacy 2i might look something like CREATE INDEX...USING `legacy`.3.) 
Eventually deprecate CREATE CUSTOM INDEX...USING.Eventually we would have a single enabled DDL statement for index creation that would be minimal but also explicit/able to handle some evolution.What does everyone think? -- Jonathan Ellisco-founder, http://www.datastax.com@spyced -- you are the apple of my eye !
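Pulling the concrete examples from this proposal into one place (the keyspace, table, and column names are the ones used in the email; the final statement is only the 'legacy' sketch floated above, not settled syntax):

-- 1.) legacy 2i, optionally named
CREATE INDEX my_index ON ks.tbl(my_text_col);

-- 2.) custom index with an explicit implementation (SAI shown via its alias)
CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING 'StorageAttachedIndex';

-- post-5.0 idea: a single statement that forces an implementation choice
CREATE INDEX my_index ON ks.tbl(my_text_col) USING 'legacy';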
Re: [DISCUSS] The future of CREATE INDEX
This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.I think we should deprecate CREATE INDEX, and introduce new syntax CREATE LOCAL INDEX to make clear that this is not a global index, and that this should require the USING syntax to avoid this problem in future. We should report warnings to the client when CREATE INDEX is used, indicating it is deprecated.On 12 May 2023, at 13:10, Mick Semb Wever wrote:On Thu, 11 May 2023 at 05:27, Patrick McFadinwrote:Having pulled a lot of developers out of the 2i fire,Yes. I'm keen not to leave 2i as the default once SAI lands. Otherwise I agree with the deprecated first principle, but 2i is just too problematic. Just having no default in 5.0, forcing the user to evaluate which index to use would be an improvement.For example, if the default index in cassandra.yaml option exists but is commented out, that would prevent `CREATE INDEX` from working without specifying a `USING`. Then the yaml documentation would be clear about choices. I'd be ok with that for 5.0, and then make sai the default in the following release.Note, having the option in cassandra.yaml is problematic, as this is not a per-node setting (AFAIK).
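As a sketch of the deprecation path proposed here (hypothetical: neither the LOCAL keyword nor a mandatory USING clause was agreed):

-- proposed: locality is explicit and an implementation must be named
CREATE LOCAL INDEX my_index ON ks.tbl(my_text_col) USING 'StorageAttachedIndex';

-- the old form keeps working for now, but returns a client warning marking it deprecated
CREATE INDEX my_index ON ks.tbl(my_text_col);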
Re: [DISCUSS] The future of CREATE INDEX
if we didn't have copious amounts of (not all public, I know, working on it) evidenceIf that’s the assumption on which this proposal is based, let’s discuss the evidence base first, as given the fundamentally different way they work (almost diametrically opposite), I would want to see a very high quality of evidence to support the claim.I don’t think we can resolve this conversation effectively until this question is settled.On 12 May 2023, at 16:19, Caleb Rackliffe wrote:> This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.We wouldn't have even advanced it to this point if we didn't have copious amounts of (not all public, I know, working on it) evidence it did for the vast majority of workloads. Having said that, I don't strongly agree that we should make it the default in 5.0, because performance isn't the only concern. (correctness, DDL back-compat, which we've sort of touched w/ the YAML default option, etc.)This conversation is now going in like 3 different directions, or at least 3 different "packages" of ideas, so there isn't even a single thing to vote on. Let me read through again and try to distill into something that we might be able to do so with...On Fri, May 12, 2023 at 7:56 AM Aleksey Yeshchenko <alek...@apple.com> wrote:This.I would also consider adding CREATE LEGACY INDEX syntax as an alias for today’s CREATE INDEX, the latter to be deprecated and (in very distant future) removed.On 12 May 2023, at 13:14, Benedict <bened...@apache.org> wrote:This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.I think we should deprecate CREATE INDEX, and introduce new syntax CREATE LOCAL INDEX to make clear that this is not a global index, and that this should require the USING syntax to avoid this problem in future. We should report warnings to the client when CREATE INDEX is used, indicating it is deprecated.
Re: [DISCUSS] The future of CREATE INDEX
A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use the word LOCAL, but I think that’s anyway an improvement, personally. In future we will hopefully offer GLOBAL indexes, and IMO it is better to reify the distinction in the syntax.On 12 May 2023, at 17:29, Caleb Rackliffe wrote:We don't need to know everything about SAI's performance profile to plan and execute some small, reasonable things now for 5.0. I'm going to try to summarize the least controversial package of ideas from the discussion above. I've left out creating any new syntax. For example, I think CREATE LOCAL INDEX, while explicit, is just not necessary. We don't use CREATE LOCAL TABLE, although it has the same locality as our indexes.Okay, so the proposal for 5.0...1.) Add a YAML option that specifies a default implementation for CREATE INDEX, and make this the legacy 2i for now. No existing DDL breaks. We don't have to commit to the absolute superiority of SAI.2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to market using CREATE CUSTOM INDEX, which feels...not so polished. (The backend for this already exists w/ CREATE CUSTOM INDEX.)3.) Leave in place but deprecate (client warnings could work?) CREATE CUSTOM INDEX. Support the syntax for the foreseeable future.Can we live w/ this?I don't think any information about SAI we could possibly acquire before a 5.0 release would affect the reasonableness of this much.On Fri, May 12, 2023 at 10:54 AM Benedict <bened...@apache.org> wrote:if we didn't have copious amounts of (not all public, I know, working on it) evidenceIf that’s the assumption on which this proposal is based, let’s discuss the evidence base first, as given the fundamentally different way they work (almost diametrically opposite), I would want to see a very high quality of evidence to support the claim.I don’t think we can resolve this conversation effectively until this question is settled.On 12 May 2023, at 16:19, Caleb Rackliffe <calebrackli...@gmail.com> wrote:> This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.We wouldn't have even advanced it to this point if we didn't have copious amounts of (not all public, I know, working on it) evidence it did for the vast majority of workloads. Having said that, I don't strongly agree that we should make it the default in 5.0, because performance isn't the only concern. (correctness, DDL back-compat, which we've sort of touched w/ the YAML default option, etc.)This conversation is now going in like 3 different directions, or at least 3 different "packages" of ideas, so there isn't even a single thing to vote on. 
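Item 2 of this package would, roughly, let CREATE INDEX carry the implementation selection that only CREATE CUSTOM INDEX supports today; a hypothetical sketch, with the option map shown only for illustration:

-- proposed: implementation and options directly on CREATE INDEX
CREATE INDEX my_index ON ks.tbl(my_text_col)
USING 'StorageAttachedIndex' WITH OPTIONS = {'case_sensitive': 'false'};

-- CREATE CUSTOM INDEX ... USING ... would remain available but deprecated with a client warning (item 3).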
Re: [DISCUSS] The future of CREATE INDEX
If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider?If not, then we probably can’t do the hard cutover and so the answer is still pretty simple? On 12 May 2023, at 18:04, Caleb Rackliffe wrote:I don't particularly like the YAML solution either, but absent that, we're back to fighting about whether we introduce entirely new syntax or hard cut over to SAI at some point.We already have per-node configuration in the YAML that determines whether or not we can create a 2i at all, right?What if we just do #2 and #3 and punt on everything else?On Fri, May 12, 2023 at 11:56 AM Benedict <bened...@apache.org> wrote:A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use the word LOCAL, but I think that’s anyway an improvement, personally. In future we will hopefully offer GLOBAL indexes, and IMO it is better to reify the distinction in the syntax.On 12 May 2023, at 17:29, Caleb Rackliffe <calebrackli...@gmail.com> wrote:We don't need to know everything about SAI's performance profile to plan and execute some small, reasonable things now for 5.0. I'm going to try to summarize the least controversial package of ideas from the discussion above. I've left out creating any new syntax. For example, I think CREATE LOCAL INDEX, while explicit, is just not necessary. We don't use CREATE LOCAL TABLE, although it has the same locality as our indexes.Okay, so the proposal for 5.0...1.) Add a YAML option that specifies a default implementation for CREATE INDEX, and make this the legacy 2i for now. No existing DDL breaks. We don't have to commit to the absolute superiority of SAI.2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to market using CREATE CUSTOM INDEX, which feels...not so polished. (The backend for this already exists w/ CREATE CUSTOM INDEX.)3.) Leave in place but deprecate (client warnings could work?) CREATE CUSTOM INDEX. Support the syntax for the foreseeable future.Can we live w/ this?I don't think any information about SAI we could possibly acquire before a 5.0 release would affect the reasonableness of this much.On Fri, May 12, 2023 at 10:54 AM Benedict <bened...@apache.org> wrote:if we didn't have copious amounts of (not all public, I know, working on it) evidenceIf that’s the assumption on which this proposal is based, let’s discuss the evidence base first, as given the fundamentally different way they work (almost diametrically opposite), I would want to see a very high quality of evidence to support the claim.I don’t think we can resolve this conversation effectively until this question is settled.On 12 May 2023, at 16:19, Caleb Rackliffe <calebrackli...@gmail.com> wrote:> This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.We wouldn't have even advanced it to this point if we didn't have copious amounts of (not all public, I know, working on it) evidence it did for the vast majority of workloads. Having said that, I don't strongly agree that we should make it the default in 5.0, because performance isn't the only concern. 
Re: [DISCUSS] The future of CREATE INDEX
I still prefer introducing CREATE LOCAL INDEX, to help users understand the semantics of the index they’re creating.I think it will in future potentially be quite confusing to be able to create global and local indexes using the same DDL statement.But, depending on appetite, that could plausibly be done in future instead.(I don’t endorse the assumption of a future switch of default)On 12 May 2023, at 18:18, Caleb Rackliffe wrote:So the weakest version of the plan that actually accomplishes something useful for 5.0:1.) Just leave the CREATE INDEX default alone for now. Hard switch the default after 5.0.2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to market using CREATE CUSTOM INDEX, which feels...not so polished. (The backend for this already exists w/ CREATE CUSTOM INDEX.)3.) Leave in place but deprecate (client warnings could work?) CREATE CUSTOM INDEX. Support the syntax for the foreseeable future.Any objections to that?On Fri, May 12, 2023 at 12:10 PM Caleb Rackliffe <calebrackli...@gmail.com> wrote:I don't want to cut over for 5.0 either way. I was more contrasting a configurable cutover in 5.0 vs. a hard cutover later.On Fri, May 12, 2023 at 12:09 PM Benedict <bened...@apache.org> wrote:If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider?If not, then we probably can’t do the hard cutover and so the answer is still pretty simple? On 12 May 2023, at 18:04, Caleb Rackliffe <calebrackli...@gmail.com> wrote:I don't particularly like the YAML solution either, but absent that, we're back to fighting about whether we introduce entirely new syntax or hard cut over to SAI at some point.We already have per-node configuration in the YAML that determines whether or not we can create a 2i at all, right?What if we just do #2 and #3 and punt on everything else?On Fri, May 12, 2023 at 11:56 AM Benedict <bened...@apache.org> wrote:A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use the word LOCAL, but I think that’s anyway an improvement, personally. In future we will hopefully offer GLOBAL indexes, and IMO it is better to reify the distinction in the syntax.On 12 May 2023, at 17:29, Caleb Rackliffe <calebrackli...@gmail.com> wrote:We don't need to know everything about SAI's performance profile to plan and execute some small, reasonable things now for 5.0. I'm going to try to summarize the least controversial package of ideas from the discussion above. I've left out creating any new syntax. For example, I think CREATE LOCAL INDEX, while explicit, is just not necessary. We don't use CREATE LOCAL TABLE, although it has the same locality as our indexes.Okay, so the proposal for 5.0...1.) Add a YAML option that specifies a default implementation for CREATE INDEX, and make this the legacy 2i for now. No existing DDL breaks. We don't have to commit to the absolute superiority of SAI.2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to market using CREATE CUSTOM INDEX, which feels...not so polished. (The backend for this already exists w/ CREATE CUSTOM INDEX.)3.) Leave in place but deprecate (client warnings could work?) CREATE CUSTOM INDEX. 
Re: [DISCUSS] The future of CREATE INDEX
I’m not convinced a default index makes any sense, no. The trade-offs in a distributed setting are much more pronounced.Indexes in a local-only RDBMS are much simpler affairs; the trade offs are much more subtle than here. On 12 May 2023, at 18:24, Caleb Rackliffe wrote:> Now, giving this thread, there is pushback for a config to allow default impl to change… but there is 0 pushback for new syntax to make this explicit…. So maybe we should [POLL] for what syntax people want?I think the essential question is whether we want the concept of a default index. If we do, we need to figure that out now. If we don't then a new syntax that forces it becomes interesting.Given it seems most DBs have a default index (see Postgres, etc.), I tend to lean toward having one, but that's me...On Fri, May 12, 2023 at 12:20 PM David Capwell <dcapw...@apple.com> wrote:I really dislike the idea of the same CQL doing different things based upon a per-node configuration.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. I am cool adding such a config, and also cool keeping CREATE INDEX disabled by default…. But would like to point out that we have many configs that impact CQL and they are almost always local configs…Is CREATE INDEX even allowed? This is a per node config. Right now you can block globally, enable on a single instance, create the index for your users, then revert the config change on the instance…. All guardrails that define what we can do are per node configs…Now, giving this thread, there is pushback for a config to allow default impl to change… but there is 0 pushback for new syntax to make this explicit…. So maybe we should [POLL] for what syntax people want?if we decide before the 5.0 release that we have enough information to change the default (#1), we can change it in a matter of minutes.I am strongly against this… SAI is new for 5.0 so should be disabled by default; else we disrespect the idea that new features are disabled by default. I am cool with our docs recommending if we do find its better in most cases, but we should not change the default in the same reason it lands in.On May 12, 2023, at 10:10 AM, Caleb Rackliffe <calebrackli...@gmail.com> wrote:I don't want to cut over for 5.0 either way. I was more contrasting a configurable cutover in 5.0 vs. a hard cutover later.On Fri, May 12, 2023 at 12:09 PM Benedict <bened...@apache.org> wrote:If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider?If not, then we probably can’t do the hard cutover and so the answer is still pretty simple? On 12 May 2023, at 18:04, Caleb Rackliffe <calebrackli...@gmail.com> wrote:I don't particularly like the YAML solution either, but absent that, we're back to fighting about whether we introduce entirely new syntax or hard cut over to SAI at some point.We already have per-node configuration in the YAML that determines whether or not we can create a 2i at all, right?What if we just do #2 and #3 and punt on everything else?On Fri, May 12, 2023 at 11:56 AM Benedict <bened...@apache.org> wrote:A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use the word LOCAL, but I think that’s anyway an improvement, personally. 
Re: [DISCUSS] The future of CREATE INDEX
There remains the question of what the new syntax is - whether it augments CREATE INDEX to replace CREATE CUSTOM INDEX or if we introduce new syntax because we think it’s clearer.I can accept settling for modifying CREATE INDEX … USING, but I maintain that CREATE LOCAL INDEX is betterOn 12 May 2023, at 18:31, Caleb Rackliffe wrote:Even if we don't want to allow a default, we can keep the same CREATE INDEX syntax in place, and have a guardrail forcing (or not) the selection of an implementation, right? This would be no worse than the YAML option we already have for enabling 2i creation as a whole.On Fri, May 12, 2023 at 12:28 PM Benedict <bened...@apache.org> wrote:I’m not convinced a default index makes any sense, no. The trade-offs in a distributed setting are much more pronounced.Indexes in a local-only RDBMS are much simpler affairs; the trade offs are much more subtle than here. On 12 May 2023, at 18:24, Caleb Rackliffe <calebrackli...@gmail.com> wrote:> Now, giving this thread, there is pushback for a config to allow default impl to change… but there is 0 pushback for new syntax to make this explicit…. So maybe we should [POLL] for what syntax people want?I think the essential question is whether we want the concept of a default index. If we do, we need to figure that out now. If we don't then a new syntax that forces it becomes interesting.Given it seems most DBs have a default index (see Postgres, etc.), I tend to lean toward having one, but that's me...On Fri, May 12, 2023 at 12:20 PM David Capwell <dcapw...@apple.com> wrote:I really dislike the idea of the same CQL doing different things based upon a per-node configuration.I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. I am cool adding such a config, and also cool keeping CREATE INDEX disabled by default…. But would like to point out that we have many configs that impact CQL and they are almost always local configs…Is CREATE INDEX even allowed? This is a per node config. Right now you can block globally, enable on a single instance, create the index for your users, then revert the config change on the instance…. All guardrails that define what we can do are per node configs…Now, giving this thread, there is pushback for a config to allow default impl to change… but there is 0 pushback for new syntax to make this explicit…. So maybe we should [POLL] for what syntax people want?if we decide before the 5.0 release that we have enough information to change the default (#1), we can change it in a matter of minutes.I am strongly against this… SAI is new for 5.0 so should be disabled by default; else we disrespect the idea that new features are disabled by default. I am cool with our docs recommending if we do find its better in most cases, but we should not change the default in the same reason it lands in.On May 12, 2023, at 10:10 AM, Caleb Rackliffe <calebrackli...@gmail.com> wrote:I don't want to cut over for 5.0 either way. I was more contrasting a configurable cutover in 5.0 vs. a hard cutover later.On Fri, May 12, 2023 at 12:09 PM Benedict <bened...@apache.org> wrote:If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider?If not, then we probably can’t do the hard cutover and so the answer is still pretty simple? 
Re: [DISCUSS] The future of CREATE INDEX
If folk should be reading up on the index type, doesn’t that conflict with your support of a default? Should there be different global and local defaults, once we have global indexes, or should we always default to a local index? Or a global one? > On 12 May 2023, at 18:39, Mick Semb Wever wrote: > > >> >> Given it seems most DBs have a default index (see Postgres, etc.), I tend to >> lean toward having one, but that's me... > > > I'm for it too. Would be nice to enforce the setting is globally uniform to > avoid the per-node problem. Or add a keyspace option. > > For users replaying <5 DDLs this would just require they set the default > index to 2i. > This is not a headache, it's a one-off action that can be clearly expressed > in NEWS. > It acts as a deprecation warning too. > This prevents new uneducated users from creating the unintended index, it > supports existing users, and it does not present SAI as the battle-tested > default. > > Agree with the poll, there's a number of different PoVs here already. I'm > not fond of the LOCAL addition, I appreciate what it informs, but it's just > not important enough IMHO (folk should be reading up on the index type).
Re: [DISCUSS] The future of CREATE INDEX
But then we have to reconsider the existing syntax, or do we want LOCAL to be the default?We should be planning our language evolution along with our feature evolution.On 12 May 2023, at 19:28, Caleb Rackliffe wrote:If at some point in the glorious future we have global indexes, I'm sure we can add GLOBAL to the syntax...sry, working on an ugly poll...On Fri, May 12, 2023 at 1:24 PM Benedict <bened...@apache.org> wrote:If folk should be reading up on the index type, doesn’t that conflict with your support of a default?Should there be different global and local defaults, once we have global indexes, or should we always default to a local index? Or a global one?On 12 May 2023, at 18:39, Mick Semb Wever <m...@apache.org> wrote:Given it seems most DBs have a default index (see Postgres, etc.), I tend to lean toward having one, but that's me... I'm for it too. Would be nice to enforce the setting is globally uniform to avoid the per-node problem. Or add a keyspace option. For users replaying <5 DDLs this would just require they set the default index to 2i.This is not a headache, it's a one-off action that can be clearly expressed in NEWS.It acts as a deprecation warning too.This prevents new uneducated users from creating the unintended index, it supports existing users, and it does not present SAI as the battle-tested default.Agree with the poll, there's a number of different PoVs here already. I'm not fond of the LOCAL addition, I appreciate what it informs, but it's just not important enough IMHO (folk should be reading up on the index type).
Re: [DISCUSS] The future of CREATE INDEX
Given we have no data in front of us to make a decision regarding switching defaults, I don’t think it is suitable to include that option in this poll. In fact, until we have sufficient data to discuss that I’m going to put a hard veto on that on technical grounds.On 12 May 2023, at 19:41, Caleb Rackliffe wrote:...and to clarify, answers should be what you'd like to see for 5.0 specificallyOn Fri, May 12, 2023 at 1:36 PM Caleb Rackliffewrote:[POLL] Centralize existing syntax or create new syntax?1.) CREATE INDEX ... USING WITH OPTIONS...2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds LOCAL keyword for clarity and separation from future GLOBAL indexes)(In both cases, we deprecate w/ client warnings CREATE CUSTOM INDEX)[POLL] Should there be a default? (YES/NO)[POLL] What do do with the default?1.) Allow a default, and switch it to SAI (no configurables)2.) Allow a default, and stay w/ the legacy 2i (no configurables)3.) YAML config to override default index (legacy 2i remains the default)4.) YAML config/guardrail to require index type selection (not required by default)On Fri, May 12, 2023 at 12:39 PM Mick Semb Wever wrote:Given it seems most DBs have a default index (see Postgres, etc.), I tend to lean toward having one, but that's me... I'm for it too. Would be nice to enforce the setting is globally uniform to avoid the per-node problem. Or add a keyspace option. For users replaying <5 DDLs this would just require they set the default index to 2i.This is not a headache, it's a one-off action that can be clearly expressed in NEWS.It acts as a deprecation warning too.This prevents new uneducated users from creating the unintended index, it supports existing users, and it does not present SAI as the battle-tested default.Agree with the poll, there's a number of different PoVs here already. I'm not fond of the LOCAL addition, I appreciate what it informs, but it's just not important enough IMHO (folk should be reading up on the index type).
Re: [DISCUSS] The future of CREATE INDEX
3: CREATE INDEX (Otherwise 2)NoIf configurable, should be a distributed configuration. This is very different to other local configurations, as the 2i selected has semantic implications, not just performance (and the perf implications are also much greater)On 15 May 2023, at 10:45, Mike Adamson wrote:[POLL] Centralize existing syntax or create new syntax?1.) CREATE INDEX ... USING WITH OPTIONS...2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds LOCAL keyword for clarity and separation from future GLOBAL indexes) 1.) CREATE INDEX ... USING WITH OPTIONS...[POLL] Should there be a default? (YES/NO)Yes[POLL] What do do with the default?1.) Allow a default, and switch it to SAI (no configurables)2.) Allow a default, and stay w/ the legacy 2i (no configurables)3.) YAML config to override default index (legacy 2i remains the default)4.) YAML config/guardrail to require index type selection (not required by default)3.) YAML config to override default index (legacy 2i remains the default)On Mon, 15 May 2023 at 08:54, Mick Semb Weverwrote:[POLL] Centralize existing syntax or create new syntax?1.) CREATE INDEX ... USING WITH OPTIONS...2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds LOCAL keyword for clarity and separation from future GLOBAL indexes)(1) CREATE INDEX … [POLL] Should there be a default? (YES/NO)Yes (but see below). [POLL] What do do with the default?1.) Allow a default, and switch it to SAI (no configurables)2.) Allow a default, and stay w/ the legacy 2i (no configurables)3.) YAML config to override default index (legacy 2i remains the default)4.) YAML config/guardrail to require index type selection (not required by default)(4) YAML config. Commented out default of 2i.I agree that the default cannot change in 5.0, but our existing default of 2i can be commented out.For the user this gives them the same feedback, and puts the same requirement to edit one line of yaml, as when we disabled MVs and SASI in 4.0No one has complained about either of these, which is a clear signal folk understood how to get their existing DDLs to work from 3.x to 4.x -- Mike AdamsonEngineering+1 650 389 6000 | datastax.comFind DataStax Online:
Re: [DISCUSS] Feature branch version hygiene
Copying my reply on the ticket…

We have this discussion roughly once per major. If you look back through dev@ you'll find the last one a few years back. I don't recall NA ever being the approved approach, though.

".x" lines are target versions, whereas concrete versions are the ones a fix landed in. There's always ambiguity over the next release, as it's sort of both. But since there is no 5.0 version, only 5.0-alphaN, 5.0-betaN and 5.0.0, perhaps 5.0 is the correct label (and makes sense to me). I forget what we landed upon last time. Work that has actually landed should probably be labelled as 5.0-alpha1.

> On 16 May 2023, at 21:02, J. D. Jordan wrote:
>
> Process question/discussion. Should tickets that are merged to CEP feature branches, like https://issues.apache.org/jira/browse/CASSANDRA-18204, have a fixver of 5.0 on them after merging to the feature branch?
>
> For the SAI CEP, which is also using the feature branch method, the "reviewed and merged to feature branch" tickets seem to be given a version of NA.
>
> Not sure that's the best “waiting for cep to merge” version either? But it seems better than putting 5.0 on them to me.
>
> Why I’m not keen on 5.0 is because if we cut the release today those tickets would not be there.
>
> What do other people think? Is there a better version designation we can use?
>
> On a different project I have in the past made a “version number” in JIRA for each long-running feature branch. Tickets merged to the feature branch got the epic ticket number as their version, and then it got updated to the “real” version when the feature branch was merged to trunk.
>
> -Jeremiah
Re: [DISCUSS] Feature branch version hygiene
I don’t think we should over complicate this with special CEP release targets. If we do, they shouldn’t be versioned.My personal view is that 5.0 should not be used for any resolved tickets - they should go to 5.0-alpha1, since this is the correct release for them. 5.0 can then be the target version, which makes more sense given it isn’t a concrete release.But, in lieu of that, every ticket targeting 5.0 could use fixVersion 5.0.x, since it is pretty clear what this means. Some tickets that don’t hit 5.0.0 can then be postponed to a later version, but it’s not like this is burdensome. Anything marked feature/improvement and 5.0.x gets bumped to 5.1.x.On 18 May 2023, at 13:58, Josh McKenzie wrote:CEP-N seems like a good compromise. NextMajorRelease bumps into our interchangeable use of "Major" and "Minor" from a semver perspective and could get confusing. Suppose we could do NextFeatureRelease, but at that point why not just have it linked to the CEP and have the epic set.On Thu, May 18, 2023, at 12:26 AM, Caleb Rackliffe wrote:...otherwise I'm fine w/ just the CEP name, like "CEP-7" for SAI, etc.On Wed, May 17, 2023 at 11:24 PM Caleb Rackliffewrote:So when a CEP slips, do we have to create a 5.1-cep-N? Could we just have a version that's "NextMajorRelease" or something like that? It should still be pretty easy to bulk replace if we have something else to filter on, like belonging to an epic?On Wed, May 17, 2023 at 6:42 PM Mick Semb Wever wrote:On Tue, 16 May 2023 at 13:02, J. D. Jordan wrote:Process question/discussion. Should tickets that are merged to CEP feature branches, like https://issues.apache.org/jira/browse/CASSANDRA-18204, have a fixver of 5.0 on them After merging to the feature branch?For the SAI CEP which is also using the feature branch method the "reviewed and merged to feature branch" tickets seem to be given a version of NA.Not sure that's the best “waiting for cep to merge” version either? But it seems better than putting 5.0 on them to me.Why I’m not keen on 5.0 is because if we cut the release today those tickets would not be there.What do other people think? Is there a better version designation we can use?On a different project I have in the past made a “version number” in JIRA for each long running feature branch. Tickets merged to the feature branch got the epic ticket number as their version, and then it got updated to the “real” version when the feature branch was merged to trunk.Thanks for raising the thread, I remember there was some confusion early wrt features branches too.To rehash, for everything currently resolved in trunk 5.0 is the correct fixVersion. (And there should be no unresolved issues today with 5.0 fixVersion, they should be 5.x)When alpha1 is cut, then the 5.0-alpha1 fixVersion is created and everything with 5.0 also gets 5.0-alpha1. At the same time 5.0-alpha2, 5.0-beta, 5.0-rc, 5.0.0 fixVersions are created. Here both 5.0-beta and 5.0-rc are blocking placeholder fixVersions: no resolved issues are left with this fixVersion the same as the .x placeholder fixVersions. The 5.0.0 is also used as a blocking version, though it is also an eventual fixVersion for resolved tickets. Also note, all tickets up to and including 5.0.0 will also have the 5.0 fixVersion.A particular reason for doing things the way they are is to make it easy for the release manager to bulk correct fixVersions, at release time or even later, i.e. 
without having to read the ticket or go talk to authors or painstakingly crawl CHANGES.txt.For feature branches my suggestion is that we create a fixVersion for each of them, e.g. 5.0-cep-15Yup, that's your suggestion Jeremiah (I wrote this up on the plane before I got to read your post properly).(As you say) This then makes it easy to see where the code is (or what the patch is currently being based on). And when the feature branch is merged then it is easy to bulk replace it with trunk's fixVersion, e.g. 5.0-cep-15 with 5.0The NA fixVersion was introduced for the other repositories, e.g. website updates.
Re: [DISCUSS] Feature branch version hygiene
So we just rename alpha1 to beta1 if that happens? Or, we point resolved tickets straight to 5.0.0, and add 5.0-alpha1 to any tickets with *only* 5.0.0. This is probably the easiest for folk to understand when browsing. Finding new features is easy either way - look for 5.0.0. > On 18 May 2023, at 15:08, Mick Semb Wever wrote: > > > > >> So when a CEP slips, do we have to create a 5.1-cep-N? > > > No, you'd just rename it, easy to do in just one place. > I really don't care, but the version would at least help indicate what the > branch is getting rebased off. > > > >> My personal view is that 5.0 should not be used for any resolved tickets - >> they should go to 5.0-alpha1, since this is the correct release for them. >> 5.0 can then be the target version, which makes more sense given it isn’t a >> concrete release. > > > Each time, we don't know if the first release will be an alpha1 or if we're > confident enough to go straight to a beta1. > A goal of a stable trunk would make the latter possible. > > And as the additional 5.0 label has been requested by a few to make it easy > to search for what's new, this has been the simplest way. >
Re: [DISCUSS] Feature branch version hygiene
The .x approach only breaks down for unreleased majors, for which all of our intuitions breakdown and we rehash it every year.My mental model, though, is that anything that’s not a concrete release number is a target version. Which is where 5.0 goes wrong - it’s not a release so it should be a target, but for some reason we use it as a placeholder to park work arriving in 5.0.0.If we instead use 5.0.0 for this purpose, we just need to get 5.0-alpha1 labels added when those releases are cut.Then I propose we break the confusion in both directions by scrapping 5.0 entirely and introducing 5.0-target.So tickets go to 5.0-target if they target 5.0, and to 5.0.0 once they are resolved (with additional labels as necessary)Simples?On 18 May 2023, at 15:21, Josh McKenzie wrote:My personal view is that 5.0 should not be used for any resolved tickets - they should go to 5.0-alpha1, since this is the correct release for them. 5.0 can then be the target version, which makes more sense given it isn’t a concrete release.Well now you're just opening Pandora's box about our strange idioms with FixVersion usage. ;)every ticket targeting 5.0 could use fixVersion 5.0.x, since it is pretty clear what this means.I think this diverges from our current paradigm where "5.x" == next feature release, "5.0.x" == next patch release (i.e. bugfix only). Not to say it's bad, just an adjustment... which if we're open to adjustment...I'm receptive to transitioning the discussion to that either on this thread or another; IMO we remain in a strange and convoluted place with our FixVersioning. My understanding of our current practice:.x is used to denote target version. For example: 5.x, 5.0.x, 5.1.x, 4.1.xWhen a ticket is committed, the FixVersion is transitioned to resolve the X to the next unreleased version in which it'll releaseWeird Things are done to make this work for the release process and release manager on feature releases (alpha, beta, etc)There's no clear fit for feature branch tickets in the above schemaAnd if I take what I think you're proposing here and extrapolate it out:.0 is used to denote target version. For example: 5.0. 5.0.0. 5.1.0. 4.1.0This appears to break down for patch releases: we _do_ release .0 versions of them rather than alpha/beta/etc, so a ticket targeting 4.1.0 would initially mean 2 different things based on resolved vs. unresolved status (resolved == in release, unresolved == targeting next unreleased) and that distinction would disappear on resolution (i.e. resolved + 4.1.0 would no longer definitively mean "contained in .0 release")When a release is cut, we bulk update FixVersion ending in .0 to the release version in which they're contained (not clear how to disambiguate the things from the above bullet point)For feature releases, .0 will transition to -alpha1One possible solution would be to just no longer release a .0 version of things and reserve .0 to indicate "parked". I don't particularly like that but it's not the worst.Another possible solution would be to just scrap this approach entirely and go with:FixVersion on unreleased _and still advocated for tickets_ always targets the next unreleased version. 
For other tickets where nobody is advocating for their work / inclusion, we either FixVersion "Backlog" or close as "Later"When a release is cut, roll all unresolved tickets w/that FixVersion to the next unreleased FixVersionWhen we're gearing up to a release, we can do a broad pass on everything that's unreleased w/the next feature releases FixVersion and move tickets that are desirable but not blockers to the next unreleased FixVersion (patch for bug, minor/major for improvements or new features)CEP tickets target the same FixVersion (i.e. next unreleased Feature release) as their parents. When the parent epic gets a new FixVersion on resolution, all children get that FixVersion (i.e. when we merge the CEP and update its FixVersion, we bulk update all children tickets)On Thu, May 18, 2023, at 9:08 AM, Benedict wrote:I don’t think we should over complicate this with special CEP release targets. If we do, they shouldn’t be versioned.My personal view is that 5.0 should not be used for any resolved tickets - they should go to 5.0-alpha1, since this is the correct release for them. 5.0 can then be the target version, which makes more sense given it isn’t a concrete release.But, in lieu of that, every ticket targeting 5.0 could use fixVersion 5.0.x, since it is pretty clear what this means. Some tickets that don’t hit 5.0.0 can then be postponed to a later version, but it’s not like this is burdensome. Anything marked feature/improvement and 5.0.x gets bumped to 5.1.x.On 18 May 2023, at 13:58, Josh McKenzie wrote:CEP-N seems like a good compromise. NextMajorRelease bumps into our interchangeable use of "Major" and "Minor" from a semver perspective a
Re: [DISCUSS] Bring cassandra-harry in tree as a submodule
It’s not without hiccups, and I’m sure we have more to learn. But it mostly just works, and importantly it’s a million times better than the dtest-api process - which stymies development due to the friction.On 24 May 2023, at 08:39, Mick Semb Wever wrote:WRT git submodules and CASSANDRA-18204, are we happy with how it is working for accord ? The time spent on getting that running has been a fair few hours, where we could have cut many manual module releases in that time. David and folks working on accord ? On Tue, 23 May 2023 at 20:09, Josh McKenziewrote:I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts here?On Tue, May 16, 2023, at 5:17 PM, Jeremy Hanna wrote:I think it would be great to onboard Harry more officially into the project. However it would be nice to perhaps do some sanity checking outside of Apple folks to see how approachable it is. That is, can someone take it and just run it with the current readme without any additional context?I wonder if a mini-onboarding session would be good as a community session - go over Harry, how to run it, how to add a test? Would that be the right venue? I just would like to see how we can not only plug it in to regular CI but get everyone that wants to add a test be able to know how to get started with it.JeremyOn May 16, 2023, at 1:34 PM, Abe Ratnofsky wrote:Just to make sure I'm understanding the details, this would mean apache/cassandra-harry maintains its status as a separate repository, apache/cassandra references it as a submodule, and clones and builds Harry locally, rather than pulling a released JAR. We can then reference Harry as a library without maintaining public artifacts for it. Is that in line with what you're thinking?> I'd also like to see us get a Harry run integrated as part of our pre-commit CII'm a strong supporter of this, of course.On May 16, 2023, at 11:03 AM, Josh McKenzie wrote:Similar to what we've done with accord in https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss bringing cassandra-harry in-tree as a submodule. repo link: https://github.com/apache/cassandra-harryGiven the value it's brought to the project's stabilization efforts and the movement of other things in the ecosystem to being more integrated (accord, build-scripts https://issues.apache.org/jira/browse/CASSANDRA-18133), I think having the testing framework better localized and integrated would be a net benefit for adoption, awareness, maintenance, and tighter workflows as we troubleshoot future failures it surfaces.I'd also like to see us get a Harry run integrated as part of our pre-commit CI (a 5 minute simple soak test for instance) and having that local in this fashion should make that a cleaner integration as well.Thoughts?
Re: [DISCUSS] Bring cassandra-harry in tree as a submodule
In this case Harry is a testing module - it’s not something we will develop in tandem with C* releases, and we will want improvements to be applied across all branches.So it seems a natural fit for submodules to me.On 24 May 2023, at 21:09, Caleb Rackliffe wrote:> Submodules do have their own overhead and edge cases, so I am mostly in favor of using for cases where the code must live outside of tree (such as jvm-dtest that lives out of tree as all branches need the same interfaces)Agreed. Basically where I've ended up on this topic.> We could go over some interesting examples such as testing 2i (SAI)+100On Wed, May 24, 2023 at 1:40 PM Alex Petrov <al...@coffeenco.de> wrote:> I'm about to need to harry test for the paging across tombstone work for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own overlapping fuzzing came in). In the process, I'll see if I can't distill something really simple along the lines of how React approaches it (https://react.dev/learn).We can pick that up as an example, sure. On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,I'm about to need to harry test for the paging across tombstone work for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own overlapping fuzzing came in). In the process, I'll see if I can't distill something really simple along the lines of how React approaches it (https://react.dev/learn).Ideally we'd be able to get something together that's a high level "In the next 15 minutes, you will know and understand A-G and have access to N% of the power of harry" kind of offer.Honestly, there's a lot in our ecosystem where we could benefit from taking a page from their book in terms of onboarding and getting started IMO.On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:> I wonder if a mini-onboarding session would be good as a community session - go over Harry, how to run it, how to add a test? Would that be the right venue? I just would like to see how we can not only plug it in to regular CI but get everyone that wants to add a test be able to know how to get started with it.I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, but unfortunately it got declined. Goes without saying, we can still do it online, time and resources permitting. But again, I do not think it should be barring us from making Harry a part of the codebase, as it already is. In fact, we can be iterating on the development quicker having it in-tree. We could go over some interesting examples such as testing 2i (SAI), modelling Group By tests, or testing repair. If there is enough appetite and collaboration in the community, I will see if we can pull something like that together. Input on _what_ you would like to see / hear / tested is also appreciated. Harry was developed out of a strong need for large-scale testing, which also has informed many of its APIs, but we can make it easier to access for interactive testing / unit tests. We have been doing a lot of that with Transactional Metadata, too. > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts here?Yes, sorry for not responding on this thread earlier. I can not understate how excited I am about this, and how important I think this is. 
Time constraints are somehow hard to overcome, but I hope the results brought by TCM will make it all worth it.On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:I think pulling Harry into the tree will make adoption easier for the folks. I have been a bit swamped with Transactional Metadata work, but I wanted to make some of the things we were using for testing TCM available outside of TCM branch. This includes a bunch of helper methods to perform operations on the clusters, data generation, and more useful stuff. Of course, the question always remains about how much time I want to spend porting it all to Gossip, but I think we can find a reasonable compromise. I would not set this improvement as a prerequisite to pulling Harry into the main branch, but rather interpret it as a commitment from myself to take community input and make it more approachable by the day. On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wrote:importantly it’s a million times better than the dtest-api process - which stymies development due to the friction.This is my major concern.What prompted this thread was harry being external to the core codebase and the lack of adoption and usage of it having led to atrophy of certain aspects of it, which then led to redundant implementation of some fuzz testing and lost time.We'd all be better served to have this closer to the main codebase as a forcing function to smooth out the rough edges, integrate it, and make it a collective artifact and first class citizen IMO.I have similar opinions about the dtest-
Re: [DISCUSS] Bring cassandra-harry in tree as a submodule
I would really like us to split out utilities into a common project, personally. It would be nice to work with a shared palette, including for dtest-api, accord, Harry etc.I think it would help clean up the codebase a bit too, as we have some (minimal) tight coupling with utilities and the C* process.But doubt we have the time for that anytime soon.On 25 May 2023, at 05:04, Caleb Rackliffe wrote:Isn’t the other reason Accord works well as a submodule that it has no dependencies on C* proper? Harry does at the moment, right? (Not that we couldn’t address that…just trying to think this through…)On May 24, 2023, at 6:54 PM, Benedict wrote:In this case Harry is a testing module - it’s not something we will develop in tandem with C* releases, and we will want improvements to be applied across all branches.So it seems a natural fit for submodules to me.On 24 May 2023, at 21:09, Caleb Rackliffe wrote:> Submodules do have their own overhead and edge cases, so I am mostly in favor of using for cases where the code must live outside of tree (such as jvm-dtest that lives out of tree as all branches need the same interfaces)Agreed. Basically where I've ended up on this topic.> We could go over some interesting examples such as testing 2i (SAI)+100On Wed, May 24, 2023 at 1:40 PM Alex Petrov <al...@coffeenco.de> wrote:> I'm about to need to harry test for the paging across tombstone work for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own overlapping fuzzing came in). In the process, I'll see if I can't distill something really simple along the lines of how React approaches it (https://react.dev/learn).We can pick that up as an example, sure. On Wed, May 24, 2023, at 4:53 PM, Josh McKenzie wrote:I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop,I'm about to need to harry test for the paging across tombstone work for https://issues.apache.org/jira/browse/CASSANDRA-18424 (that's where my own overlapping fuzzing came in). In the process, I'll see if I can't distill something really simple along the lines of how React approaches it (https://react.dev/learn).Ideally we'd be able to get something together that's a high level "In the next 15 minutes, you will know and understand A-G and have access to N% of the power of harry" kind of offer.Honestly, there's a lot in our ecosystem where we could benefit from taking a page from their book in terms of onboarding and getting started IMO.On Wed, May 24, 2023, at 10:31 AM, Alex Petrov wrote:> I wonder if a mini-onboarding session would be good as a community session - go over Harry, how to run it, how to add a test? Would that be the right venue? I just would like to see how we can not only plug it in to regular CI but get everyone that wants to add a test be able to know how to get started with it.I have submitted a proposal to Cassandra Summit for a 4-hour Harry workshop, but unfortunately it got declined. Goes without saying, we can still do it online, time and resources permitting. But again, I do not think it should be barring us from making Harry a part of the codebase, as it already is. In fact, we can be iterating on the development quicker having it in-tree. We could go over some interesting examples such as testing 2i (SAI), modelling Group By tests, or testing repair. If there is enough appetite and collaboration in the community, I will see if we can pull something like that together. Input on _what_ you would like to see / hear / tested is also appreciated. 
Harry was developed out of a strong need for large-scale testing, which also has informed many of its APIs, but we can make it easier to access for interactive testing / unit tests. We have been doing a lot of that with Transactional Metadata, too. > I'll hold off on this until Alex Petrov chimes in. @Alex -> got any thoughts here?Yes, sorry for not responding on this thread earlier. I can not understate how excited I am about this, and how important I think this is. Time constraints are somehow hard to overcome, but I hope the results brought by TCM will make it all worth it.On Wed, May 24, 2023, at 4:23 PM, Alex Petrov wrote:I think pulling Harry into the tree will make adoption easier for the folks. I have been a bit swamped with Transactional Metadata work, but I wanted to make some of the things we were using for testing TCM available outside of TCM branch. This includes a bunch of helper methods to perform operations on the clusters, data generation, and more useful stuff. Of course, the question always remains about how much time I want to spend porting it all to Gossip, but I think we can find a reasonable compromise. I would not set this improvement as a prerequisite to pulling Harry into the main branch, but rather interpret it as a commitment from myself to take community input and make it more approachable by the day. On Wed, May 24, 2023,
Re: Agrona vs fastutil and fastutil-concurrent-wrapper
Given they provide no data or explanation, and that benchmarking is hard, I’m not inclined to give much weight to their analysis. Agrona was favoured in large part due to the perceived quality of the library. I’m not inclined to swap it out without proper evidence that fastutil is both materially faster in a manner we care about and of similar quality. On 25 May 2023, at 16:43, Jonathan Ellis wrote: Try it out and see, the only data point I have is that the company who has spent more effort here than anyone else I could find likes fastutil better. On Thu, May 25, 2023 at 10:33 AM Dinesh Joshi wrote: > On May 25, 2023, at 6:14 AM, Jonathan Ellis wrote: > > Any objections to adding the concurrent wrapper and switching out agrona for fastutil? How does fastutil compare to agrona in terms of memory profile and runtime performance? How invasive would it be to switch? -- Jonathan Ellis, co-founder, http://www.datastax.com, @spyced
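As a starting point for "try it out and see", a minimal JMH sketch of the kind of microbenchmark that could supply the evidence asked for above: it compares single-threaded get() cost for Agrona's Long2LongHashMap against fastutil's Long2LongOpenHashMap, assuming both libraries are on the classpath. The class name, sizes and setup are illustrative assumptions, not an existing benchmark in the tree, and it says nothing about memory profile, which would need a separate measurement.

```java
import it.unimi.dsi.fastutil.longs.Long2LongOpenHashMap;
import org.agrona.collections.Long2LongHashMap;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class PrimitiveMapCompareBench
{
    @Param({ "1024", "1048576" })
    int size;

    Long2LongOpenHashMap fastutilMap;
    Long2LongHashMap agronaMap;
    long[] keys;

    @Setup(Level.Trial)
    public void setup()
    {
        fastutilMap = new Long2LongOpenHashMap(size);
        agronaMap = new Long2LongHashMap(Long.MIN_VALUE); // missing-value sentinel
        keys = new long[size];
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < size; i++)
        {
            keys[i] = rnd.nextLong();
            fastutilMap.put(keys[i], i); // primitive put/get, no boxing in either library
            agronaMap.put(keys[i], i);
        }
    }

    @Benchmark
    public void fastutilGet(Blackhole bh)
    {
        for (long k : keys)
            bh.consume(fastutilMap.get(k));
    }

    @Benchmark
    public void agronaGet(Blackhole bh)
    {
        for (long k : keys)
            bh.consume(agronaMap.get(k));
    }
}
```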
Re: Agrona vs fastutil and fastutil-concurrent-wrapper
I’m far less inclined to take that approach to fundamental libraries, where quality is far more important than presentation.On 25 May 2023, at 17:29, David Capwell wrote:Agrona isn’t going anywhere due to the library being more than basic collections.Now, with regard to single-threaded collections… honestly I dislike Agrona as I always fight to avoid boxing; carrot was far better with this regard…. Didn’t look at the fastutil versions to see if they are better here, but I do know I am personally not happy with Agrona primitive collections…I do believe the main motivator for this is that fastutil has a concurrent version of their collections, so you gain access to concurrent primitive collections; something we do not have today… Given the desire for concurrent primitive collections, I am cool with it.I’m not inclined to swap it outWhen it came to random testing libraries, I believe the stance taken before was that we should allow multiple versions and the best one will win eventually… so I am cool having the same stance for primitive collections...On May 25, 2023, at 8:50 AM, Benedict wrote:Given they provide no data or explanation, and that benchmarking is hard, I’m not inclined to give much weight to their analysis.Agrona was favoured in large part due to the perceived quality of the library. I’m not inclined to swap it out without proper evidence the fastutils is both materially faster in a manner care about and of similar quality.On 25 May 2023, at 16:43, Jonathan Ellis wrote:Try it out and see, the only data point I have is that the company who has spent more effort here than anyone else I could find likes fastutil better.On Thu, May 25, 2023 at 10:33 AM Dinesh Joshi <djo...@apache.org> wrote:> On May 25, 2023, at 6:14 AM, Jonathan Ellis <jbel...@gmail.com> wrote: > > Any objections to adding the concurrent wrapper and switching out agrona for fastutil? How does fastutil compare to agrona in terms of memory profile and runtime performance? How invasive would it be to switch?-- Jonathan Ellisco-founder, http://www.datastax.com@spyced
Re: Agrona vs fastutil and fastutil-concurrent-wrapper
Nope, my awareness of Agrona predates Branimir’s proposal, as does others. Aleksey intended to propose its inclusion beforehand also.If all we’re getting is lock striping, do we really need a separate library?On 25 May 2023, at 19:33, Jonathan Ellis wrote:Let's not fall prey to status quo bias, nobody performed an exhaustive analysis of agrona in November. If Branimir had proposed fastutils at the time that's what we'd be using today.On Thu, May 25, 2023 at 10:50 AM Benedict <bened...@apache.org> wrote:Given they provide no data or explanation, and that benchmarking is hard, I’m not inclined to give much weight to their analysis.Agrona was favoured in large part due to the perceived quality of the library. I’m not inclined to swap it out without proper evidence the fastutils is both materially faster in a manner care about and of similar quality.On 25 May 2023, at 16:43, Jonathan Ellis <jbel...@gmail.com> wrote:Try it out and see, the only data point I have is that the company who has spent more effort here than anyone else I could find likes fastutil better.On Thu, May 25, 2023 at 10:33 AM Dinesh Joshi <djo...@apache.org> wrote:> On May 25, 2023, at 6:14 AM, Jonathan Ellis <jbel...@gmail.com> wrote: > > Any objections to adding the concurrent wrapper and switching out agrona for fastutil? How does fastutil compare to agrona in terms of memory profile and runtime performance? How invasive would it be to switch?-- Jonathan Ellisco-founder, http://www.datastax.com@spyced -- Jonathan Ellisco-founder, http://www.datastax.com@spyced
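To make concrete what "lock striping" buys beyond plain fastutil, here is a rough sketch of a striped concurrent primitive map, assuming fastutil on the classpath. The class and its layout are illustrative, not the fastutil-concurrent-wrapper API: the key space is hashed across a power-of-two number of independently locked maps, so threads only contend when they hit the same stripe.

```java
import it.unimi.dsi.fastutil.longs.Long2LongOpenHashMap;

import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative lock-striped wrapper over fastutil's Long2LongOpenHashMap.
public final class StripedLong2LongMap
{
    private final Long2LongOpenHashMap[] stripes;
    private final ReentrantReadWriteLock[] locks;
    private final int mask;

    public StripedLong2LongMap(int stripeCount, long missingValue)
    {
        if (Integer.bitCount(stripeCount) != 1)
            throw new IllegalArgumentException("stripeCount must be a power of two");
        stripes = new Long2LongOpenHashMap[stripeCount];
        locks = new ReentrantReadWriteLock[stripeCount];
        mask = stripeCount - 1;
        for (int i = 0; i < stripeCount; i++)
        {
            stripes[i] = new Long2LongOpenHashMap();
            stripes[i].defaultReturnValue(missingValue);
            locks[i] = new ReentrantReadWriteLock();
        }
    }

    private int stripe(long key)
    {
        // Spread the key bits before masking so adjacent keys land on different stripes.
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32)) & mask;
    }

    public long get(long key)
    {
        int s = stripe(key);
        locks[s].readLock().lock();
        try { return stripes[s].get(key); }
        finally { locks[s].readLock().unlock(); }
    }

    public long put(long key, long value)
    {
        int s = stripe(key);
        locks[s].writeLock().lock();
        try { return stripes[s].put(key, value); }
        finally { locks[s].writeLock().unlock(); }
    }
}
```

Whether that is worth a new dependency, versus a small in-tree utility along these lines, is essentially the question being asked above.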
Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)
I agree that this is more suitable as a paging option, and not as a CQL LIMIT option. If it were to be a CQL LIMIT option though, then it should be accurate regarding the result set IMO; there shouldn’t be any further results that could have been returned within the LIMIT. On 12 Jun 2023, at 10:16, Benjamin Lerer wrote: Thanks Jacek for raising that discussion. I do not have in mind a scenario where it could be useful to specify a LIMIT in bytes. The LIMIT clause is usually used when you know how many rows you wish to display or use. Unless somebody has a useful scenario in mind I do not think that there is a need for that feature. Paging in bytes makes sense to me as the paging mechanism is transparent for the user in most drivers. It is simply a way to optimize your memory usage from end to end. I do not like the approach of using both of them simultaneously, because if you request a page with a certain number of rows and do not get it, then it is really confusing and can be a problem for some use cases. We have users keeping their session open and the page information to display a page of data. On Mon, 12 Jun 2023 at 09:08, Jacek Lewandowski wrote: Hi, I was working on limiting query results by their size expressed in bytes, and some questions arose that I'd like to bring to the mailing list. The semantics of queries (without aggregation): data limits are applied to the raw data returned from replicas. While this works fine for row-number limits, as the number of rows is not likely to change after post-processing, it is not that accurate for size-based limits, as cell sizes may be different after post-processing (for example due to applying some transformation function, projection, or whatever). We can truncate the results after post-processing to stay within the user-provided limit in bytes, but if the result is smaller than the limit we will not fetch more. In that case, the meaning of "limit" being an actual limit is valid, though it would be misleading for the page size because we will not fetch the maximum amount of data that does not exceed the page size. Such a problem is much more visible for "group by" queries with aggregation. The paging and limiting mechanism is applied to the rows rather than groups, as it has no information about how much memory a single group uses. For now, I've approximated a group size as the size of the largest participating row. The problem concerns the allowed interpretation of the size limit expressed in bytes: whether we want to use this mechanism to let the users precisely control the size of the result set, or we instead want to use this mechanism to limit the amount of memory used internally for the data and prevent problems (assuming restricting size and row number can be used simultaneously in a way that we stop when we reach any of the specified limits). https://issues.apache.org/jira/browse/CASSANDRA-11745 thanks, - - -- --- - - Jacek Lewandowski
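To make the "stop when we reach any of the specified limits" interpretation concrete, here is a small, self-contained sketch of how an iteration could enforce a row limit and a byte limit together. Class and method names are hypothetical and this is not the CASSANDRA-11745 patch; it only illustrates the semantics under discussion, including the asymmetry that the byte limit is checked against raw (pre-post-processing) sizes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative only: names are hypothetical, not from the actual patch.
final class DualLimitPager
{
    interface SizedRow
    {
        int serializedSizeInBytes();
    }

    // Consume rows until either limit is reached. The row limit is exact; the byte
    // limit is checked against the raw serialized size, so the page the client sees
    // after post-processing may be smaller (or slightly larger) than the byte budget.
    static <R extends SizedRow> List<R> fetchPage(Iterator<R> rows, int rowLimit, long byteLimit)
    {
        List<R> page = new ArrayList<>();
        long bytes = 0;
        while (rows.hasNext() && page.size() < rowLimit)
        {
            R row = rows.next();
            bytes += row.serializedSizeInBytes();
            page.add(row); // the row that crosses the byte budget is kept here;
                           // truncating it instead is itself a design choice
            if (bytes >= byteLimit)
                break;     // size limit reached before the row limit
        }
        return page;
    }
}
```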
Re: Improved DeletionTime serialization to reduce disk size
If we’re doing this, why don’t we delta encode a vint from some per-sstable minimum value? I’d expect that to commonly compress to a single byte or so. > On 23 Jun 2022, at 12:55, Aleksey Yeshchenko wrote: > > Distant future people will not be happy about this, I can already tell you > now. > > Sounds like a reasonable improvement to me however. > >> On 23 Jun 2023, at 07:22, Berenguer Blasi wrote: >> >> Hi all, >> >> DeletionTime.markedForDeleteAt is a long useconds since Unix Epoch. But I >> noticed that with 7 bytes we can already encode ~2284 years. We can either >> shed the 8th byte, for reduced IO and disk, or can encode some sentinel >> values (such as LIVE) as flags there. That would mean reading and writing 1 >> byte instead of 12 (8 mfda long + 4 ldts int). Yes we already avoid >> serializing DeletionTime (DT) in sstables at _row_ level entirely but not at >> _partition_ level and it is also serialized at index, metadata, etc. >> >> So here's a POC: https://github.com/bereng/cassandra/commits/ldtdeser-trunk >> and some jmh (1) to evaluate the impact of the new alg (2). It's tested here >> against a 70% and a 30% LIVE DTs to see how we perform:
>>
>> [java] Benchmark                                 (liveDTPcParam) (sstableParam)  Mode  Cnt  Score   Error  Units
>> [java] DeletionTimeDeSerBench.testRawAlgReads           70PcLive             NC  avgt   15  0.331 ± 0.001  ns/op
>> [java] DeletionTimeDeSerBench.testRawAlgReads           70PcLive             OA  avgt   15  0.335 ± 0.004  ns/op
>> [java] DeletionTimeDeSerBench.testRawAlgReads           30PcLive             NC  avgt   15  0.334 ± 0.002  ns/op
>> [java] DeletionTimeDeSerBench.testRawAlgReads           30PcLive             OA  avgt   15  0.340 ± 0.008  ns/op
>> [java] DeletionTimeDeSerBench.testNewAlgWrites          70PcLive             NC  avgt   15  0.337 ± 0.006  ns/op
>> [java] DeletionTimeDeSerBench.testNewAlgWrites          70PcLive             OA  avgt   15  0.340 ± 0.004  ns/op
>> [java] DeletionTimeDeSerBench.testNewAlgWrites          30PcLive             NC  avgt   15  0.339 ± 0.004  ns/op
>> [java] DeletionTimeDeSerBench.testNewAlgWrites          30PcLive             OA  avgt   15  0.343 ± 0.016  ns/op
>>
>> That was ByteBuffer backed to test the extra bit level operations impact. But what would be the impact of an end to end test against disk?
>>
>> [java] Benchmark                                      (diskRAMParam) (liveDTPcParam) (sstableParam)  Mode  Cnt        Score        Error  Units
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT              RAM        70PcLive             NC  avgt   15   605236.515 ±  19929.058  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT              RAM        70PcLive             OA  avgt   15   586477.039 ±   7384.632  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT              RAM        30PcLive             NC  avgt   15   937580.311 ±  30669.647  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT              RAM        30PcLive             OA  avgt   15   914097.770 ±   9865.070  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             Disk        70PcLive             NC  avgt   15  1314417.207 ±  37879.012  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             Disk        70PcLive             OA  avgt   15   805256.345 ±  15471.587  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             Disk        30PcLive             NC  avgt   15  1583239.011 ±  50104.245  ns/op
>> [java] DeletionTimeDeSerBench.testE2EDeSerializeDT             Disk        30PcLive             OA  avgt   15  1439605.006 ±  64342.510  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT                RAM        70PcLive             NC  avgt   15   295711.217 ±   5432.507  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT                RAM        70PcLive             OA  avgt   15   305282.827 ±   1906.841  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT                RAM        30PcLive             NC  avgt   15   446029.899 ±   4038.938  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT                RAM        30PcLive             OA  avgt   15   479085.875 ±  10032.804  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT               Disk        70PcLive             NC  avgt   15  1789434.838 ± 206455.771  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT               Disk        70PcLive             OA  avgt   15   589752.861 ±  31676.265  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT               Disk        30PcLive             NC  avgt   15  1754862.122 ± 164903.051  ns/op
>> [java] DeletionTimeDeSerBench.testE2ESerializeDT               Disk        30PcLive             OA  avgt   15  1252162.253 ± 121626.818  ns/op
>>
>> We can see big improvements when backed with the disk and little impact from >> the new alg.
>>
>> Given we're already introducing a new sstable format (OA) in 5.0 I would
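For readers skimming the thread, a minimal, self-contained sketch of the two encoding ideas above: a flag byte with a LIVE sentinel bit (so the common case costs one byte instead of twelve), combined with Benedict's suggestion of delta-encoding markedForDeleteAt as an unsigned vint against a per-sstable minimum. Method names, the bit layout, and the numbers in main are illustrative assumptions; this is neither the linked POC nor Cassandra's actual VIntCoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch of the DeletionTime encoding ideas discussed above.
final class DeletionTimeEncodingSketch
{
    private static final int FLAG_LIVE = 0x01;

    static void serialize(long markedForDeleteAt, long localDeletionTime,
                          long sstableMinMarkedForDeleteAt, long sstableMinLocalDeletionTime,
                          boolean live, OutputStream out) throws IOException
    {
        if (live)
        {
            out.write(FLAG_LIVE);                 // 1 byte instead of 12 for the common case
            return;
        }
        out.write(0);                             // flag byte, remaining bits free for other sentinels
        writeUnsignedVInt(markedForDeleteAt - sstableMinMarkedForDeleteAt, out);
        writeUnsignedVInt(localDeletionTime - sstableMinLocalDeletionTime, out);
    }

    // Unsigned LEB128-style vint: 7 bits per byte, high bit set while more bytes follow.
    static void writeUnsignedVInt(long value, OutputStream out) throws IOException
    {
        while ((value & ~0x7FL) != 0)
        {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    public static void main(String[] args) throws IOException
    {
        long minMfda = 1_686_000_000_000_000L;    // hypothetical per-sstable minimum, useconds
        long minLdt = 1_685_900_000L;             // hypothetical per-sstable minimum, seconds

        ByteArrayOutputStream tombstone = new ByteArrayOutputStream();
        serialize(minMfda + 42_000_000L, 1_686_000_000L, minMfda, minLdt, false, tombstone);

        ByteArrayOutputStream live = new ByteArrayOutputStream();
        serialize(0, 0, minMfda, minLdt, true, live);

        System.out.println("LIVE: " + live.size() + " byte, non-LIVE: "
                           + tombstone.size() + " bytes (vs 12 today)");
    }
}
```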