Re: Merging CEP-15 to trunk
> contributors who didn't actively work on Accord, have assumed that they will be invited to review now

I may have missed it, but I have not seen anyone propose to substantively review the actual work, only the impact of rebasing. Which, honestly, there is plenty of time to do - the impact is essentially nil, and we aren’t planning to merge immediately. The only thing I will not agree to, as a matter of principle, is an ad hoc procedural change to prevent merge until this happens.

However, I am very sympathetic to a desire to participate substantively. I think interested parties should have invested as the work progressed, but I don’t think it is unreasonable to ask for some time prior to merge if this hasn’t happened.

So, if you can adequately resource it, we can delay merging a while longer. I want your (constructive) participation. But, I have not seen anything to suggest this is even proposed, let alone realistic.

There are currently five full time contributors participating in the Accord project, with cumulatively several person-years of work already accumulated. By the time even another month has passed, you will have another five person-months of work to catch up on. Resourcing even a review effort to catch up with this is non-trivial, and for it to be a reasonable ask, you must credibly be able to keep up while making useful contributions.

> After all, if it had been ready to merge to trunk already a year ago, why wasn't it?

The Cassandra integration has only existed since late last year, and was not merged earlier to avoid interfering with the effort to release 4.1.

> One thing that I wanted to ask for is when you push to CI, you or whoever does it, to approve all jobs.

Thanks Ekaterina, we will be sure to fully qualify the CI result, and I will make sure we also run your flaky test runner on the newly introduced tests.

On 24 Jan 2023, at 21:42, Henrik Ingo wrote:

> Thanks Josh
>
> Since you mentioned the CEP process, I should also mention one sentiment you omitted, but worth stating explicitly:
>
> 4. The CEP itself should not be renegotiated at this point. However, the reviewers should rather focus on validating that the implementation matches the CEP. (Or if not, that the deviation is of a good reason and the reviewer agrees to approve it.)
>
> Although I'm not personally full time working on either producing Cassandra code or reviewing it, I'm going to spend one more email defending your point #1, because I think your proposal would lead to a lot of inefficiencies in the project, and that does happen to be my job to care about:
>
> - Even if you could be right, from some point of view, it's nevertheless the case that those contributors who didn't actively work on Accord, have assumed that they will be invited to review now, when the code is about to land in trunk. Not allowing that to happen would make them feel like they weren't given the opportunity and that the process in Cassandra Project Governance was bypassed. We can agree to work differently in the future, but this is the reality now.
>
> - Although those who have collaborated on Accord testify that the code is of the highest quality and ready to be merged to trunk, I don't think that can be expected of every feature branch all the time. In fact, I'm pretty sure the opposite must have been the case also for the Accord branch at some point. After all, if it had been ready to merge to trunk already a year ago, why wasn't it? It's kind of the point of using a feature branch that the code in it is NOT ready to be merged yet. (For example, the existing code might be of high quality, but the work is incomplete, so it shouldn't be merged to trunk.)
>
> - Uncertainty: It's completely ok that some feature branches may be abandoned without ever merging to trunk. Requiring the community (anyone potentially interested, anyways) to review such code would obviously be a waste of precious talent.
>
> - Time uncertainty: Also - and this is also true for Accord - it is unknown when the merge will happen if it does. In the case of Accord it is now over a year since the CEP was adopted. If I remember correctly an initial target date for some kind of milestone may have been Summer of 2022? Let's say someone in October 2021 was invested in the quality of Cassandra 4.1 release. Should this person now invest in reviewing Accord or not? It's impossible to know. Again, in hindsight we know that the answer is no, but your suggestion again would require the person to review all active feature branches just in case.
>
> As for 2 and 3, I certainly observe an assumption that contributors have expected to review after a rebase. But I don't see this as a significant topic to argue about. If indeed the rebase is as easy as Benedict advertised, then we should just do the rebase because apparently it can be done faster than it took me to write this email :-) (But yes, conversely, it seems then that the rebase is not a big reason to hold off from reviewing either.)
>
> henrik
>
> On Tue, Jan 24, 2023 at 9:29 PM Josh McKenzie
Re: [DISCUSSION] Cassandra's code style and source code analysis
Thank you Maxim for doing this.

It is nice to see this effort materialized in a PR.

I would wait until bigger chunks of work are committed to trunk (like CEP-15) to not collide too much. I would say we can postpone doing this until the actual 5.0 release, in the last weeks before it, so we would not clash with any work people would like to include in 5.0. This can go in anytime, basically.

Are people on the same page?

Regards

From: Maxim Muzafarov
Sent: Monday, January 23, 2023 19:46
To: dev@cassandra.apache.org
Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis

Hello everyone,

You can find the changes here:
https://issues.apache.org/jira/browse/CASSANDRA-17925

While preparing the code style configuration for the Eclipse IDE, I discovered that there was no easy way to have complex grouping options for the set of packages. So we need to add extra blank lines between each group of packages so that all the configurations for Eclipse, NetBeans, IntelliJ IDEA and checkstyle are aligned. I should have checked this earlier for sure, but I only did it for static imports and some groups, my bad.

The resultant configuration looks like this:

    java.*
    [blank line]
    javax.*
    [blank line]
    com.*
    [blank line]
    net.*
    [blank line]
    org.*
    [blank line]
    org.apache.cassandra.*
    [blank line]
    all other imports
    [blank line]
    static all other imports

The pull request is here:
https://github.com/apache/cassandra/pull/2108

The configuration-related changes are placed in a dedicated commit, so it should be easy to make a review:
https://github.com/apache/cassandra/pull/2108/commits/84e292ddc9671a0be76ceb9304b2b9a051c2d52a

Another important thing to mention is that the total amount of changes for organising imports is really big (more than 2000 files!), so we need to decide the right time to merge this PR.
Although rebasing or merging changes to development branches should become much easier ("Accept local" + "Organize imports"), we still need to pay extra attention here to minimise the impact on major patches for the next release.

On Mon, 16 Jan 2023 at 13:16, Maxim Muzafarov wrote:
>
> Stefan,
>
> Thank you for bringing this topic up. I'll prepare the PR shortly with
> option 4, so everyone can take a look at the amount of changes. This
> does not force us to go exactly this path, but it may shed light on
> changes in general.
>
> What exactly we're planning to do in the PR:
>
> 1. Checkstyle AvoidStarImport rule, so no star imports will be allowed.
> 2. Checkstyle ImportOrder rule, for controlling the order.
> 3. The IDE code style configuration for Intellij IDEA, NetBeans, and
> Eclipse (it doesn't exist for Eclipse yet).
> 4. The import order according to option 4:
>
> ```
> java.*
> javax.*
> [blank line]
> com.*
> net.*
> org.*
> [blank line]
> org.apache.cassandra.*
> [blank line]
> all other imports
> [blank line]
> static all other imports
> ```
>
> On Mon, 16 Jan 2023 at 12:39, Miklosovic, Stefan
> wrote:
> >
> > Based on the voting we should go with option 4?
> >
> > Two weeks passed without anybody joining so I guess folks are all happy
> > with that or this just went unnoticed?
> >
> > Let's give it time until the end of this week (Friday 12:00 UTC).
> >
> > Regards
> >
> > From: Maxim Muzafarov
> > Sent: Tuesday, January 3, 2023 14:31
> > To: dev@cassandra.apache.org
> > Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis
> >
> > Folks,
> >
> > Let me update the voting status and put together everything we have so
> > far. We definitely need more votes to have a solid foundation for this
> > change, so I encourage everyone to consider the options above and
> > share them in this thread.
> >
> > Total for each applicable option:
> >
> > 4-th option -- 4 votes
> > 3-rd option -- 3 votes
> > 5-th option -- 1 vote
> > 1-st option -- 0 votes
> > 2-nd option -- 0 votes
> >
> > On Thu, 22 Dec 2022 at 22:06, Mick Semb Wever wrote:
> > >>
> > >> 3. Total 5 groups, 2968 files to change
> > >>
> > >> ```
> > >> org.apache.cassandra.*
> > >> [blank line]
> > >> java.*
> > >> [blank line]
> > >> javax.*
> > >> [blank line]
> > >> all other imports
> > >> [blank line]
> > >> static all other imports
> > >> ```
> > >
> > > 3, then 5.
> > > There's lots under com.*, net.*, org.* that is essentially the same as
> > > "all other imports", what's the reason to separate those?
> > >
> > > My preference for 3 is simply that imports are by default collapsed, and
> > > if I expand them it's the dependencies on ot
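
The import layout Maxim reports as the resultant configuration in this thread (java.*, javax.*, com.*, net.*, org.*, org.apache.cassandra.*, all other imports, then static imports, each group separated by a blank line) can be illustrated with a small classifier. This is only a sketch to make the grouping concrete: it is not how Checkstyle's ImportOrder rule or any IDE implements the check, the names `ImportGroups`, `groupOf` and `sorted` are hypothetical, and sorting alphabetically inside a group is an assumption that the thread does not state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ImportGroups {
    // Prefixes in the order of the layout quoted in the thread; index == group number.
    static final String[] PREFIXES = {
        "java.", "javax.", "com.", "net.", "org.", "org.apache.cassandra."
    };
    static final int OTHER_GROUP = PREFIXES.length;       // "all other imports"
    static final int STATIC_GROUP = PREFIXES.length + 1;  // "static all other imports"

    /** Returns the group index of an import; the longest matching prefix wins,
     *  so org.apache.cassandra.* is not swallowed by the broader org.* group. */
    public static int groupOf(String name, boolean isStatic) {
        if (isStatic) return STATIC_GROUP;
        int best = OTHER_GROUP;
        int bestLen = 0;
        for (int i = 0; i < PREFIXES.length; i++) {
            if (name.startsWith(PREFIXES[i]) && PREFIXES[i].length() > bestLen) {
                best = i;
                bestLen = PREFIXES[i].length();
            }
        }
        return best;
    }

    /** Sorts plain (non-static) imports by group, then alphabetically within a
     *  group (the alphabetical part is an assumption of this sketch). */
    public static List<String> sorted(List<String> imports) {
        List<String> result = new ArrayList<>(imports);
        result.sort(Comparator.<String>comparingInt(n -> groupOf(n, false))
                              .thenComparing(Comparator.naturalOrder()));
        return result;
    }

    public static void main(String[] args) {
        List<String> imports = List.of(
            "org.slf4j.Logger",
            "org.apache.cassandra.db.Keyspace",
            "java.util.List",
            "io.netty.buffer.ByteBuf",
            "com.google.common.collect.ImmutableList",
            "javax.annotation.Nullable");
        int previous = -1;
        for (String name : sorted(imports)) {
            int group = groupOf(name, false);
            // Emit the blank line that separates one group from the next.
            if (previous != -1 && group != previous) System.out.println();
            System.out.println("import " + name + ";");
            previous = group;
        }
    }
}
```

Matching on the longest prefix is a design choice of the sketch: it keeps the table in the same order as the layout while still routing org.apache.cassandra.* to its own group instead of the org.* group.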
Re: Cassandra Summit update for 2023-01-24
> *To create a more neutral ground that reflects our community better, Linux
> Foundation Events has taken on the considerable task of running Cassandra
> Summit in 2023. We are very grateful they took a chance on our community,
> and we will be better for it.*
> *…*
> *Why is this important to mention? Our community needs an independent
> Cassandra Summit, and right now, it needs your support in attending the
> event. Let’s show the Linux Foundation that Cassandra Summit is something
> we value as a community. I know budgets are tight, and it’s hard to get
> approval. If you are able, make the case and register today.*

I particularly appreciate and am inspired by this. Well worded, thank you Patrick!
Re: Merging CEP-15 to trunk
Thanks Benedict

For brevity I'll respond to your email, although indirectly this is also a continuation of my debate with Josh:

At least on my scorecard, one issue was raised regarding the actual code: CASSANDRA-18193 Provide design and API documentation. Since the addition of code comments also significantly impacts the ability of an outsider to understand and review the code, I would then treat it as an unknown to say how much else such a fresh review would uncover.

By the way I would say the discussion about git submodules (and all the other alternatives) in a broad sense was also a review'ish comment.

That said, yes of course the expectation is that if the code has already been reviewed, and by rather experienced Cassandra developers too, there probably won't be many issues found, and there isn't a need for several weeks of line by line re-review.

As for the rebase, I think that somehow started dominating this discussion, but in my view was never the only reason. For me this is primarily to satisfy points 4 and 5 in the project governance, that everyone has had an opportunity to review the code, for whatever reason they may wish to do so.

I should say for those of us on the sidelines we certainly expected a rebase catching up 6 months and ~500 commits to have more substantial changes. Hearing that this is not the case is encouraging, as it also suggests the changes to Cassandra code are less invasive than maybe I and others had imagined.

henrik

On Wed, Jan 25, 2023 at 1:51 PM Benedict wrote:

> contributors who didn't actively work on Accord, have assumed that they
> will be invited to review now
>
> I may have missed it, but I have not seen anyone propose to substantively
> review the actual *work*, only the impact of rebasing. Which, honestly,
> there is plenty of time to do - the impact is essentially nil, and we
> aren’t planning to merge immediately. I will only not agree to an adhoc
> procedural change to prevent merge until this happens, as a matter of
> principle.
>
> However, I am very sympathetic to a desire to participate *substantively*.
> I think interested parties should have invested as the work progressed, but
> I *don’t* think it is unreasonable to ask for a *some* time prior to
> merge if this hasn’t happened.
>
> So, if you can adequately resource it, we can delay merging a while
> longer. I *want* your (constructive) participation. But, I have not seen
> anything to suggest this is even proposed, let alone realistic.
>
> There are currently five full time contributors participating in the
> Accord project, with cumulatively several person-years of work already
> accumulated. By the time even another month has passed, you will have
> another five person-months of work to catch-up on. Resourcing even a review
> effort to catch up with this is *non-trivial*, and for it to be a
> reasonable ask, you must credibly be able to keep up while making useful
> contributions.
>
> After all, if it had been ready to merge to trunk already a year ago, why
> wasn't it?
>
> The Cassandra integration has only existed since late last year, and was
> not merged earlier to avoid interfering with the effort to release 4.1.
>
> One thing that I wanted to ask for is when you push to CI, you or whoever
> does it, to approve all jobs.
>
> Thanks Ekaterina, we will be sure to fully qualify the CI result, and I
> will make sure we also run your flaky test runner on the newly introduced
> tests.
>
> On 24 Jan 2023, at 21:42, Henrik Ingo wrote:
>
> Thanks Josh
>
> Since you mentioned the CEP process, I should also mention one sentiment
> you omitted, but worth stating explicitly:
>
> 4. The CEP itself should not be renegotiated at this point. However, the
> reviewers should rather focus on validating that the implementation matches
> the CEP. (Or if not, that the deviation is of a good reason and the
> reviewer agrees to approve it.)
>
> Although I'm not personally full time working on either producing
> Cassandra code or reviewing it, I'm going to spend one more email defending
> your point #1, because I think your proposal would lead to a lot of
> inefficiencies in the project, and that does happen to be my job to care
> about:
>
> - Even if you could be right, from some point of view, it's nevertheless
> the case that those contributors who didn't actively work on Accord, have
> assumed that they will be invited to review now, when the code is about to
> land in trunk. Not allowing that to happen would make them feel like they
> weren't given the opportunity and that the process in Cassandra Project
> Governance was bypassed. We can agree to work differently in the future,
> but this is the reality now.
>
> - Although those who have collaborated on Accord testify that the code is
> of the highest quality and ready to be merged to trunk, I don't think that
> can be expected of every feature branch all the time. In fact, I'm pretty
> sure the opposite must have been the case also f
Re: Merging CEP-15 to trunk
Hi Josh

I chose to mainly reply to Benedict's latest email as a reply to both of you, but came back here only for a single higher level comment:

I'm not aware of the project history of such mega reviews, other than years later, indirectly I have maybe felt the impact to quality that such large commits (late in the release cycle...) typically have on a complex code base. But sure, I can see how that could have happened and understand it would then impact this discussion too.

Fundamentally we have two conflicting interests at play:

- merging smaller incremental changes is preferable to large merges
- merging incomplete work is usually a bad idea, it's better to work on a feature branch until some kind of mvp or v1 level of functionality is met.

We'll just have to learn to balance both of these.

Related to the history though, one thing that has changed is the introduction of CEPs. I actually expect this to make a huge difference compared to historical traumatic experiences. Because at this point we definitely do not need to discuss a) whether we want transactions in the first place, or b) how they should (have) be(en) implemented.

Also, the fact that code going into the feature branch was already reviewed with the same rigor as a trunk merge would, should of course make a big difference too. So in summary, I'm optimistic that the processes we are following today will work better than what was in the past. (If anything, the CI worries me a lot more than the review process. If we had an automated merge pipeline that would do the rebase-test-merge in an automated and uncompromising way, I bet even this discussion would have been more relaxed.)

henrik

On Wed, Jan 25, 2023 at 12:11 AM Josh McKenzie wrote:

> Cordial debate! <3
>
> - it's nevertheless the case that those contributors who didn't actively
> work on Accord, have assumed that they will be invited to review now, when
> the code is about to land in trunk. Not allowing that to happen would make
> them feel like they weren't given the opportunity and that the process in
> Cassandra Project Governance was bypassed. We can agree to work differently
> in the future, but this is the reality now.
>
> If this was a miscommunication on this instance rectifying it will of
> course require compromise from all parties. Good learning for future
> engagement and hopefully the outcome of this discussion is clearer norms as
> a project so we don't end up with this miscommunication in the future.
>
> the code is of the highest quality and ready to be merged to trunk, I
> don't think that can be expected of every feature branch all the time
>
> I think this is something we can either choose to make a formal
> requirement for feature branches in ASF git (all code that goes in has 2
> committers hands on) or not. If folks want to work on other feature
> branches in other forks w/out this bar and then have a "mega review" at the
> end, I suppose that's their prerogative. Many of us that have been on the
> project for years have _significant emotional and psychological scars_ from
> that approach however, and multiple large efforts have failed at the
> "mega-review and merge" step. So I wouldn't advocate for that approach (and
> it's the only logical alternative I can think of to incremental bar of
> quality reinforcement throughout a work cycle on a large feature over time).
>
> if it had been ready to merge to trunk already a year ago, why wasn't it?
> It's kind of the point of using a feature branch that the code in it is NOT
> ready to be merged yet
>
> Right now we culturally tend to avoid merging code that doesn't do
> anything, for better or worse. We don't have a strong culture of either
> incremental merge in during development or of using the experimental flag
> for new features. Much of the tightly coupled nature of our codebase makes
> this a necessity for keeping velocity while working unfortunately. So in
> this case I would qualify that "it's not ready to be merged yet given our
> assumption that all code in the codebase should serve an active immediate
> purpose, not due to a lack of merge-level quality".
>
> The approach of "hold the same bar for merges into a feature branch as
> trunk" seems to be a compromise between Big Bang single commit drops and
> peppering trunk with a lot of "as yet dormant" incremental code as a large
> feature is built out. Not saying it's better or worse, just describing the
> contour of the tradeoffs as I see them.
>
> - Uncertainty: It's completely ok that some feature branches may be
> abandoned without ever merging to trunk. Requiring the community (anyone
> potentially interested, anyways) to review such code would obviously be a
> waste of precious talent.
>
> This is an excellent point. The only mitigation I'd see for this would be
> an additional review period or burden collectively before merge of a
> feature branch into trunk once something has crossed a threshold of success
> as to be included, or stepping aw
Re: Cassandra Summit update for 2023-01-24
Hugely excited for this – thanks to the Program Committee and to the Linux Foundation for organizing!

It's been a long few years away from conferences and I can't wait to see all of you.

Beyond learning about what everyone is doing with Apache Cassandra, I'm looking forward to the hallway chats and discussion among smaller forums.

Discussions at ApacheCon in 2019 helped hash out a plan to stabilize and ship 4.0, and ApacheCon in 2022 led to a number of brainstorming sessions on how much better we can make Cassandra as features like distributed transactions, transactional metadata, storage-attached indexes, the BTI SSTable format, and more fall into place.

If you or your company use Cassandra, the project's contributors would love to learn more about how - and how we can make it better. This conference is an awesome opportunity to catch up with contributors face to face. Always happy to discuss on the list and via Slack too, but glad this year allows us to resume events in person.

Hope to see you there!

– Scott

On Jan 24, 2023, at 4:54 PM, Patrick McFadin wrote:

> Hello Cassandra Community!
>
> Quick take:
> - Register before 1/28 to get discount pricing. https://events.linuxfoundation.org/cassandra-summit/register/
> - Use code CS23DS20 to get 20% off
> - Make sure to sign up for the training day on March 12
> - Tell everyone you’re going on social media and use #CassandraSummit in your posts
>
> Longer version:
>
> If you have been watching what’s happening with the Cassandra Summit and thinking about going, I’m here to convince you that now is the time to register. The early registration discount ends this Saturday, January 28th.
>
> It might be helpful to clarify some misconceptions I keep hearing. Every other Cassandra Summit (except Cassandra Summit Tokyo) has been an event planned and run by DataStax. To create a more neutral ground that reflects our community better, Linux Foundation Events has taken on the considerable task of running Cassandra Summit in 2023. We are very grateful they took a chance on our community, and we will be better for it.
>
> When DataStax ran the event, we could deeply discount tickets because we treated it as a marketing expense. I’ve been DMed and Slacked quite a few times for free passes. Since this is a Linux Foundation event, unfortunately, there are no complimentary passes, as this is a key part of recouping their costs. You can get a 20% discount by using this code: CS23DS20
>
> Why is this important to mention? Our community needs an independent Cassandra Summit, and right now, it needs your support in attending the event. Let’s show the Linux Foundation that Cassandra Summit is something we value as a community. I know budgets are tight, and it’s hard to get approval. If you are able, make the case and register today. Next year when there are thousands of attendees at Cassandra Summit, you can tell everyone what they missed in 2023. If making the trip isn’t something you can do, a virtual pass is only $30 with the discount code and is also a great way to show support.
>
> The other important thing you can help with? Getting out the word about Cassandra Summit. Tell your colleagues and co-workers that this is a hot tip and you are hooking them up. If you are going, tell everyone you’ve registered and use the hashtag #CassandraSummit. Point out sessions you are interested in and share the love. If you can convince a couple of people to go, you’ve made a difference.
>
> If you need a little more motivation, just look at this schedule! https://events.linuxfoundation.org/cassandra-summit/program/schedule/
>
> Thanks, and I hope to see you there!
>
> Patrick
[DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment
Hello Cassandra Community, I've been faced with a number of inconsistencies in the user APIs of the internal data collections representation exposed through the Cassandra monitoring interfaces that need to be fully aligned from an operator perspective. First of all, I'm highlighting JMX, Dropwizard Metrics, and Virtual Tables user interfaces. In order to address all these inconsistencies, I have created a draft enhancement proposal that describes everything I have found and how we can fix it once and for all. I'd like to hear your opinion and thoughts on it. Please take a look: https://docs.google.com/document/d/1j4J3bPWjQkAU9x4G-zxKObxPrKg36jLRT6xpUoNJa8Q -- Maxim Muzafarov
Apache Cassandra 5.0 documentation
Greetings! I'm gearing up to help get the Cassandra 5.0 docs in good order before the GA release occurs later this year. Recently, I've been thinking about a more standardized organization to docs, to make it simpler for users to find what they are looking for, separate from searching. [That's the kind of thing docs nerds think about.] To that end, I've created a unified information architecture (IA) that can apply to any kind of documentation, including the Apache C* docs. Up front, I'll say, not every section of this organization applies to Apache C* docs, but reorganizing the docs to follow this pattern as much as possible will help users find what they need. I'd like your input into this IA that I've outlined. Please give me feedback about your opinions! If I can tackle this issue before launching into adding CEP features, working down the existing JIRA tickets for documentation, and backfilling missing items, it would be immensely helpful. No opinion will go unaddressed, so please take a few minutes to take a look. I'm linking a google doc, to make it easy for anyone to make comments: https://docs.google.com/document/d/1A96K73vj9MbJoD7wJNgIKWrOkLq-ZL2cNZAEXSWrciY/edit?usp=sharing I'm also drafting an Apache C* 5.0 Doc Plan for the work, to make it simple for anyone to know what is being done, and will share that next. In addition, I've started consolidating the current Documentation tickets that are open under the JIRA project, component "Documentation". Thanks, Lorina Poland