Re: Merging CEP-15 to trunk

2023-01-25 Thread Benedict
> contributors who didn't actively work on Accord, have assumed that they
> will be invited to review now

I may have missed it, but I have not seen anyone propose to substantively
review the actual work, only the impact of rebasing. Which, honestly,
there is plenty of time to do - the impact is essentially nil, and we
aren’t planning to merge immediately. I will only not agree to an ad hoc
procedural change to prevent merge until this happens, as a matter of
principle.

However, I am very sympathetic to a desire to participate substantively.
I think interested parties should have invested as the work progressed, but
I don’t think it is unreasonable to ask for some time prior to merge if
this hasn’t happened.

So, if you can adequately resource it, we can delay merging a while
longer. I want your (constructive) participation. But, I have not seen
anything to suggest this is even proposed, let alone realistic.

There are currently five full time contributors participating in the
Accord project, with cumulatively several person-years of work already
accumulated. By the time even another month has passed, you will have
another five person-months of work to catch up on. Resourcing even a review
effort to catch up with this is non-trivial, and for it to be a
reasonable ask, you must credibly be able to keep up while making useful
contributions.

> After all, if it had been ready to merge to trunk already a year ago, why
> wasn't it?

The Cassandra integration has only existed since late last year, and was
not merged earlier to avoid interfering with the effort to release 4.1.

> One thing that I wanted to ask for is when you push to CI, you or whoever
> does it, to approve all jobs.

Thanks Ekaterina, we will be sure to fully qualify the CI result, and I
will make sure we also run your flaky test runner on the newly introduced
tests.

On 24 Jan 2023, at 21:42, Henrik Ingo wrote:

> Thanks Josh
>
> Since you mentioned the CEP process, I should also mention one sentiment
> you omitted, but worth stating explicitly:
>
> 4. The CEP itself should not be renegotiated at this point. However, the
> reviewers should rather focus on validating that the implementation matches
> the CEP. (Or if not, that the deviation is of a good reason and the
> reviewer agrees to approve it.)
>
> Although I'm not personally full time working on either producing
> Cassandra code or reviewing it, I'm going to spend one more email defending
> your point #1, because I think your proposal would lead to a lot of
> inefficiencies in the project, and that does happen to be my job to care
> about:
>
>  - Even if you could be right, from some point of view, it's nevertheless
> the case that those contributors who didn't actively work on Accord, have
> assumed that they will be invited to review now, when the code is about to
> land in trunk. Not allowing that to happen would make them feel like they
> weren't given the opportunity and that the process in Cassandra Project
> Governance was bypassed. We can agree to work differently in the future,
> but this is the reality now.
>
>  - Although those who have collaborated on Accord testify that the code is
> of the highest quality and ready to be merged to trunk, I don't think that
> can be expected of every feature branch all the time. In fact, I'm pretty
> sure the opposite must have been the case also for the Accord branch at
> some point. After all, if it had been ready to merge to trunk already a
> year ago, why wasn't it? It's kind of the point of using a feature branch
> that the code in it is NOT ready to be merged yet. (For example, the
> existing code might be of high quality, but the work is incomplete, so it
> shouldn't be merged to trunk.)
>
>  - Uncertainty: It's completely ok that some feature branches may be
> abandoned without ever merging to trunk. Requiring the community (anyone
> potentially interested, anyways) to review such code would obviously be a
> waste of precious talent.
>
>  - Time uncertainty: Also - and this is also true for Accord - it is
> unknown when the merge will happen if it does. In the case of Accord it is
> now over a year since the CEP was adopted. If I remember correctly an
> initial target date for some kind of milestone may have been Summer of
> 2022? Let's say someone in October 2021 was invested in the quality of
> Cassandra 4.1 release. Should this person now invest in reviewing Accord or
> not? It's impossible to know. Again, in hindsight we know that the answer
> is no, but your suggestion again would require the person to review all
> active feature branches just in case.
>
> As for 2 and 3, I certainly observe an assumption that contributors have
> expected to review after a rebase. But I don't see this as a significant
> topic to argue about. If indeed the rebase is as easy as Benedict
> advertised, then we should just do the rebase because apparently it can be
> done faster than it took me to write this email :-) (But yes, conversely,
> it seems then that the rebase is not a big reason to hold off from
> reviewing either.)
>
> henrik
>
> On Tue, Jan 24, 2023 at 9:29 PM Josh McKenzie

Re: [DISCUSSION] Cassandra's code style and source code analysis

2023-01-25 Thread Miklosovic, Stefan
Thank you Maxim for doing this.

It is nice to see this effort materialized in a PR.

I would wait until bigger chunks of work (like CEP-15) are committed to trunk,
so that we do not collide with them too much. I would say we can postpone this
until the last weeks before the actual 5.0 release, so we would not clash with
any work people would like to include in 5.0. This can go in anytime, basically.

Are people on the same page?

Regards


From: Maxim Muzafarov 
Sent: Monday, January 23, 2023 19:46
To: dev@cassandra.apache.org
Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis

Hello everyone,

You can find the changes here:
https://issues.apache.org/jira/browse/CASSANDRA-17925

While preparing the code style configuration for the Eclipse IDE, I
discovered that there was no easy way to have complex grouping options
for the set of packages. So we need to add extra blank lines between
each group of packages so that all the configurations for Eclipse,
NetBeans, IntelliJ IDEA and checkstyle are aligned. I should have
checked this earlier for sure, but I only did it for static imports
and some groups, my bad. The resultant configuration looks like this:

java.*
[blank line]
javax.*
[blank line]
com.*
[blank line]
net.*
[blank line]
org.*
[blank line]
org.apache.cassandra.*
[blank line]
all other imports
[blank line]
static all other imports
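
As a rough illustration of the grouping above, here is a minimal sketch that sorts a handful of import lines into those groups and separates the groups with blank lines. It is not code from the PR: `ImportGrouper`, `groupOf` and `organize` are names made up for this example, the ordering inside a group is plain `String` ordering (an assumption, the checkstyle rule may order differently), and it assumes Java 9+ for `List.of`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustration only: ImportGrouper is a hypothetical helper, not part of the PR.
final class ImportGrouper
{
    // Groups 0..4, in the order listed above.
    private static final String[] PREFIXES = { "java.", "javax.", "com.", "net.", "org." };
    private static final int CASSANDRA = 5; // org.apache.cassandra.*
    private static final int OTHER = 6;     // all other imports
    private static final int STATIC = 7;    // static all other imports

    // Returns the index of the group that an "import ...;" line belongs to.
    static int groupOf(String importLine)
    {
        String name = importLine.trim()
                                .replaceFirst("^import\\s+", "")
                                .replaceFirst(";$", "");
        if (name.startsWith("static "))
            return STATIC;
        // Checked before "org." so that the more specific prefix wins.
        if (name.startsWith("org.apache.cassandra."))
            return CASSANDRA;
        for (int i = 0; i < PREFIXES.length; i++)
            if (name.startsWith(PREFIXES[i]))
                return i;
        return OTHER;
    }

    // Sorts by group, then by plain String ordering, and inserts one empty
    // line between consecutive groups.
    static List<String> organize(List<String> imports)
    {
        List<String> sorted = new ArrayList<>(imports);
        sorted.sort(Comparator.<String>comparingInt(ImportGrouper::groupOf)
                              .thenComparing(Comparator.<String>naturalOrder()));
        List<String> out = new ArrayList<>();
        int previous = -1;
        for (String line : sorted)
        {
            int group = groupOf(line);
            if (previous != -1 && group != previous)
                out.add("");
            out.add(line);
            previous = group;
        }
        return out;
    }

    public static void main(String[] args)
    {
        List<String> imports = List.of("import org.apache.cassandra.db.Keyspace;",
                                       "import static org.junit.Assert.assertEquals;",
                                       "import org.slf4j.Logger;",
                                       "import java.util.List;",
                                       "import io.netty.buffer.ByteBuf;");
        organize(imports).forEach(System.out::println);
    }
}
```

Groups with no imports are simply skipped, so no double blank lines appear in the output.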

The pull request is here:
https://github.com/apache/cassandra/pull/2108

The configuration-related changes are placed in a dedicated commit, so
it should be easy to review:
https://github.com/apache/cassandra/pull/2108/commits/84e292ddc9671a0be76ceb9304b2b9a051c2d52a



Another important thing to mention is that the total amount of changes
for organising imports is really big (more than 2000 files!), so we
need to decide the right time to merge this PR. Although rebasing or
merging changes to development branches should become much easier
("Accept local" + "Organize imports"), we still need to pay extra
attention here to minimise the impact on major patches for the next
release.

On Mon, 16 Jan 2023 at 13:16, Maxim Muzafarov wrote:
>
> Stefan,
>
> Thank you for bringing this topic up. I'll prepare the PR shortly with
> option 4, so everyone can take a look at the amount of changes. This
> does not force us to go exactly this path, but it may shed light on
> changes in general.
>
> What exactly we're planning to do in the PR:
>
> 1. Checkstyle AvoidStarImport rule, so no star imports will be allowed.
> 2. Checkstyle ImportOrder rule, for controlling the order.
> 3. The IDE code style configuration for Intellij IDEA, NetBeans, and
> Eclipse (it doesn't exist for Eclipse yet).
> 4. The import order according to option 4:
>
> ```
> java.*
> javax.*
> [blank line]
> com.*
> net.*
> org.*
> [blank line]
> org.apache.cassandra.*
> [blank line]
> all other imports
> [blank line]
> static all other imports
> ```
>
>
>
> On Mon, 16 Jan 2023 at 12:39, Miklosovic, Stefan wrote:
> >
> > Based on the voting we should go with option 4?
> >
> > Two weeks passed without anybody joining so I guess folks are all happy 
> > with that or this just went unnoticed?
> >
> > Let's give it time until the end of this week (Friday 12:00 UTC).
> >
> > Regards
> >
> > 
> > From: Maxim Muzafarov 
> > Sent: Tuesday, January 3, 2023 14:31
> > To: dev@cassandra.apache.org
> > Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis
> >
> > Folks,
> >
> > Let me update the voting status and put together everything we have so
> > far. We definitely need more votes to have a solid foundation for this
> > change, so I encourage everyone to consider the options above and
> > share them in this thread.
> >
> >
> > Total for each applicable option:
> >
> > 4th option -- 4 votes
> > 3rd option -- 3 votes
> > 5th option -- 1 vote
> > 1st option -- 0 votes
> > 2nd option -- 0 votes
> >
> > On Thu, 22 Dec 2022 at 22:06, Mick Semb Wever wrote:
> > >>
> > >>
> > >> 3. Total 5 groups, 2968 files to change
> > >>
> > >> ```
> > >> org.apache.cassandra.*
> > >> [blank line]
> > >> java.*
> > >> [blank line]
> > >> javax.*
> > >> [blank line]
> > >> all other imports
> > >> [blank line]
> > >> static all other imports
> > >> ```
> > >
> > >
> > >
> > > 3, then 5.
> > > There's lots under com.*, net.*, org.* that is essentially the same as 
> > > "all other imports", what's the reason to separate those?
> > >
> > > My preference for 3 is simply that imports are by default collapsed, and 
> > > if I expand them it's the dependencies on ot

Re: Cassandra Summit update for 2023-01-24

2023-01-25 Thread Mick Semb Wever
> *To create a more neutral ground that reflects our community better, Linux
> Foundation Events has taken on the considerable task of running Cassandra
> Summit in 2023. We are very grateful they took a chance on our community,
> and we will be better for it.*
>
> …
>
> *Why is this important to mention? Our community needs an independent
> Cassandra Summit, and right now, it needs your support in attending the
> event. Let’s show the Linux Foundation that Cassandra Summit is something
> we value as a community. I know budgets are tight, and it’s hard to get
> approval. If you are able, make the case and register today.*


I particularly appreciate and am inspired by this. Well worded, thank you
Patrick!


Re: Merging CEP-15 to trunk

2023-01-25 Thread Henrik Ingo
Thanks Benedict

For brevity I'll respond to your email, although indirectly this is also a
continuation of my debate with Josh:

At least on my scorecard, one issue was raised regarding the actual code:
CASSANDRA-18193 Provide design and API documentation. Since the addition of
code comments also significantly impacts the ability of an outsider to
understand and review the code, I would then treat it as an unknown to say
how much else such a fresh review would uncover.

By the way I would say the discussion about git submodules (and all the
other alternatives) in a broad sense was also a review'ish comment.

That said, yes of course the expectation is that if the code has already
been reviewed, and by rather experienced Cassandra developers too, there
probably won't be many issues found, and there isn't a need for several
weeks of line by line re-review.

As for the rebase, I think that somehow started dominating this discussion,
but in my view was never the only reason. For me this is primarily to
satisfy points 4 and 5 in the project governance, that everyone has had an
opportunity to review the code, for whatever reason they may wish to do so.

I should say for those of us on the sidelines we certainly expected a
rebase catching up 6 months and ~500 commits to have more substantial
changes. Hearing that this is not the case is encouraging, as it also
suggests the changes to Cassandra code are less invasive than maybe I and
others had imagined.

henrik

On Wed, Jan 25, 2023 at 1:51 PM Benedict wrote:

> contributors who didn't actively work on Accord, have assumed that they
> will be invited to review now
>
>
> I may have missed it, but I have not seen anyone propose to substantively
> review the actual *work*, only the impact of rebasing. Which, honestly,
> there is plenty of time to do - the impact is essentially nil, and we
> aren’t planning to merge immediately. I will only not agree to an ad hoc
> procedural change to prevent merge until this happens, as a matter of
> principle.
>
> However, I am very sympathetic to a desire to participate *substantively*.
> I think interested parties should have invested as the work progressed, but
> I *don’t* think it is unreasonable to ask for *some* time prior to
> merge if this hasn’t happened.
>
> So, if you can adequately resource it, we can delay merging a while
> longer. I *want* your (constructive) participation. But, I have not seen
> anything to suggest this is even proposed, let alone realistic.
>
> There are currently five full time contributors participating in the
> Accord project, with cumulatively several person-years of work already
> accumulated. By the time even another month has passed, you will have
> another five person-months of work to catch up on. Resourcing even a review
> effort to catch up with this is *non-trivial*, and for it to be a
> reasonable ask, you must credibly be able to keep up while making useful
> contributions.
>
> After all, if it had been ready to merge to trunk already a year ago, why
> wasn't it?
>
>
> The Cassandra integration has only existed since late last year, and was
> not merged earlier to avoid interfering with the effort to release 4.1.
>
> One thing that I wanted to ask for is when you push to CI, you or whoever
> does it, to approve all jobs.
>
>
> Thanks Ekaterina, we will be sure to fully qualify the CI result, and I
> will make sure we also run your flaky test runner on the newly introduced
> tests.
>
>
>
>
> On 24 Jan 2023, at 21:42, Henrik Ingo wrote:
>
> 
> Thanks Josh
>
> Since you mentioned the CEP process, I should also mention one sentiment
> you omitted, but worth stating explicitly:
>
> 4. The CEP itself should not be renegotiated at this point. However, the
> reviewers should rather focus on validating that the implementation matches
> the CEP. (Or if not, that the deviation is of a good reason and the
> reviewer agrees to approve it.)
>
>
> Although I'm not personally full time working on either producing
> Cassandra code or reviewing it, I'm going to spend one more email defending
> your point #1, because I think your proposal would lead to a lot of
> inefficiencies in the project, and that does happen to be my job to care
> about:
>
>  - Even if you could be right, from some point of view, it's nevertheless
> the case that those contributors who didn't actively work on Accord, have
> assumed that they will be invited to review now, when the code is about to
> land in trunk. Not allowing that to happen would make them feel like they
> weren't given the opportunity and that the process in Cassandra Project
> Governance was bypassed. We can agree to work differently in the future,
> but this is the reality now.
>
>  - Although those who have collaborated on Accord testify that the code is
> of the highest quality and ready to be merged to trunk, I don't think that
> can be expected of every feature branch all the time. In fact, I'm pretty
> sure the opposite must have been the case also f

Re: Merging CEP-15 to trunk

2023-01-25 Thread Henrik Ingo
Hi Josh

I chose to mainly reply to Benedict's latest email as a reply to both of
you, but came back here only for a single higher level comment:

I'm not aware of the project history of such mega reviews, other than years
later, indirectly I have maybe felt the impact to quality that such large
commits (late in the release cycle...) typically have on a complex code
base. But sure, I can see how that could have happened and understand it
would then impact this discussion too.

Fundamentally we have two conflicting interests at play:
 - merging smaller incremental changes is preferable to large merges
 - merging incomplete work is usually a bad idea, it's better to work on a
feature branch until some kind of mvp or v1 level of functionality is met.

We'll just have to learn to balance both of these.

Related to the history though, one thing that has changed is the
introduction of CEPs. I actually expect this to make a huge difference
compared to historical traumatic experiences. Because at this point we
definitely do not need to discuss a) whether we want transactions in the
first place, or b) how they should (have) be(en) implemented.

Also, the fact that code going into the feature branch was already reviewed
with the same rigor as a trunk merge would be should of course make a big
difference too. So in summary, I'm optimistic that the processes we are
following today will work better than what was in the past. (If anything,
the CI worries me a lot more than the review process. If we had an
automated merge pipeline that would do the rebase-test-merge in an
automated and uncompromising way, I bet even this discussion would have
been more relaxed.)

henrik

On Wed, Jan 25, 2023 at 12:11 AM Josh McKenzie wrote:

> Cordial debate! <3
>
> - it's nevertheless the case that those contributors who didn't actively
> work on Accord, have assumed that they will be invited to review now, when
> the code is about to land in trunk. Not allowing that to happen would make
> them feel like they weren't given the opportunity and that the process in
> Cassandra Project Governance was bypassed. We can agree to work differently
> in the future, but this is the reality now.
>
> If this was a miscommunication in this instance, rectifying it will of
> course require compromise from all parties. Good learning for future
> engagement and hopefully the outcome of this discussion is clearer norms as
> a project so we don't end up with this miscommunication in the future.
>
> the code is of the highest quality and ready to be merged to trunk, I
> don't think that can be expected of every feature branch all the time
>
> I think this is something we can either choose to make a formal
> requirement for feature branches in ASF git (all code that goes in has 2
> committers hands on) or not. If folks want to work on other feature
> branches in other forks w/out this bar and then have a "mega review" at the
> end, I suppose that's their prerogative. Many of us that have been on the
> project for years have _significant emotional and psychological scars_ from
> that approach however, and multiple large efforts have failed at the
> "mega-review and merge" step. So I wouldn't advocate for that approach (and
> it's the only logical alternative I can think of to incremental bar of
> quality reinforcement throughout a work cycle on a large feature over time).
>
> if it had been ready to merge to trunk already a year ago, why wasn't it?
> It's kind of the point of using a feature branch that the code in it is NOT
> ready to be merged yet
>
> Right now we culturally tend to avoid merging code that doesn't do
> anything, for better or worse. We don't have a strong culture of either
> incremental merge in during development or of using the experimental flag
> for new features. Much of the tightly coupled nature of our codebase makes
> this a necessity for keeping velocity while working unfortunately. So in
> this case I would qualify that "it's not ready to be merged yet given our
> assumption that all code in the codebase should serve an active immediate
> purpose, not due to a lack of merge-level quality".
>
> The approach of "hold the same bar for merges into a feature branch as
> trunk" seems to be a compromise between Big Bang single commit drops and
> peppering trunk with a lot of "as yet dormant" incremental code as a large
> feature is built out. Not saying it's better or worse, just describing the
> contour of the tradeoffs as I see them.
>
> - Uncertainty: It's completely ok that some feature branches may be
> abandoned without ever merging to trunk. Requiring the community (anyone
> potentially interested, anyways) to review such code would obviously be a
> waste of precious talent.
>
> This is an excellent point. The only mitigation I'd see for this would be
> an additional review period or burden collectively before merge of a
> feature branch into trunk once something has crossed a threshold of success
> as to be included, or stepping aw

Re: Cassandra Summit update for 2023-01-24

2023-01-25 Thread C. Scott Andreas

Hugely excited for this – thanks to the Program Committee and to the Linux
Foundation for organizing!

It's been a long few years away from conferences and I can't wait to see all
of you.

Beyond learning about what everyone is doing with Apache Cassandra, I'm
looking forward to the hallway chats and discussion among smaller forums.

Discussions at ApacheCon in 2019 helped hash out a plan to stabilize and ship
4.0, and ApacheCon in 2022 led to a number of brainstorming sessions on how
much better we can make Cassandra as features like distributed transactions,
transactional metadata, storage-attached indexes, the BTI SSTable format, and
more fall into place.

If you or your company use Cassandra, the project's contributors would love
to learn more about how - and how we can make it better. This conference is
an awesome opportunity to catch up with contributors face to face. Always
happy to discuss on the list and via Slack too, but glad this year allows us
to resume events in person.

Hope to see you there!

– Scott

On Jan 24, 2023, at 4:54 PM, Patrick McFadin wrote:

> Hello Cassandra Community!
>
> Quick take:
> - Register before 1/28 to get discount pricing.
>   https://events.linuxfoundation.org/cassandra-summit/register/
> - Use code CS23DS20 to get 20% off
> - Make sure to sign up for the training day on March 12
> - Tell everyone you’re going on social media and use #CassandraSummit in
>   your posts
>
> Longer version:
>
> If you have been watching what’s happening with the Cassandra Summit and
> thinking about going, I’m here to convince you that now is the time to
> register. The early registration discount ends this Saturday, January 28th.
> It might be helpful to clarify some misconceptions I keep hearing. Every
> other Cassandra Summit (except Cassandra Summit Tokyo) has been an event
> planned and run by DataStax. To create a more neutral ground that reflects
> our community better, Linux Foundation Events has taken on the considerable
> task of running Cassandra Summit in 2023. We are very grateful they took a
> chance on our community, and we will be better for it.
>
> When DataStax ran the event, we could deeply discount tickets because we
> treated it as a marketing expense. I’ve been DMed and Slacked quite a few
> times for free passes. Since this is a Linux Foundation event,
> unfortunately, there are no complimentary passes, as this is a key part of
> recouping their costs. You can get a 20% discount by using this code:
> CS23DS20
>
> Why is this important to mention? Our community needs an independent
> Cassandra Summit, and right now, it needs your support in attending the
> event. Let’s show the Linux Foundation that Cassandra Summit is something
> we value as a community. I know budgets are tight, and it’s hard to get
> approval. If you are able, make the case and register today. Next year when
> there are thousands of attendees at Cassandra Summit, you can tell everyone
> what they missed in 2023. If making the trip isn’t something you can do, a
> virtual pass is only $30 with the discount code and is also a great way to
> show support.
>
> The other important thing you can help with? Getting out the word about
> Cassandra Summit. Tell your colleagues and co-workers that this is a hot
> tip and you are hooking them up. If you are going, tell everyone you’ve
> registered and use the hashtag #CassandraSummit. Point out sessions you are
> interested in and share the love. If you can convince a couple of people to
> go, you’ve made a difference. If you need a little more motivation, just
> look at this schedule!
> https://events.linuxfoundation.org/cassandra-summit/program/schedule/
>
> Thanks, and I hope to see you there!
>
> Patrick

[DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment

2023-01-25 Thread Maxim Muzafarov
Hello Cassandra Community,


I've been faced with a number of inconsistencies in the user APIs of
the internal data collections representation exposed through the
Cassandra monitoring interfaces that need to be fully aligned from an
operator perspective. In particular, I mean the JMX, Dropwizard
Metrics, and Virtual Tables user interfaces. In order to address all
these inconsistencies, I have created a draft enhancement proposal
that describes everything I have found and how we can fix it once and
for all.

I'd like to hear your opinion and thoughts on it. Please take a look:
https://docs.google.com/document/d/1j4J3bPWjQkAU9x4G-zxKObxPrKg36jLRT6xpUoNJa8Q


-- 
Maxim Muzafarov


Apache Cassandra 5.0 documentation

2023-01-25 Thread Lorina Poland
Greetings!

I'm gearing up to help get the Cassandra 5.0 docs in good order before the
GA release occurs later this year. Recently, I've been thinking about a
more standardized organization to docs, to make it simpler for users to
find what they are looking for, separate from searching. [That's the kind
of thing docs nerds think about.] To that end, I've created a unified
information architecture (IA) that can apply to any kind of documentation,
including the Apache C* docs.

Up front, I'll say, not every section of this organization applies to
Apache C* docs, but reorganizing the docs to follow this pattern as much as
possible will help users find what they need.

I'd like your input into this IA that I've outlined. Please give me
feedback about your opinions! If I can tackle this issue before launching
into adding CEP features, working down the existing JIRA tickets for
documentation, and backfilling missing items, it would be immensely
helpful. No opinion will go unaddressed, so please take a few minutes to
take a look.

I'm linking a google doc, to make it easy for anyone to make comments:
https://docs.google.com/document/d/1A96K73vj9MbJoD7wJNgIKWrOkLq-ZL2cNZAEXSWrciY/edit?usp=sharing

I'm also drafting an Apache C* 5.0 Doc Plan for the work, to make it simple
for anyone to know what is being done, and will share that next. In
addition, I've started consolidating the current Documentation tickets that
are open under the JIRA project, component "Documentation".

Thanks,
Lorina Poland