Re: Merging CEP-15 to trunk

Henrik Ingo Wed, 25 Jan 2023 07:23:08 -0800

Hi Josh

I chose to mainly reply to Benedict's latest email as a reply to both of
you, but came back here only for a single higher level comment:


I'm not aware of the project history of such mega reviews, other than years
later, indirectly I have maybe felt the impact to quality that such large
commits (late in the release cycle...) typically have on a complex code
base. But sure, I can see how that could have happened and understand it
would then impact this discussion too.

Fundamentally we have two conflicting interests at play:
 - merging smaller incremental changes is preferable to large merges
 - merging incomplete work is usually a bad idea, it's better to work on a
feature branch until some kind of mvp or v1 level of functionality is met.

We'll just have to learn to balance both of these.

Related to the history though, one thing that has changed is the
introduction of CEPs. I actually expect this to make a huge difference
compared to historical traumatic experiences. Because at this point we
definitely do not need to discuss a) whether we want transactions in the
first place, or b) how they should (have) be(en)  implemented.

Also, the fact that code going into the feature branch was already reviewed
with the same rigor as a trunk merge would, should of course make a big
difference too. So in summary, I'm optimistic that the processes we are
following today will work better than what was in the past. (If anything,
the CI worries me a lot more than the review process. If we had an
automated merge pipeline that would do the rebase-test-merge in an
automated and uncompromising way, I bet even this discussion would have
been more relaxed.)

henrik

On Wed, Jan 25, 2023 at 12:11 AM Josh McKenzie <jmcken...@apache.org> wrote:

> Cordial debate! <3
>
> - it's nevertheless the case that those contributors who didn't actively
> work on Accord, have assumed that they will be invited to review now, when
> the code is about to land in trunk. Not allowing that to happen would make
> them feel like they weren't given the opportunity and that the process in
> Cassandra Project Governance was bypassed. We can agree to work differently
> in the future, but this is the reality now.
>
> If this was a miscommunication on this instance rectifying it will of
> course require compromise from all parties. Good learning for future
> engagement and hopefully the outcome of this discussion is clearer norms as
> a project so we don't end up with this miscommunication in the future.
>
> the code is of the highest quality and ready to be merged to trunk, I
> don't think that can be expected of every feature branch all the time
>
> I think this is something we can either choose to make a formal
> requirement for feature branches in ASF git (all code that goes in has 2
> committers hands on) or not. If folks want to work on other feature
> branches in other forks w/out this bar and then have a "mega review" at the
> end, I suppose that's their prerogative. Many of us that have been on the
> project for years have _significant emotional and psychological scars_ from
> that approach however, and multiple large efforts have failed at the
> "mega-review and merge" step. So I wouldn't advocate for that approach (and
> it's the only logical alternative I can think of to incremental bar of
> quality reinforcement throughout a work cycle on a large feature over time).
>
> if it had been ready to merge to trunk already a year ago, why wasn't it?
> It's kind of the point of using a feature branch that the code in it is NOT
> ready to be merged yet
>
> Right now we culturally tend to avoid merging code that doesn't do
> anything, for better or worse. We don't have a strong culture of either
> incremental merge in during development or of using the experimental flag
> for new features. Much of the tightly coupled nature of our codebase makes
> this a necessity for keeping velocity while working unfortunately. So in
> this case I would qualify that "it's not ready to be merged yet given our
> assumption that all code in the codebase should serve an active immediate
> purpose, not due to a lack of merge-level quality".
>
> The approach of "hold the same bar for merges into a feature branch as
> trunk" seems to be a compromise between Big Bang single commit drops and
> peppering trunk with a lot of "as yet dormant" incremental code as a large
> feature is built out. Not saying it's better or worse, just describing the
> contour of the tradeoffs as I see them.
>
> - Uncertainty: It's completely ok that some feature branches may be
> abandoned without ever merging to trunk. Requiring the community (anyone
> potentially interested, anyways) to review such code would obviously be a
> waste of precious talent.
>
> This is an excellent point. The only mitigation I'd see for this would be
> an additional review period or burden collectively before merge of a
> feature branch into trunk once something has crossed a threshold of success
> as to be included, or stepping away from a project where you don't have the
> cycles to stay up to date and review and trust that the other committers
> working on the project are making choices that are palatable and acceptable
> to you.
>
> If all API decisions hit the dev ML and the architecture conforms
> generally to the specification of the CEP, it seems to me that stepping
> back and trusting your fellow committers to Do The Right Thing is the
> optimal (and scalable) approach here?
>
> Let's say someone in October 2021 was invested in the quality of Cassandra
> 4.1 release. Should this person now invest in reviewing Accord or not? It's
> impossible to know. Again, in hindsight we know that the answer is no, but
> your suggestion again would require the person to review all active feature
> branches just in case.
>
> I'd argue that there's 3 times to really invest in the quality of any
> Cassandra release:
> 1. When we set agreed upon bars for quality we'll all hold ourselves
> accountable to (CI, code style, test coverage, etc)
> 2. When we raise new committers
> 3. When we write or review code
>
> I don't think it's sustainable to try and build processes that will
> bottleneck our throughput as a community to the finite availability of
> individuals if they're concerned about specific topics and want to be
> individually involved in specific code level changes. If folks feel like
> our current processes, CI infrastructure, and committer pool risks
> maintaining our bar of quality that's definitely something we should talk
> about in depth, as in my mind that's the backbone of us scaling stably as a
> project community.
>
>
> On Tue, Jan 24, 2023, at 4:41 PM, Henrik Ingo wrote:
>
> Thanks Josh
>
> Since you mentioned the CEP process, I should also mention one sentiment
> you omitted, but worth stating explicitly:
>
> 4. The CEP itself should not be renegotiated at this point. However, the
> reviewers should rather focus on validating that the implementation matches
> the CEP. (Or if not, that the deviation is of a good reason and the
> reviewer agrees to approve it.)
>
>
> Although I'm not personally full time working on either producing
> Cassandra code or reviewing it, I'm going to spend one more email defending
> your point #1, because I think your proposal would lead to a lot of
> inefficiencies in the project, and that does happen to be my job to care
> about:
>
>  - Even if you could be right, from some point of view, it's nevertheless
> the case that those contributors who didn't actively work on Accord, have
> assumed that they will be invited to review now, when the code is about to
> land in trunk. Not allowing that to happen would make them feel like they
> weren't given the opportunity and that the process in Cassandra Project
> Governance was bypassed. We can agree to work differently in the future,
> but this is the reality now.
>
>  - Although those who have collaborated on Accord testify that the code is
> of the highest quality and ready to be merged to trunk, I don't think that
> can be expected of every feature branch all the time. In fact, I'm pretty
> sure the opposite must have been the case also for the Accord branch at
> some point. After all, if it had been ready to merge to trunk already a
> year ago, why wasn't it? It's kind of the point of using a feature branch
> that the code in it is NOT ready to be merged yet. (For example, the
> existing code might be of high quality, but the work is incomplete, so it
> shouldn't be merged to trunk.)
>
>  - Uncertainty: It's completely ok that some feature branches may be
> abandoned without ever merging to trunk. Requiring the community (anyone
> potentially interested, anyways) to review such code would obviously be a
> waste of precious talent.
>
>  - Time uncertainty: Also - and this is also true for Accord - it is
> unknown when the merge will happen if it does. In the case of Accord it is
> now over a year since the CEP was adopted. If I remember correctly an
> initial target date for some kind of milestone may have been Summer of
> 2022? Let's say someone in October 2021 was invested in the quality of
> Cassandra 4.1 release. Should this person now invest in reviewing Accord or
> not? It's impossible to know. Again, in hindsight we know that the answer
> is no, but your suggestion again would require the person to review all
> active feature branches just in case.
>
>
> As for 2 and 3, I certainly observe an assumption that contributors have
> expected to review after a rebase. But I don't see this as a significant
> topic to argue about. If indeed the rebase is as easy as Benedict
> advertised, then we should just do the rebase because apparently it can be
> done faster than it took me to write this email :-) (But yes, conversely,
> it seems then that the rebase is not a big reason to hold off from
> reviewing either.)
>
> henrik
>
>
> On Tue, Jan 24, 2023 at 9:29 PM Josh McKenzie <jmcken...@apache.org>
> wrote:
>
>
> Zooming out a bit, I think Accord is the first large body of work we've
> done post introduction of the CEP system with multiple people collaborating
> on a feature branch like this. This discussion seems to have surfaced a few
> sentiments:
>
> 1. Some contributors seem to feel that work on a feature branch doesn't
> have the same inherent visibility as work on trunk
> 2. There's a lack of clarity on our review process when it comes to
> significant (in either time or size) rebases
> 3. We might be treating Ninja commits a bit differently on a feature
> branch than trunk, which intersects with 1 and 2
>
> My personal opinions are:
> I disagree with 1 - it simply takes the added effort of actively following
> that branch and respective JIRAs if you're interested. I think having a
> feature branch in the ASF git repo w/commits and JIRAs tracking that work
> is perfectly fine, and the existing bar (2 committers +1, green tests
> before merge to trunk) when applied to a feature branch is also not just
> well within the "letter of the law" on the project but also logically
> sufficient to retain our bar of quality and stability.
>
> For 2 (reviews required after rebase) I don't think we should
> over-prescribe process on this. If all tests are green pre-rebase, and all
> tests are green post-rebase, and a committer is confident they didn't
> materially modify the functioning of the logical flow or data structures of
> their code during a rebase, I don't see there being any value added by
> adding another review based on those grounds.
>
> If the subtext is actually that some folks feel we need a discussion about
> whether we should have a different bar for review on CEP feature branches
> (3 committers? 1+ pmc members? more diversity in reviewers or committers as
> measured by some as yet unspoken metric), perhaps we could have that
> discussion. FWIW I'm against changes there as well; we all wear our Apache
> Hats here, and if the debate is between work like this happening in a
> feature branch affording contributors increased efficiency and locality vs.
> all that happening on trunk and repeatedly colliding with everyone
> everywhere, feature branches are a clear win IMO.
>
> And for 3 - I think we've all broadly agreed we shouldn't ninja commit
> unless it's a comment fix, typo, forgotten git add, or something along
> those lines. For any commit that doesn't qualify it should go through the
> review process.
>
> And a final note - Ekaterina alluded to something valuable in her email
> earlier in this thread. I think having a "confirm green on all the test
> suites that are green on merge target" bar for large feature branches
> (rather than strictly the "pre-commit subset") before merge makes a lot of
> sense.
>
> On Tue, Jan 24, 2023, at 1:41 PM, Caleb Rackliffe wrote:
>
> Just FYI, I'm going to be posting a Jira (which will have some
> dependencies as well) to track this merge, hopefully some time today...
>
> On Tue, Jan 24, 2023 at 12:26 PM Ekaterina Dimitrova <
> e.dimitr...@gmail.com> wrote:
>
> I actually see people all the time making a final check before merge as
> part of the review. And I personally see it only as a benefit when it comes
> to serious things like Accord, as an example. Even as a help for the author
> who is overwhelmed with the big amount of work already done - someone to do
> quick last round of review. Team work after all.
>
> Easy rebase - those are great news. I guess any merge conflicts that were
> solved will be documented and confirmed with reviewers before merge on the
> ticket where the final CI push will be posted. I also assumed that even
> without direct conflicts a check that there is no contradiction with any
> post-September commits is done as part of the rebase. (Just adding here for
> completeness)
>
> One thing that I wanted to ask for is when you push to CI, you or whoever
> does it, to approve all jobs. Currently we have pre-approved the minimum
> required jobs in the pre-commit workflow. I think in this case with a big
> work approving all jobs in CircleCI will be of benefit. (I also do it for
> bigger bodies of work to be on the safe side) Just pointing in case you use
> a script or something to push only the pre-approved ones. Please ping me in
> Slack if It’s not clear what I mean, happy to help with that
>
> On Tue, 24 Jan 2023 at 11:52, Benedict <bened...@apache.org> wrote:
>
>
> Perhaps the disconnect is that folk assume a rebase will be difficult and
> have many conflicts?
>
> We have introduced primarily new code with minimal integration points, so
> I decided to test this. I managed to rebase locally in around five minutes;
> mostly imports. This is less work than for a rebase of fairly typical
> ticket of average complexity.
>
> Green CI is of course a requirement. There is, however, no good procedural
> or technical justification for a special review of the rebase.
>
> Mick is encouraged to take a look at the code before and after rebase, and
> will be afforded plenty of time to do so. But I will not gate merge on this
> adhoc requirement.
>
>
>
>
> On 24 Jan 2023, at 15:40, Ekaterina Dimitrova <e.dimitr...@gmail.com>
> wrote:
>
> 
>
> Hi everyone,
> I am excited to see this work merged. I noticed the branch is 395 commits
> behind trunk or not rebased since September last year. I think if Mick or
> anyone else wants to make a final pass after rebase happens and CI runs -
> this work can only benefit of that. Squash, rebase and full CI run green is
> the minimum that, if I read correctly the thread, we all agree on that
> part.
> I would say that CI and final check after a long rebase of a patch is a
> thing we actually do all the time even for small patches when we get back
> to our backlog of old patches. This doesn’t mean that the previous reviews
> are dismissed or people not trusted or anything like that.
> But considering the size and the importance of this work, I can really see
> only benefit of a final cross-check.
> As Henrik mentioned me, I am not sure I will have the chance to review
> this work any time soon (just setting the right expectations up front) but
> I see at least Mick already mentioning he would do it if there are no other
> volunteers. Now, whether it will be separate ticket or not, that is a
> different story. Aren’t the Accord tickets in an epic under which we can
> document the final rebase, CI runs, etc?
>
> On Tue, 24 Jan 2023 at 9:40, Henrik Ingo <henrik.i...@datastax.com> wrote:
>
> When was the last time the feature branch was rebased? Assuming it's a
> while back and the delta is significant, surely it's normal process to
> first rebase, run tests, and then present the branch for review?
>
> To answer your question: The effect of the rebase is then either baked
> into the original commits (which I personally dislike), or you can also
> have the rebase-induced changes as their own commits. (Which can get
> tedious, but has the benefit of making explicit what was only a change due
> to rebasing.) Depending on which approach you take when rebasing, a
> reviewer would then review accordingly.
>
> henrik
>
> On Tue, Jan 24, 2023 at 11:14 AM Benedict <bened...@apache.org> wrote:
>
>
> No, that is not the normal process. What is it you think you would be
> reviewing? There are no diffs produced as part of rebasing, and the purpose
> of review is to ensure code meets out standards, not that the committer is
> competent at rebasing or squashing. Nor are you familiar with the code as
> it was originally reviewed, so would have nothing to compare against. We
> expect a clean CI run, ordinarily, not an additional round of review. If we
> were to expect that, it would be by the original reviewer, not a third
> party, as they are the only ones able to judge the rebase efficiently.
>
> I would support investigating tooling to support reviewing rebases. I’m
> sure such tools and processes exist. But, we don’t have them today and it
> is not a normal part of the review process. If you want to modify, clarify
> or otherwise stipulate new standards or processes, I suggest a separate
> thread.
>
> > How will the existing tickets make it clear when and where their final
> merge happened?
>
> By setting the release version and source control fields.
>
>
>
>
> On 24 Jan 2023, at 08:43, Mick Semb Wever <m...@apache.org> wrote:
>
> 
>
> .... But it's not merge-than-review, because they've already been
> reviewed, before being merged to the feature branch, by committers
> (actually PMC members)?
>
> You want code that's been written by one PMC member and reviewed by 2
> other PMC members to be put up for review by some random 4th party? For how
> long?
>
>
>
> It is my hope that the work as-is is not being merged. That there is a
> rebase and some trivial squashing to do. That deserves a quick check by
> another. Ideally this would be one of the existing reviewers (but like any
> other review step, no matter how short and trivial it is, that's still an
> open process). I see others already doing this when rebasing larger patches
> before the final merge.
>
> Will the branch be rebased and cleaned up?
> How will the existing tickets make it clear when and where their final
> merge happened?
>
>
>
>
> --
>
>
>
> *Henrik Ingo*
>
> *c*. +358 40 569 7354
>
> *w*. *www.datastax.com <http://www.datastax.com>*
>
>
> <https://urldefense.com/v3/__https://www.facebook.com/datastax__;!!PbtH5S7Ebw!ep4b-HH4HnBcGPT32sQbstaimEP5eIigJGvIpXgHKHxWq4uyqmNUiaz6DwjozGhRlQX9M2F7yZrdLA1y1UUJDw$>
> <https://twitter.com/datastax>
> <https://urldefense.com/v3/__https://www.linkedin.com/company/datastax/__;!!PbtH5S7Ebw!ep4b-HH4HnBcGPT32sQbstaimEP5eIigJGvIpXgHKHxWq4uyqmNUiaz6DwjozGhRlQX9M2F7yZrdLA2L93GKGQ$>
> <https://github.com/datastax/>
>
>
>
>
> --
>
>
>
> *Henrik Ingo*
>
> *c*. +358 40 569 7354
>
> *w*. *www.datastax.com <http://www.datastax.com>*
>
>
> <https://urldefense.com/v3/__https://www.facebook.com/datastax__;!!PbtH5S7Ebw!dJWV_AzDZ0ky_oUtbzoCx31HydV3PrJNZ4wuZp75nRa0Os5PAwgWCb5LbATBgr9xmDXxYQwWGYLDx6YqMyfa5w$>
> <https://twitter.com/datastax>
> <https://urldefense.com/v3/__https://www.linkedin.com/company/datastax/__;!!PbtH5S7Ebw!dJWV_AzDZ0ky_oUtbzoCx31HydV3PrJNZ4wuZp75nRa0Os5PAwgWCb5LbATBgr9xmDXxYQwWGYLDx6YYSDIOAw$>
> <https://github.com/datastax/>
>
>
>

-- 

Henrik Ingo

c. +358 40 569 7354

w. www.datastax.com

<https://www.facebook.com/datastax>  <https://twitter.com/datastax>
<https://www.linkedin.com/company/datastax/>  <https://github.com/datastax/>

Re: Merging CEP-15 to trunk

Reply via email to