>
> I don’t think this means guaranteeing there are no failing tests (though
> ideally this would also happen), but about ensuring our best practices are
> followed for every merge. 4.0 took so long to release because of the amount
> of hidden work that was created by merging work that didn’t meet the
> standard for release.
>
Tests are sometimes considered flaky because they fail intermittently, but
the failure may not come from an inconsistent test implementation; it can
reveal a real problem in the production code. I have seen that in various
codebases, and I think it would be great if each such test (or test group)
were guaranteed to have a ticket, with some preliminary analysis done to
confirm it is just a test problem before releasing a new version.

> Historically we have also had significant pressure to backport features to
> earlier versions due to the cost and risk of upgrading. If we maintain
> broader version compatibility for upgrade, and reduce the risk of adopting
> newer versions, then this pressure is also reduced significantly. Though
> perhaps we will stick to our guns here anyway, as there seems to be renewed
> pressure to limit work in GA releases to bug fixes exclusively. It remains
> to be seen if this holds.

Are there any precise requirements for supported upgrade and downgrade paths?

Thanks
- - -- --- ----- -------- -------------
Jacek Lewandowski

On Sat, Oct 30, 2021 at 4:07 PM bened...@apache.org <bened...@apache.org> wrote:

> > How do we define what "releasable trunk" means?
>
> For me, the major criterion is ensuring that work is not merged that is
> known to require follow-up work, or could reasonably have been known to
> require follow-up work if better QA practices had been followed.
>
> So, a big part of this is ensuring we continue to exceed our targets for
> improved QA. For me this means trying to weave tools like Harry and the
> Simulator into our development workflow early on, but we’ll see how well
> these tools gain broader adoption. This also means a focus in general on
> possible negative effects of a change.
>
> I think we could do with producing guidance documentation for how to
> approach QA, where we can record our best practices and evolve them as we
> discover flaws or pitfalls, either for ergonomics or for bug discovery.
>
> > What are the benefits of having a releasable trunk as defined here?
>
> If we want to have any hope of meeting reasonable release cadences _and_
> the high project quality we expect today, then I think a ~shippable trunk
> policy is an absolute necessity.
>
> I don’t think this means guaranteeing there are no failing tests (though
> ideally this would also happen), but about ensuring our best practices are
> followed for every merge. 4.0 took so long to release because of the amount
> of hidden work that was created by merging work that didn’t meet the
> standard for release.
>
> Historically we have also had significant pressure to backport features to
> earlier versions due to the cost and risk of upgrading. If we maintain
> broader version compatibility for upgrade, and reduce the risk of adopting
> newer versions, then this pressure is also reduced significantly. Though
> perhaps we will stick to our guns here anyway, as there seems to be renewed
> pressure to limit work in GA releases to bug fixes exclusively. It remains
> to be seen if this holds.
>
> > What are the costs?
>
> I think the costs are quite low, perhaps even negative. Hidden work
> produced by merges that break things can be much more costly than getting
> the work right first time, as attribution is much more challenging.
>
> One cost that is created, however, is for version compatibility as we
> cannot say “well, this is a minor version bump so we don’t need to support
> downgrade”. But I think we should be investing in this anyway for operator
> simplicity and confidence, so I actually see this as a benefit as well.
>
> > Full disclosure: running face-first into 60+ failing tests on trunk
>
> I have to apologise here. CircleCI did not uncover these problems,
> apparently due to some way it resolves dependencies, and so I am
> responsible for a significant number of these and have been quite sick
> since.
>
> I think a push to eliminate flaky tests will probably help here in future,
> though, and perhaps the project needs to have some (low) threshold of flaky
> or failing tests at which point we block merges to force a correction.
>
>
> From: Joshua McKenzie <jmcken...@apache.org>
> Date: Saturday, 30 October 2021 at 14:00
> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> Subject: [DISCUSS] Releasable trunk and quality
> We as a project have gone back and forth on the topic of quality and the
> notion of a releasable trunk for quite a few years. If people are
> interested, I'd like to rekindle this discussion a bit and see if we're
> happy with where we are as a project or if we think there are steps we
> should take to change the quality bar going forward. The following questions
> have been rattling around for me for a while:
>
> 1. How do we define what "releasable trunk" means? All reviewed by M
> committers? Passing N% of tests? Passing all tests plus some other metrics
> (manual testing, raising the number of reviewers, test coverage, usage in
> dev or QA environments, etc)? Something else entirely?
>
> 2. With a definition settled upon in #1, what steps, if any, do we need to
> take to get from where we are to having *and keeping* that releasable
> trunk? Anything to codify there?
>
> 3. What are the benefits of having a releasable trunk as defined here? What
> are the costs? Is it worth pursuing? What are the alternatives (for
> instance: a freeze before a release + stabilization focus by the community
> i.e. 4.0 push or the tock in tick-tock)?
>
> Given the large volumes of work coming down the pike with CEPs, this seems
> like a good time to at least check in on this topic as a community.
>
> Full disclosure: running face-first into 60+ failing tests on trunk when
> going through the commit process for denylisting this week brought this
> topic back up for me (reminds me of when I went to merge CDC back in 3.6
> and those test failures riled me up... I sense a pattern ;))
>
> Looking forward to hearing what people think.
>
> ~Josh
>