Hi Mark, I’ve noticed some voluntarily quick reverts, which is awesome, but other times CI has stayed broken for days, so it doesn’t feel like we’re all on the same page. I cannot find anything in the wiki or dev list archives to suggest this was “settled” previously, but please share a link if I missed something. Consensus that "quick reverts are good” sounds nice, but the details of who can notice and initiate the revert make a difference (do we really expect every contributor to stare at CI for the next 10 hours after their PR is merged?).
> On Apr 29, 2020, at 8:12 PM, Mark Hanson <mhan...@pivotal.io> wrote: > > As far as I am aware, I think this is already a settled decision. The > decision was quick revert. > In almost every case the build is more important than the change. > > Thanks, > Mark > >> On Apr 29, 2020, at 4:14 PM, Robert Houghton <rhough...@pivotal.io> wrote: >> >> I am very pro-revert for breaking changes. The Benchmark or Windows tests >> being a culprit is unfortunate, since the PR pipeline cannot tell us about >> those, but thats the cost of our excellence :) >> >> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onich...@pivotal.io >> <mailto:onich...@pivotal.io>> wrote: >> >>> There are many ways a commit can break the Geode CI pipeline[1] despite >>> our required PR checks, such as: >>> - merge a PR that failed some non-required PR checks >>> - merge a PR that last ran checks awhile ago and now interacts >>> unexpectedly with other recent changes >>> - merge a PR that fails Windows tests >>> - merge a PR that fails Benchmarks tests >>> >>> When a bad commit gets through for whatever reason, what should happen >>> next? For example: >>> - bring it up on the dev list >>> - someone should revert the change as soon as possible, allowing the >>> pipeline to remain green while the issue is addressed >>> - a reasonable amount of time should be allowed for someone to make a >>> quick fix, otherwise revert. >>> - the pipeline should remain broken for as long as it takes to fix, as >>> long as there is clear communication that someone is working on it >>> - everyone look the other way and hope it fixes itself >>> >>> I’m sure there are other ideas as well. A related question is *who* can >>> or should revert a bad commit? Only the person that merged the PR? Only >>> the original author of the PR? The first person to notice the problem? >>> >>> What is a reasonable amount of time for this to happen? 2 hours? 2 days? >>> 2 weeks? Does it depend whether the bad commit is also affecting PR checks >>> for everyone else vs only depriving downstream consumers of new Geode >>> -SNAPSHOTs? >>> >>> Would you take offense if someone else reverted your commit? Does is make >>> a difference if your commit is exposing an existing issue (as opposed to >>> introducing a new bug)? >>> >>> Is there a perception that reverts create a lot of extra work? (they >>> shouldn’t--just start your rework PR with a revert of the revert, then add >>> additional commits that resolve the issue, so reviewers can easily see what >>> was missing the first time) >>> >>> This is a discussion thread, not a vote. We trust committers to do what’s >>> best. Would embracing a “anyone can revert, no shame” >>> revert-first-then-fix culture benefit our community, or is our current >>> easygoing approach working just fine? >>> >>> [1] >>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&reserved=0 >>> >>> <https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&reserved=0>