+1 for a quick revert from me. Preferably by the person whose commit is causing the issue, since I think it's healthy to encourage a stigma-free revert process and a sense of individual responsibility for the health of the pipeline, but theoretically by anyone, as long there's a concrete idea of who has ownership of the problem following the revert.
As far as the (perceived?) pain of reverting, my recent personal experience with reverting a troublesome commit has made me even more in favour of doing it. I hit an unanticipated test ordering issue in the Windows unit tests after merging a commit and thanks to good communication and coordination from people who were monitoring the pipeline was able to get it reverted within 2 hours. The fix was very straightforward, and I was tempted to leave the problem commit in place and just apply the fix on top of it, but by swiftly reverting the commit that was exposing the problem, I stopped several subsequent commits from other committers from getting blocked in the brief time it took to get the fix verified and merged. The extra work involved consisted of a few messages asking for PR approvals and 30 seconds in git, so in this instance, at least, it's difficult for me to see a justification in leaving the pipeline red while working on a fix. On Wed, Apr 29, 2020 at 4:14 PM Robert Houghton <rhough...@pivotal.io> wrote: > I am very pro-revert for breaking changes. The Benchmark or Windows tests > being a culprit is unfortunate, since the PR pipeline cannot tell us about > those, but thats the cost of our excellence :) > > On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onich...@pivotal.io> wrote: > > > There are many ways a commit can break the Geode CI pipeline[1] despite > > our required PR checks, such as: > > - merge a PR that failed some non-required PR checks > > - merge a PR that last ran checks awhile ago and now interacts > > unexpectedly with other recent changes > > - merge a PR that fails Windows tests > > - merge a PR that fails Benchmarks tests > > > > When a bad commit gets through for whatever reason, what should happen > > next? For example: > > - bring it up on the dev list > > - someone should revert the change as soon as possible, allowing the > > pipeline to remain green while the issue is addressed > > - a reasonable amount of time should be allowed for someone to make a > > quick fix, otherwise revert. > > - the pipeline should remain broken for as long as it takes to fix, as > > long as there is clear communication that someone is working on it > > - everyone look the other way and hope it fixes itself > > > > I’m sure there are other ideas as well. A related question is *who* can > > or should revert a bad commit? Only the person that merged the PR? Only > > the original author of the PR? The first person to notice the problem? > > > > What is a reasonable amount of time for this to happen? 2 hours? 2 days? > > 2 weeks? Does it depend whether the bad commit is also affecting PR > checks > > for everyone else vs only depriving downstream consumers of new Geode > > -SNAPSHOTs? > > > > Would you take offense if someone else reverted your commit? Does is > make > > a difference if your commit is exposing an existing issue (as opposed to > > introducing a new bug)? > > > > Is there a perception that reverts create a lot of extra work? (they > > shouldn’t--just start your rework PR with a revert of the revert, then > add > > additional commits that resolve the issue, so reviewers can easily see > what > > was missing the first time) > > > > This is a discussion thread, not a vote. We trust committers to do > what’s > > best. Would embracing a “anyone can revert, no shame” > > revert-first-then-fix culture benefit our community, or is our current > > easygoing approach working just fine? > > > > [1] > > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main >