I think repeated tests shouldn’t be a blocker to merging for the reasons 
outlined below. A committer that is a good steward for the project should be 
allowed to make the judgement call when merging a PR. We have placed too many 
rigid processes in place that eliminated the good judgement of committers. If 
we think committers lack good judgment and need these rigid checks then perhaps 
we should be reconsidering who we allow as committers.

-Jake


> On Jun 8, 2021, at 9:33 AM, Kirk Lund <kl...@apache.org> wrote:
> 
> Our requirement for stress-new-test-openjdk11 to pass before allowing merge
> doesn't really work as intended for fixing distributed tests that contain
> multiple flaky test methods. In fact, I think it causes contributors to
> avoid tackling flaky tests.
> 
> I've been working on GEODE-9103: CI Failure:
> PutAllClientServerDistributedTest.testPutAllReturnsExceptions FAILED
> <https://issues.apache.org/jira/browse/GEODE-9103> and was able to fix it.
> 
> However, stress-new-test-openjdk11 then continued to fail for other flaky
> tests in PutAllClientServerDistributedTest. I managed to fix GEODE-9296 and
> GEODE-8528 as well. I also tried but have not been able to fix GEODE-9242
> which remains flaky.
> 
> Unfortunately, I cannot merge any of my fixes for
> PutAllClientServerDistributedTest unless every single flaky test in it is
> fixed by my PR. I think this is undesirable because it would be better to
> merge the fix for 3 flaky test methods than none.
> 
> UPDATE: After running my precheckin more times, I did get
> stress-new-test-openjdk11 to pass once so I can merge, but that's more of a
> loophole than anything because I didn't manage to fix GEODE-9242.
> 
> Despite having PR #6542 eventually pass, I would like to discuss removing
> or relaxing our requirement that stress-new-test-openjdk11 must pass in
> order to merge a PR...
> 
> PR #6542: GEODE-9103: Fix ServerConnectivityExceptions in
> PutAllClientServerDistributedTest
> <https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fgeode%2Fpull%2F6542&amp;data=04%7C01%7Cjabarrett%40vmware.com%7C010c8c9b76a84e44ee2f08d92a9b2a33%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C1%7C637587668189086895%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=GlEh8TaK8eQH69NcD8bUKYe7hTl9LCdgc%2BVxwDFeE6U%3D&amp;reserved=0>
> 
> Fixed in PR #6542:
> * GEODE-9296: CI Failure: PutAllClientServerDistributedTest >
> testPartialKeyInPRSingleHopWithRedundancy
> <https://issues.apache.org/jira/browse/GEODE-9296>
> * GEODE-9103: CI Failure:
> PutAllClientServerDistributedTest.testPutAllReturnsExceptions FAILED
> <https://issues.apache.org/jira/browse/GEODE-9103>
> * GEODE-8528: PutAllClientServerDistributedTest.testPartialKeyInPRSingleHop
> fails due to missing disk store after server restart
> <https://issues.apache.org/jira/browse/GEODE-8528>
> 
> Still flaky:
> * GEODE-9242: CI failure in PutAllClientServerDistributedTest >
> testEventIdOutOfOrderInPartitionRegionSingleHop
> <https://issues.apache.org/jira/browse/GEODE-9242>
> 
> Thanks,
> Kirk

Reply via email to