I want the check to stay required for PR merges to be allowed. I also want
it to do real work in all cases. If I can add whitespace to 25 test classes
and get a free pass, thats a big testing hole.
Remove the limit. Worst case, we have to A) bump timeouts in the future B)
put the limits back in C) come up with a new paradigm for verifying test
quality in a non-deterministic process
On Tue, Mar 10, 2020 at 8:58 PM Dan Smith wrote:
> I checked, and 1% of our last 500 commits touched more than 30 tests. Of
> those 1%, over half touched more than 100 tests. So I'd guess somewhere
> around .5% of prs will fall under the proposed changes.
>
> My preference would be to just raise the threshold given that we already
> raised the timeout. That'll catch 99% of prs without making test
> refactoring and cleanups that touch many tests hard.
>
> But, I'm ok with removing the limit if people really feel that a manual
> override process is better for that 1%.
>
> -Dan
> On Tue, Mar 10, 2020, 4:33 PM Owen Nichols wrote:
>
> > To recap, the original question was: should we change StressNewTest to
> > pass ONLY if it is able to successfully run all new/changed tests 50
> times
> > (eliminating the existing loophole exempting PRs that touch 25 or more
> > files)?
> >
> > So far a handful of people have expressed support, but many have
> > questions/concerns:
> > - One concern was mis-counting of test files touched. As of today this
> is
> > now fixed.
> > - A big concern was that it might become more difficult to pass all
> > required PR checks.
> > - Another concern was that the current timeout of 6 hours might be too
> > high / too low.
> > - An alternative was suggested to keep the loophole but maybe increase
> the
> > threshold required to get the free pass.
> > - If we’re going to raise the bar on any required PR check, maybe we
> > should consider making it non-required.
> >
> > Let’s extend the comment period until the end of next week (Friday Mar
> 20)
> > in hopes of converging on a solution everybody is happy with (even if it
> > isn’t what was originally proposed).
> >
> >
> > > On Mar 6, 2020, at 4:51 PM, Donal Evans wrote:
> > >
> > > With fairly apt timing, we've recently had a PR merged (
> > > https://github.com/apache/geode/pull/4745) that introduced a test that
> > has
> > > resulted in fairly consistently flaky failures in the pipeline (JIRA
> > > ticket:
> https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843
> > ).
> > > The PR was quite large and touched/created a lot of tests, so
> > StressNewTest
> > > never ran on it:
> > > https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given
> > how
> > > frequently the test is failing in the pipeline (11 failures on various
> > > IntegrationTest jobs over the past 6 commits), it seems very likely
> that
> > > had StressNewTest been allowed to run, this issue would have been
> caught
> > at
> > > the PR stage and could have been remedied then, instead of resulting
> in a
> > > nearly constantly red pipeline.
> > >
> > > I've already given my +1 to this proposal, but this situation has
> > prompted
> > > me to make it a +2.
> > >
> > > On Fri, Mar 6, 2020 at 1:46 PM Donal Evans wrote:
> > >
> > >> With fairly apt timing, we've recently had a PR merged (
> > >> https://github.com/apache/geode/pull/4745) that introduced a test
> that
> > >> has resulted in fairly consistently flaky failures in the pipeline
> (JIRA
> > >> ticket:
> https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843
> > ).
> > >> The PR was quite large and touched/created a lot of tests, so
> > StressNewTest
> > >> never ran on it:
> > >> https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2.
> Given
> > >> how frequently the test is failing in the pipeline (11 failures on
> > various
> > >> IntegrationTest jobs over the past 6 commits), it seems very likely
> that
> > >> had StressNewTest been allowed to run, this issue would have been
> > caught at
> > >> the PR stage and could have been remedied then, instead of resulting
> in
> > a
> > >> nearly constantly red pipeline.
> > >>
> > >> I've already given my +1 to this proposal, but this situation has
> > prompted
> > >> me to make it a *+1*.
> > >>
> > >> On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade >
> > >> wrote:
> > >>
> > >>> The stress test is to identify the flaky-ness within the test; and
> > >>> assuming
> > >>> any changes to the test may have introduced the flaky-ness.
> > >>> It's about paying the cost upfront or later when the test is
> > determined to
> > >>> be flaky.
> > >>> If 25+ tests have been changed in a PR, the cost of running stress
> test
> > >>> for
> > >>> those; and gating the PR for so long.
> > >>> Knowing how much pain it causes to fix a flaky test after a
> > certain/long
> > >>> duration of time; I am +1 for doing this change.
> > >>>
> > >>> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith wrote:
> > >>>
> > What's the current timeout for StressNewTest? Mayb