I want the check to stay required for PR merges to be allowed. I also want it to do real work in all cases. If I can add whitespace to 25 test classes and get a free pass, thats a big testing hole.
Remove the limit. Worst case, we have to A) bump timeouts in the future B) put the limits back in C) come up with a new paradigm for verifying test quality in a non-deterministic process On Tue, Mar 10, 2020 at 8:58 PM Dan Smith <dsm...@pivotal.io> wrote: > I checked, and 1% of our last 500 commits touched more than 30 tests. Of > those 1%, over half touched more than 100 tests. So I'd guess somewhere > around .5% of prs will fall under the proposed changes. > > My preference would be to just raise the threshold given that we already > raised the timeout. That'll catch 99% of prs without making test > refactoring and cleanups that touch many tests hard. > > But, I'm ok with removing the limit if people really feel that a manual > override process is better for that 1%. > > -Dan > On Tue, Mar 10, 2020, 4:33 PM Owen Nichols <onich...@pivotal.io> wrote: > > > To recap, the original question was: should we change StressNewTest to > > pass ONLY if it is able to successfully run all new/changed tests 50 > times > > (eliminating the existing loophole exempting PRs that touch 25 or more > > files)? > > > > So far a handful of people have expressed support, but many have > > questions/concerns: > > - One concern was mis-counting of test files touched. As of today this > is > > now fixed. > > - A big concern was that it might become more difficult to pass all > > required PR checks. > > - Another concern was that the current timeout of 6 hours might be too > > high / too low. > > - An alternative was suggested to keep the loophole but maybe increase > the > > threshold required to get the free pass. > > - If we’re going to raise the bar on any required PR check, maybe we > > should consider making it non-required. > > > > Let’s extend the comment period until the end of next week (Friday Mar > 20) > > in hopes of converging on a solution everybody is happy with (even if it > > isn’t what was originally proposed). > > > > > > > On Mar 6, 2020, at 4:51 PM, Donal Evans <doev...@pivotal.io> wrote: > > > > > > With fairly apt timing, we've recently had a PR merged ( > > > https://github.com/apache/geode/pull/4745) that introduced a test that > > has > > > resulted in fairly consistently flaky failures in the pipeline (JIRA > > > ticket: > https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843 > > ). > > > The PR was quite large and touched/created a lot of tests, so > > StressNewTest > > > never ran on it: > > > https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given > > how > > > frequently the test is failing in the pipeline (11 failures on various > > > IntegrationTest jobs over the past 6 commits), it seems very likely > that > > > had StressNewTest been allowed to run, this issue would have been > caught > > at > > > the PR stage and could have been remedied then, instead of resulting > in a > > > nearly constantly red pipeline. > > > > > > I've already given my +1 to this proposal, but this situation has > > prompted > > > me to make it a +2. > > > > > > On Fri, Mar 6, 2020 at 1:46 PM Donal Evans <doev...@pivotal.io> wrote: > > > > > >> With fairly apt timing, we've recently had a PR merged ( > > >> https://github.com/apache/geode/pull/4745) that introduced a test > that > > >> has resulted in fairly consistently flaky failures in the pipeline > (JIRA > > >> ticket: > https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843 > > ). > > >> The PR was quite large and touched/created a lot of tests, so > > StressNewTest > > >> never ran on it: > > >> https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. > Given > > >> how frequently the test is failing in the pipeline (11 failures on > > various > > >> IntegrationTest jobs over the past 6 commits), it seems very likely > that > > >> had StressNewTest been allowed to run, this issue would have been > > caught at > > >> the PR stage and could have been remedied then, instead of resulting > in > > a > > >> nearly constantly red pipeline. > > >> > > >> I've already given my +1 to this proposal, but this situation has > > prompted > > >> me to make it a *+1*. > > >> > > >> On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade <aging...@pivotal.io > > > > >> wrote: > > >> > > >>> The stress test is to identify the flaky-ness within the test; and > > >>> assuming > > >>> any changes to the test may have introduced the flaky-ness. > > >>> It's about paying the cost upfront or later when the test is > > determined to > > >>> be flaky. > > >>> If 25+ tests have been changed in a PR, the cost of running stress > test > > >>> for > > >>> those; and gating the PR for so long. > > >>> Knowing how much pain it causes to fix a flaky test after a > > certain/long > > >>> duration of time; I am +1 for doing this change. > > >>> > > >>> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith <dsm...@pivotal.io> wrote: > > >>> > > >>>> What's the current timeout for StressNewTest? Maybe if we just up > the > > >>>> threshold to 100 tests or so and up the timeout to match we can > catch > > >>>> pretty much all PRs. > > >>>> > > >>>> I'm not sure why the job is flagging more tests than it should. It > > looks > > >>>> like at some point @rhoughon changed it to read the merge base from > > some > > >>>> file created by concourse as an optimization [1] - I suspect maybe > > that > > >>>> file is inaccurate? > > >>>> > > >>>> I originally wrote this job. It's definitely not a panacea, it will > > only > > >>>> catch a new flaky test if > > >>>> - the test is really flaky (likely to fail more than 1/50 times) > > >>>> - the change actually happened in the test file itself, and not the > > >>>> product or some other test file. > > >>>> > > >>>> [1] > > >>>> > > >>>> > > >>> > > > https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87 > > >>>> > > >>>> -Dan > > >>>> > > >>>> On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols <onich...@pivotal.io> > > >>> wrote: > > >>>> > > >>>>> We don’t tend to look too closely at successful PR checks to see > > >>> whether > > >>>>> they actually checked anything at all. > > >>>>> > > >>>>> One example I found is > > >>>>> > > >>>> > > >>> > > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957 > > >>>>> < > > >>>>> > > >>>> > > >>> > > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957 > > >>>>>> : > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. > > >>>>> > > >>>>> Here are 92 more examples (url’s omitted for brevity — use the > > example > > >>>>> above as a template and just replace the last 4 digits): > > >>>>> 26 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6243) > > >>>>> 26 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6249) > > >>>>> 26 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6402) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6262) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6430) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6439) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6449) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6454) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6458) > > >>>>> 27 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6459) > > >>>>> 28 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6224) > > >>>>> 28 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6441) > > >>>>> 28 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6448) > > >>>>> 28 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6452) > > >>>>> 29 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6102) > > >>>>> 29 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6177) > > >>>>> 30 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5939) > > >>>>> 30 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5940) > > >>>>> 30 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5949) > > >>>>> 30 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6473) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5953) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6187) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6470) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6471) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6474) > > >>>>> 31 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6475) > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5958) > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6173) > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6236) > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6237) > > >>>>> 32 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6242) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6246) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6248) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6250) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6251) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6254) > > >>>>> 33 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6255) > > >>>>> 34 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6139) > > >>>>> 34 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6141) > > >>>>> 34 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6264) > > >>>>> 34 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6266) > > >>>>> 34 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6271) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6142) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6143) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6277) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6309) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6413) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6414) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6419) > > >>>>> 35 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6420) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6144) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6146) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6147) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6310) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6326) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6329) > > >>>>> 36 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6330) > > >>>>> 40 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6159) > > >>>>> 40 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6160) > > >>>>> 40 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6188) > > >>>>> 41 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6440) > > >>>>> 41 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6480) > > >>>>> 41 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6483) > > >>>>> 43 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6190) > > >>>>> 44 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5768) > > >>>>> 44 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5772) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6191) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6218) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6219) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6225) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6244) > > >>>>> 46 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6245) > > >>>>> 48 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6129) > > >>>>> 49 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5804) > > >>>>> 51 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5877) > > >>>>> 53 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5742) > > >>>>> 54 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 5757) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6232) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6233) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6234) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6259) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6260) > > >>>>> 66 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6278) > > >>>>> 79 is too many changed tests to stress test. Allowing this job to > > pass > > >>>>> without stress testing. (build 6481) > > >>>>> 103 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6493) > > >>>>> 106 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6206) > > >>>>> 106 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6207) > > >>>>> 115 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 5769) > > >>>>> 118 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6226) > > >>>>> 721 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6197) > > >>>>> 722 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6217) > > >>>>> 722 is too many changed tests to stress test. Allowing this job to > > >>> pass > > >>>>> without stress testing. (build 6221) > > >>>>> > > >>>>> Please note, sometimes the number of files reported seems to be way > > >>>> higher > > >>>>> than it should be. For example > > >>>> https://github.com/apache/geode/pull/4691 > > >>>>> <https://github.com/apache/geode/pull/4691> shows exactly 1 test > > file > > >>>>> touched, but > > >>>>> > > >>>> > > >>> > > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278 > > >>>>> < > > >>>>> > > >>>> > > >>> > > > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278 > > >>>>> > > >>>>> thinks it touched 66 test files. > > >>>>> > > >>>>> There’s currently no good data to predict how long StressNew will > > >>> take, > > >>>> it > > >>>>> just depends on the mix of tests touched. I am aware of at least 4 > > >>> cases > > >>>>> (with less than 25 files) where the timeout was hit. In two of > those > > >>>> cases > > >>>>> we re-ran with a longer timeout, and in two cases the PR author > split > > >>> up > > >>>>> the PR into half a dozen smaller PRs to get around it. > > >>>>> > > >>>>> > > >>>>>> On Mar 1, 2020, at 7:52 PM, Anthony Baker <aba...@pivotal.io> > > >>> wrote: > > >>>>>> > > >>>>>> What percentage of PR’s are currently subject to the 25 test file > > >>> rule? > > >>>>> How many would be subject to the concourse timeout? > > >>>>>> > > >>>>>> I’d like to understand the scope of the impact for this change. > > >>>>>> > > >>>>>> Anthony > > >>>>>> > > >>>>>> > > >>>>>>> On Mar 1, 2020, at 8:58 AM, Owen Nichols <onich...@pivotal.io> > > >>> wrote: > > >>>>>>> > > >>>>>>> Impossible, no. Inconvenient, perhaps, but a small price to pay > for > > >>>>> being > > >>>>>>> able to trust that green means green. > > >>>>>>> > > >>>>>>> With or without this proposed change, if anyone is having trouble > > >>>>> getting > > >>>>>>> their PR to pass StressNew, please bring it up on the dev list > and > > >>> we > > >>>>> can > > >>>>>>> discuss the appropriate solution on a case-by-case basis (e.g. > > >>>>> increasing > > >>>>>>> timeout, fixing the logic that identifies changed test files, > > >>>> splitting > > >>>>>>> into multiple PRs, authorizing an override, etc). > > >>>>>>> > > >>>>>>> On Sun, Mar 1, 2020 at 8:56 AM Dan Smith <dsm...@pivotal.io> > > >>> wrote: > > >>>>>>> > > >>>>>>>> Won't this make it impossible to merge refactoring changes that > > >>> touch > > >>>>> a lot > > >>>>>>>> of tests? > > >>>>>>>> > > >>>>>>>> -Dan > > >>>>>>>> > > >>>>>>>> On Sat, Feb 29, 2020, 12:37 PM Robert Houghton < > > >>> rhough...@pivotal.io > > >>>>> > > >>>>>>>> wrote: > > >>>>>>>> > > >>>>>>>>> Yes, as it should > > >>>>>>>>> > > >>>>>>>>> On Sat, Feb 29, 2020, 12:25 Dan Smith <dsm...@pivotal.io> > wrote: > > >>>>>>>>> > > >>>>>>>>>> Doesn't the build fail when concourse times out? > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>>> > > >>> > > >> > > > > >