[ https://issues.apache.org/jira/browse/SOLR-12037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256995#comment-17256995 ]
Erick Erickson commented on SOLR-12037:
---------------------------------------

I won't be working on JIRAs for the foreseeable future

> Reduce noise from flakey tests
> ------------------------------
>
>                 Key: SOLR-12037
>                 URL: https://issues.apache.org/jira/browse/SOLR-12037
>             Project: Solr
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 7.2
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>
> Recreating SOLR-12016. Please do NOT delete this without discussion. NOTE:
> Uwe's build system modifications originally on 12016 have been incorporated
> into SOLR-12028.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that
> > they are difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent
> > changes actually caused problems. Running the tests again until they
> > succeed is used commonly at present, which is not robust.
> > e-mail notifications of failing tests are largely being ignored.
> Proposal:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled.
> > Frequency TBD, depends partly on whether we can label emails from these
> > runs for easy filtering of the two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity
> > check. Leave these as defaults.
> > We start getting much more aggressive about not allowing new flakey tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as
> someone is actively working on fixing them.
> Concerns with solution:
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are
> > annotated with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line
> > identifying whether they were run with BadApple/AwaitsFix enabled or
> > disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not, should we
> > deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests
> > labeled BadApple or AwaitsFix?
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106)
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems
> > (ASF, Policeman, and Steve's), downloading the test results & logs, and
> > running some reports/stats on failures.
> >> http://fucit.org/solr-jenkins-reports/
> >> https://github.com/hossman/jenkins-reports/
> >> http://fucit.org/solr-jenkins-reports/failure-report.html
> I've assigned this JIRA to myself, but all volunteers welcome, especially
> anything that changes the build system.....
> I've decided to make this a SOLR jira on the theory that most of the
> offending tests are in the Solr hive, any sub-tasks for touching the build
> system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as
> infrequent failures occur. Once we have stability (defined by there being
> little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to
> these issues:
> HADOOP-9893
> LUCENE-3869
> LUCENE-5575
> LUCENE-5595
> LUCENE-5737
> LUCENE-6709
> LUCENE-7161
> SOLR-2715
> SOLR-6213
> SOLR-6443
> SOLR-6944
> SOLR-10071
> SOLR-10136
> SOLR-10734
> SOLR-11974
> Solr JIRAs about bad tests:
> SOLR-2175
> SOLR-4147
> SOLR-5880
> SOLR-6423
> SOLR-6944
> SOLR-6961
> SOLR-6974
> SOLR-8122
> SOLR-8182
> SOLR-9869
> SOLR-10053
> SOLR-10070
> SOLR-10071
> SOLR-10139
> SOLR-10287
> SOLR-11911

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org