Just ran across this thread. I'm not quite sure it's what you're thinking of,
but this may be of interest:
https://github.com/Ealdwulf/bbchop
It's a tool for bisection of intermittent bugs, based on Bayesian search
theory. That is, it is supposed to find the intermittent bug, as opposed to
fin
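The BBChop idea, Bayesian search over which commit introduced an intermittent failure, can be sketched as a toy. This is not BBChop's actual algorithm: the `bayes_bisect` name, the no-false-positive assumption, and the fixed per-run failure probability `p_fail` are all simplifications for illustration.

```python
def bayes_bisect(n, test, p_fail=0.3, iters=200, threshold=0.95):
    """Toy Bayesian bisection over commits 0..n-1 for an intermittent bug.

    Assumes the bug, once introduced at the culprit commit, makes test()
    fail with probability p_fail there and at every later commit, and
    never fails before it (no false positives).
    """
    belief = [1.0 / n] * n  # uniform prior over the culprit commit
    for _ in range(iters):
        # Probe the commit that splits the remaining probability mass.
        acc, pick = 0.0, 0
        for i, b in enumerate(belief):
            acc += b
            if acc >= 0.5:
                pick = i
                break
        failed = test(pick)
        # Bayes update: the bug is present at `pick` iff culprit <= pick.
        for c in range(n):
            if c <= pick:
                belief[c] *= p_fail if failed else 1.0 - p_fail
            elif failed:
                belief[c] = 0.0  # a failure proves the culprit is <= pick
        total = sum(belief)
        belief = [b / total for b in belief]
        best = max(range(n), key=lambda c: belief[c])
        if belief[best] >= threshold:
            return best
    return max(range(n), key=lambda c: belief[c])
```

With `p_fail=1.0` and a deterministic test this degenerates into ordinary bisection; the interesting case is that passes only weaken, rather than rule out, earlier commits.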
I have made small edits thanks to Kyle and Karl; the official policy is posted
on the Sheriffing wiki page:
https://wiki.mozilla.org/Sheriffing/Test_Disabling_Policy
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/l
Thank you for putting this together. It is important.
jmaher writes:
> This policy will define an escalation path for when a single test case is
> identified to be leaking or failing and is causing enough disruption on the
> trees.
> Exceptions:
> 1) If this test has landed (or been modified) i
On Tuesday, April 15, 2014 9:42:25 AM UTC-4, Kyle Huey wrote:
> On Tue, Apr 15, 2014 at 6:21 AM, jmaher wrote:
>
> > This policy will define an escalation path for when a single test case is
> > identified to be leaking or failing and is causing enough disruption on the
> > trees. Disruption is
On Tue, Apr 15, 2014 at 6:21 AM, jmaher wrote:
> This policy will define an escalation path for when a single test case is
> identified to be leaking or failing and is causing enough disruption on the
> trees. Disruption is defined as:
> 1) Test case is on the list of top 20 intermittent failure
I want to express my thanks to everyone who contributed to this thread. We
have a lot of passionate and smart people who care about this topic; thanks
again for weighing in so far.
Below is a slightly updated policy from the original, and following that is an
attempt to summarize the thread an
On 2014-04-09, 6:46 PM, Chris Peterson wrote:
On 4/9/14, 11:48 AM, Gregory Szorc wrote:
I feel a lot of people just shrug shoulders and allow the test to be
disabled (I'm guilty of it as much as anyone). From my perspective, it's
difficult to convince the powers that be that fixing intermittent fa
On 4/9/14, 11:48 AM, Gregory Szorc wrote:
I feel a lot of people just shrug shoulders and allow the test to be
disabled (I'm guilty of it as much as anyone). From my perspective, it's
difficult to convince the powers that be that fixing intermittent failures
(that have been successfully swept under
On 4/9/14, 2:07 PM, Karl Tomlinson wrote:
Gregory Szorc writes:
2) Run marked intermittent tests multiple times. If it works all
25 times, fail the test run for inconsistent metadata.
We need to consider intermittently failing tests as failed, and we
need to only test things that always pass.
Gregory Szorc writes:
> 2) Run marked intermittent tests multiple times. If it works all
> 25 times, fail the test run for inconsistent metadata.
We need to consider intermittently failing tests as failed, and we
need to only test things that always pass.
We can't rely on statistics to tell us a
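Gregory's proposal above (rerun tests marked intermittent, and treat 25 straight passes as stale metadata) could look roughly like this in a harness. The hook name and return values below are made up for the sketch:

```python
def check_intermittent(test, runs=25):
    """Hypothetical harness hook: rerun a test annotated as intermittent.

    A test marked intermittent should actually fail sometimes; if it
    passes all `runs` attempts, the annotation looks stale and the run
    is flagged so the metadata gets fixed.
    """
    # test() returns True on pass, False on failure.
    failures = sum(0 if test() else 1 for _ in range(runs))
    if failures == 0:
        return "inconsistent-metadata"  # 25/25 green: annotation is stale
    return "known-intermittent"         # failed at least once, as annotated
```

Karl's caveat still applies: a clean streak only shows the failure rate is low, not zero, so this catches stale annotations but cannot prove a test is healthy.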
On 4/9/14, 11:29 AM, L. David Baron wrote:
On Wednesday 2014-04-09 11:00 -0700, Gregory Szorc wrote:
The simple solution is to have a separate in-tree manifest
annotation for intermittents. Put another way, we can describe
exactly why we are not running a test. This is kinda/sorta the realm
of b
On Wednesday 2014-04-09 11:00 -0700, Gregory Szorc wrote:
> The simple solution is to have a separate in-tree manifest
> annotation for intermittents. Put another way, we can describe
> exactly why we are not running a test. This is kinda/sorta the realm
> of bug 922581.
>
> The harder solution is
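Gregory's "separate in-tree manifest annotation" could look something like the sketch below, in the style of the existing mochitest ini manifests. The `intermittent` key, its value syntax, and the test filename are hypothetical, not an existing manifestparser feature (the real keys are things like `skip-if` and `disabled`):

```ini
# Hypothetical annotation (key and test name are illustrative): keep
# running the test but track it as a known intermittent, recording the
# bug and observed failure rate instead of silently disabling it.
[test_widget_resize.html]
intermittent = bug 922581, ~10% on linux-debug
```
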
On 4/8/14, 6:51 AM, James Graham wrote:
On 08/04/14 14:43, Andrew Halberstadt wrote:
On 07/04/14 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek
wrote:
If a bug is causing a test to fail intermittently, then that test loses
value. It still has some value in that i
On 2014-04-08, 6:10 PM, Karl Tomlinson wrote:
I wonder whether the real problem here is that we have too many
bad tests that report false negatives, and these bad tests are
reducing the value of our testsuite in general. Tests also need
to be well documented so that people can understand what a
Aryeh Gregor writes:
> On Tue, Apr 8, 2014 at 2:41 AM, Ehsan Akhgari wrote:
>> What you're saying above is true *if* someone investigates the
>> intermittent test failure and determines that the bug is not
>> important. But in my experience, that's not what happens at
>> all. I think many peopl
On 2014-04-08, 3:15 PM, Chris Peterson wrote:
On 4/8/14, 11:41 AM, Gavin Sharp wrote:
Separately from all of that, we could definitely invest in better
tools for dealing with intermittent failures in general. Anecdotally,
I know chromium has some nice ways of dealing with them, for example.
But
On 4/8/14, 11:41 AM, Gavin Sharp wrote:
Separately from all of that, we could definitely invest in better
tools for dealing with intermittent failures in general. Anecdotally,
I know chromium has some nice ways of dealing with them, for example.
But I see that as a separate discussion not really rel
On Tuesday 2014-04-08 11:41 -0700, Gavin Sharp wrote:
> I see only two real goals for the proposed policy:
> - ensure that module owners/peers have the opportunity to object to
> any "disable test" decisions before they take effect
> - set an expectation that intermittent orange failures are dealt
I see only two real goals for the proposed policy:
- ensure that module owners/peers have the opportunity to object to
any "disable test" decisions before they take effect
- set an expectation that intermittent orange failures are dealt with
promptly ("dealt with" first involves investigation, usua
On Tuesday 2014-04-08 14:51 +0100, James Graham wrote:
> So, what's the minimum level of infrastructure that you think would
> be needed to go ahead with this plan? To me it seems like the
> current system already isn't working very well, so the bar for
> moving forward with a plan that would incre
On 08/04/14 15:06, Ehsan Akhgari wrote:
On 2014-04-08, 9:51 AM, James Graham wrote:
On 08/04/14 14:43, Andrew Halberstadt wrote:
On 07/04/14 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek
wrote:
If a bug is causing a test to fail intermittently, then that test
l
On 2014-04-08, 8:15 AM, Aryeh Gregor wrote:
On Tue, Apr 8, 2014 at 2:41 AM, Ehsan Akhgari wrote:
What you're saying above is true *if* someone investigates the intermittent
test failure and determines that the bug is not important. But in my
experience, that's not what happens at all. I think
On 2014-04-08, 9:51 AM, James Graham wrote:
On 08/04/14 14:43, Andrew Halberstadt wrote:
On 07/04/14 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek
wrote:
If a bug is causing a test to fail intermittently, then that test loses
value. It still has some value in th
On 08/04/14 14:43, Andrew Halberstadt wrote:
On 07/04/14 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek
wrote:
If a bug is causing a test to fail intermittently, then that test loses
value. It still has some value in that it can catch regressions that
cause it to
On 07/04/14 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek wrote:
If a bug is causing a test to fail intermittently, then that test loses
value. It still has some value in that it can catch regressions that
cause it to fail permanently, but we would not be able to
On Tue, Apr 8, 2014 at 2:41 AM, Ehsan Akhgari wrote:
> What you're saying above is true *if* someone investigates the intermittent
> test failure and determines that the bug is not important. But in my
> experience, that's not what happens at all. I think many people treat
> intermittent test fa
On 2014-04-07, 11:49 AM, Aryeh Gregor wrote:
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek wrote:
If a bug is causing a test to fail intermittently, then that test loses
value. It still has some value in that it can catch regressions that
cause it to fail permanently, but we would not be able
On 2014-04-07, 11:12 AM, Ted Mielczarek wrote:
It's difficult to say whether bugs we find via tests are more or less
important than bugs we find via users. It's entirely possible that
lots of the bugs that cause intermittent test failures cause
intermittent weird behavior for our users, we simp
On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek wrote:
> If a bug is causing a test to fail intermittently, then that test loses
> value. It still has some value in that it can catch regressions that
> cause it to fail permanently, but we would not be able to catch a
> regression that causes it to
On 4/7/2014 9:02 AM, Aryeh Gregor wrote:
> On Mon, Apr 7, 2014 at 3:20 PM, Andrew Halberstadt
> wrote:
>> I would guess the former is true in most cases. But at least there we have a
> *chance* at tracking down and fixing the failure, even if it takes a while
>> before it becomes annoying enough t
On Mon, Apr 7, 2014 at 3:20 PM, Andrew Halberstadt
wrote:
> I would guess the former is true in most cases. But at least there we have a
> *chance* at tracking down and fixing the failure, even if it takes a while
> before it becomes annoying enough to prioritize. If we made it so
> intermittents n
On 07/04/14 05:10 AM, James Graham wrote:
On 07/04/14 04:33, Andrew Halberstadt wrote:
On 06/04/14 08:59 AM, Aryeh Gregor wrote:
Is there any reason in principle that we couldn't have the test runner
automatically rerun tests with known intermittent failures a few
times, and let the test pass i
On Mon, Apr 7, 2014 at 6:33 AM, Andrew Halberstadt
wrote:
> Many of our test runners have that ability. But doing this implies that
> intermittents are always the fault of the test. We'd be missing whole
> classes of regressions (notably race conditions).
We already are, because we already will s
On 07/04/14 04:33, Andrew Halberstadt wrote:
On 06/04/14 08:59 AM, Aryeh Gregor wrote:
On Sat, Apr 5, 2014 at 12:00 AM, Ehsan Akhgari
wrote:
Note that this is only accurate to a certain point. There are other
things which
we can do to guesswork our way out of the situation for Autoland, but of
cou
On 04/04/14 03:44 PM, Ehsan Akhgari wrote:
On 2014-04-04, 3:12 PM, L. David Baron wrote:
Are you talking about newly-added tests, or tests that have been
passing for a long time and recently started failing?
In the latter case, the burden should fall on the regressing patch,
and the regressing
On 06/04/14 08:59 AM, Aryeh Gregor wrote:
On Sat, Apr 5, 2014 at 12:00 AM, Ehsan Akhgari wrote:
Note that this is only accurate to a certain point. There are other things which
we can do to guesswork our way out of the situation for Autoland, but of
course they're resource/time intensive (basically
On Fri, 4 Apr 2014 11:58:28 -0700 (PDT), jmaher wrote:
> Two exceptions:
> 2) When we are bringing a new platform online (Android 2.3, b2g, etc.) many
> tests will need to be disabled prior to getting the tests on tbpl.
It makes sense to disable some tests so that others can run.
I assume bugs
On Fri, 4 Apr 2014 12:49:45 -0700 (PDT), jmaher wrote:
>> overburdened in other ways (e.g., reviews). the burden
>> needs to be placed on the regressing change rather than the original
>> author of the test.
>
> I am open to ideas to help figure out the offending changes. My
> understanding is m
On 06 April 2014 14:58:24, Ehsan Akhgari wrote:
On 2014-04-06, 8:59 AM, Aryeh Gregor wrote:
Is there any reason in principle that we couldn't have the test runner
automatically rerun tests with known intermittent failures a few
times, and let the test pass if it passes a few times in a row after
On 2014-04-06, 8:59 AM, Aryeh Gregor wrote:
On Sat, Apr 5, 2014 at 12:00 AM, Ehsan Akhgari wrote:
Note that this is only accurate to a certain point. There are other things which
we can do to guesswork our way out of the situation for Autoland, but of
course they're resource/time intensive (basical
On Sat, Apr 5, 2014 at 12:00 AM, Ehsan Akhgari wrote:
> Note that this is only accurate to a certain point. There are other things which
> we can do to guesswork our way out of the situation for Autoland, but of
> course they're resource/time intensive (basically running orange tests over
> and over a
On Friday 2014-04-04 12:49 -0700, jmaher wrote:
> > If this plan is applied to existing tests, then it will lead to
> > style system mochitests being turned off due to other regressions
> > because I'm the person who wrote them and the module owner, and I
> > don't always have time to deal with reg
On 4/4/14, 2:21 PM, Martin Thomson wrote:
On 2014-04-04, at 14:02, Ehsan Akhgari wrote:
That's not true, we were in that state once, before I stopped working on this
issue. We can get there again if we wanted to. It's just a lot of hard work
which won't scale if we only have one person doi
On 2014-04-04, at 14:02, Ehsan Akhgari wrote:
> That's not true, we were in that state once, before I stopped working on this
> issue. We can get there again if we wanted to. It's just a lot of hard work
> which won't scale if we only have one person doing it.
It’s self-correcting too. Turn
On 2014-04-04, 4:58 PM, Jonathan Griffin wrote:
With respect to Autoland, I think we'll need to figure out how to make
it take intermittents into account. I don't think we'll ever be in a state
with 0 intermittents.
That's not true, we were in that state once, before I stopped working on
this is
On 2014-04-04, 4:30 PM, Chris Peterson wrote:
On 4/4/14, 1:19 PM, Gavin Sharp wrote:
The majority of the time identifying the regressing patch is
difficult
Identifying the regressing patch is only difficult because we have so
many intermittently failing tests.
Intermittent oranges are one of
With respect to Autoland, I think we'll need to figure out how to make
it take intermittents into account. I don't think we'll ever be in a state
with 0 intermittents.
Jonathan
On 4/4/2014 1:30 PM, Chris Peterson wrote:
On 4/4/14, 1:19 PM, Gavin Sharp wrote:
The majority of the time identifyin
On 4/4/14, 1:19 PM, Gavin Sharp wrote:
The majority of the time identifying the regressing patch is
difficult
Identifying the regressing patch is only difficult because we have so
many intermittently failing tests.
Intermittent oranges are one of the major blockers for Autoland. If TBPL
nev
On Fri, Apr 4, 2014 at 12:12 PM, L. David Baron wrote:
>> Escalation path:
>> 1) Ensure we have a bug on file, with the test author, reviewer, module
>> owner, and any other interested parties, links to logs, etc.
>> 2) We need to needinfo? and expect a response within 2 business days, this
>> s
>> 4) In the case we go another 2 days with no response from a module owner,
>> we will disable the test.
>
> Are you talking about newly-added tests, or tests that have been
> passing for a long time and recently started failing?
>
> In the latter case, the burden should fal
On 2014-04-04, 3:12 PM, L. David Baron wrote:
On Friday 2014-04-04 11:58 -0700, jmaher wrote:
As the sheriffs know it is frustrating to deal with hundreds of tests that
fail on a daily basis, but are intermittent.
When a single test case is identified to be leaking or failing at least 10% of
On Friday 2014-04-04 11:58 -0700, jmaher wrote:
> As the sheriffs know it is frustrating to deal with hundreds of tests that
> fail on a daily basis, but are intermittent.
>
> When a single test case is identified to be leaking or failing at least 10%
> of the time, it is time to escalate.
>
>
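As a back-of-envelope illustration of the statistics point raised upthread: at the policy's 10% failure threshold, a run of green retries is weaker evidence than it looks. The helper below is illustrative arithmetic, not part of any harness:

```python
def all_pass_probability(p, k):
    """Probability that k independent runs of a test all pass,
    given the test fails with probability p per run."""
    return (1 - p) ** k

# A test failing 10% of the time still goes 25-for-25 green
# about 7% of the time, so even 25 clean reruns are inconclusive.
print(round(all_pass_probability(0.10, 25), 3))  # → 0.072
```

This is why a retry-based policy can flag a stale annotation, but a bounded number of reruns can never certify that a test is deterministic.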