On 2014-04-07, 11:49 AM, Aryeh Gregor wrote:
> On Mon, Apr 7, 2014 at 6:12 PM, Ted Mielczarek <t...@mielczarek.org> wrote:
>> If a bug is causing a test to fail intermittently, then that test loses
>> value. It still has some value in that it can catch regressions that
>> cause it to fail permanently, but we would not be able to catch a
>> regression that causes it to fail intermittently.
>
> To some degree, yes, marking a test as expected intermittent causes it
> to lose value. If the developers who work on the relevant component
> think the lost value is important enough to track down the cause of
> the intermittent failure, they can do so. That should be their
> decision, not something forced on them by infrastructure issues
> ("everyone else will suffer if you don't find the cause for this
> failure in your test"). Making known intermittent failures not turn
> the tree orange doesn't stop anyone from fixing intermittent failures,
> it just removes pressure from them if they decide they don't want to.
> If most developers think they have more important bugs to fix, then I
> don't see a problem with that.

What you're saying above is true *if* someone investigates the
intermittent test failure and determines that the bug is not important.
But in my experience, that's not what happens at all. I think many
people treat intermittent test failures as a category of unimportant
problems, and therefore some bugs are never investigated. The fact of
the matter is that most of these bugs are bugs in our tests, which of
course will not impact our users directly, but I have occasionally come
across bugs in our code which are exposed as intermittent failures.

The real issue is that identifying the root of the problem is often the
majority of the work needed to fix the intermittent test failure, so
unless someone is willing to investigate the bug, we cannot say whether
or not it impacts our users.

What really makes me care about these intermittent failures is that
ultimately they force us to choose between disabling a whole bunch of
tests and being unable to manage our tree. As more and more tests get
disabled, we lose more and more test coverage, and that can have a much
more severe impact on the health of our products than any individual
intermittent test failure.

Cheers,
Ehsan
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform