On 12/03/15 07:17 PM, Gijs Kruitbosch wrote:
> IME the issue is not so much about not running tests identical to the
> ones on CI, but the OS environment which doesn't match, and then
> reproducing intermittent failures.
>
> If a failure happens once in 100 builds, it is very annoying for the
> sheriffs (happens multiple times a day) and needs fixing or backing out
> - but running-by-dir, say, mochitest-browser for
> browser/base/content/test/general/ 100 times takes way too long, and OS
> settings / screen sizes / machine speed / (...) differences mean you
> might not be able to reproduce anyway (or in the worst case, that you
> get completely different failures).
>
> It'd be better if we could more easily get more information about
> failures as they happened on infra (replay debugging stuff a la what roc
> has worked on, or better logs, or somehow making it possible to
> remote-debug the infra machines as/when they fail).

This is an excellent point. Being able to accurately reproduce the
configuration used in production is obviously a good thing, but it only
helps in limited circumstances. Knowing the command line for an OS X job
isn't going to be any use to you if you don't have an OS X machine to try
it on.

Being able to remote debug C++/JS/Python seems like it would be the holy
grail (short of rr everywhere) for intermittent or non-reproducible test
failures. Currently it would be difficult, if not impossible, to implement
because the slaves are heavily locked down. I have heard that this might be
easier with TaskCluster. It would definitely be worth investigating.
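
To put a rough number on the 1-in-100 failure rate mentioned above, here
is a back-of-the-envelope sketch. It assumes every run fails independently
with the same probability, which is an assumption of mine and probably
optimistic given the environment differences discussed in this thread. The
function names are made up for illustration.

```python
import math


def chance_of_repro(failure_rate, runs):
    """Probability of seeing at least one failure in `runs` independent runs."""
    return 1 - (1 - failure_rate) ** runs


def runs_needed(failure_rate, confidence):
    """Smallest number of independent runs that hits the failure at least
    once with probability >= `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


if __name__ == "__main__":
    # Assumed rate: one failure per 100 runs, as in the example above.
    print(round(chance_of_repro(0.01, 100), 3))
    print(runs_needed(0.01, 0.95))
```

Under those assumptions, 100 local runs would only reproduce the failure
about 63% of the time, and it would take roughly 300 runs to get to 95%,
even on a machine that matches the infra exactly.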
-Andrew
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform