On Wed, Nov 6, 2013 at 5:49 PM, Ryan VanderMeulen <rya...@gmail.com> wrote:
> What do we gain by having results that can't be trusted?
The same thing we gain from allowing any try push that doesn't run every single test. It's a tradeoff between reliability and time, not black-and-white. For instance, if I change a few lines in a .cpp file in editor/, I know there are a limited number of tests the change could plausibly affect, and there's no reason to run mochitests in gfx/ just so I can run mochitests in editor/ and dom/imptests/editing/. Likewise, if I get try failures that I can't reproduce locally and push a revised patch to test the fix, it would be nice to run just the affected tests. More than once I've pushed a whole series of revisions that each had to run a whole test suite instead of just a few tests, and therefore took 20 minutes or so longer than necessary per iteration.

On Wed, Nov 6, 2013 at 6:46 PM, Ryan VanderMeulen <rya...@gmail.com> wrote:
> I'm just afraid we're going to end up in the same situation we're already in
> with intermittent failures where the developer looks at it and says "that
> couldn't possibly be from me" and ignores it. We already see "Try results
> look good" backouts on a depressingly-regular basis.

The entire situation with how intermittent failures are handled strikes me as mostly a technical problem. Known intermittent failures should be flagged and automatically suppressed, not require a manual judgment call every single time. To make sure a known-intermittent failure hasn't quietly turned into a permanent one, the test could be automatically rerun a couple of times (just the file, not the whole suite) whenever it fails, to confirm it still passes at least once, and reported as a real failure if it fails five times in a row or something (rough sketch at the end of this message). Trying to persuade people to be careful about something that isn't a problem 90% of the time is a losing battle -- the signal-to-noise ratio needs to be a lot higher before people will pay attention.

On Wed, Nov 6, 2013 at 9:56 PM, Steve Fink <sf...@mozilla.com> wrote:
> As for "faster", I'm skeptical of the reliability of incremental try
> builds. We have too many clobbers. And the same slaves do a bunch of
> different build types, so it may have been quite a while since the last
> build of the type you're doing. (Sure, we could tweak the scheduling to
> add some affinity in, but that's more complexity and a richer set of
> failure patterns.)

Right, good point. I didn't think it through.

> I'm still a bit curious as to whether allowing slaves
> to collaborate via a fast distcc network going to remote ccaches would
> work, but it also feels complex and potentially counterproductive (I'm
> not at all sure that it's faster to look up and transfer a remotely
> cached build result than it would be to recompile it locally.)

Why would it not be faster, in terms of throughput? The network transfer should use roughly no CPU time, and builds are mostly CPU-bound, right? Even in terms of latency, I wouldn't expect hashing the input file, making a local network request, and transferring the response to take more than a millisecond if the cached file is in memory, and not much longer if it's on an SSD, so I don't see how actually compiling the file could compete except for very small files.
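To put rough numbers on that -- every figure below is my own assumption, not a measurement of our infrastructure -- even with estimates quite a bit more pessimistic than "a millisecond", the cache hit still wins by a couple of orders of magnitude:

    # Back-of-envelope comparison; all of these figures are assumptions.
    preprocessed_size = 2e6        # ~2 MB of preprocessed C++ source to hash
    object_size = 0.5e6            # ~500 KB object file coming back
    hash_throughput = 500e6        # ~500 MB/s hashing on one core
    network_throughput = 100e6     # ~100 MB/s usable on a gigabit LAN
    round_trip = 0.5e-3            # ~0.5 ms LAN round trip

    cache_hit_s = (preprocessed_size / hash_throughput    # hash the input
                   + round_trip                           # ask the remote cache
                   + object_size / network_throughput)    # pull the object back
    compile_s = 1.0                # ~1 s to compile a typical C++ file locally

    print("cache hit: ~%.0f ms" % (cache_hit_s * 1000))   # roughly 10 ms
    print("recompile: ~%.0f ms" % (compile_s * 1000))     # roughly 1000 ms

So unless a file compiles in a few milliseconds anyway, fetching the cached result should win easily, and it barely touches the CPU on either end.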
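And going back to the intermittent-failure handling above, here's the kind of rerun policy I have in mind, as a rough sketch. The run_test_file and is_known_intermittent callables are placeholders for whatever hooks the harness actually provides, not real APIs:

    MAX_CONSECUTIVE_FAILURES = 5

    def classify_failure(test_file, run_test_file, is_known_intermittent):
        """Called right after test_file has failed once.  The two callables
        are hypothetical harness hooks: run_test_file(path) reruns a single
        test file and returns True if it passes; is_known_intermittent(path)
        checks the list of flagged intermittents."""
        if not is_known_intermittent(test_file):
            return "report"                      # unknown failures are always real
        failures = 1                             # the failure that got us here
        while failures < MAX_CONSECUTIVE_FAILURES:
            if run_test_file(test_file):         # rerun just the file, not the suite
                return "suppress"                # passed at least once
            failures += 1
        return "report"                          # failed five times in a row

That keeps the judgment call out of developers' hands for the common case, while still catching a test that has genuinely gone permanently orange.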