Re: Optimizing what runs on which push

Benjamin Smedberg Wed, 31 May 2017 14:01:15 -0700

On Wed, May 31, 2017 at 11:08 AM, Dustin Mitchell <dmitch...@mozilla.com>
wrote:


>
>
> So the proposal addresses three issues:
>  - frustrating and ineffective try user interface
>  - high load and consequent long wait times
>  - elevated backout rate
>


Let me try and repeat back the assumptions/assertions I've heard, to see if
I understand.

#1 try user interface is a big problem
#2 some people are running too many try jobs
#3 too many try jobs causes long wait times
#4 too many try jobs costs too much money
#5 some people aren't running enough try jobs
#6 not enough try jobs is causing an elevated backout rate
#7 a computer can do a better job today than people can at picking jobs
#8 which will simultaneously solve the problem of over-running and
under-running jobs
#9 which will improve the patch sticking/backout rate



I'm perhaps missing the key part of your proposal, which is the actual
logic the system uses to automagically determine which jobs to run. I see
a lot of description about AFFECTS in the build tree, but it's not clear to
me who is adding these annotations and how they are maintained.

Manual annotations aren't cheap: they aren't cheap to add, and they
certainly aren't cheap to maintain. I don't know the exact dollar figures
we're talking about here, but it doesn't take much developer time to swamp
the cost of running more jobs.

There are also perverse incentives in annotations: the cost of excluding too
little is that we keep running more tests than we need to, but the cost of
excluding too much is that we don't run enough tests before final commit.
The natural reaction is going to be to lean toward excluding very
little.

Is there a feedback loop where we know via code coverage or some other
magic which files are run as part of which tests? That would avoid the
massive pitfalls of manual annotations (but create its own set of
headaches).
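
To make that question concrete, here is a rough sketch of what a
coverage-driven selection could look like. It is not part of the proposal
as I understand it: the coverage map, the test names, and select_tests are
all hypothetical, and it assumes we already had per-test data about which
source files each test touches.

```python
from typing import Dict, Iterable, Set


def select_tests(coverage: Dict[str, Set[str]], changed: Iterable[str]) -> Set[str]:
    """Pick the tests to run for a push, given hypothetical coverage data.

    coverage maps a test name to the set of source files it exercised.
    changed is the list of files modified by the push.
    """
    changed_files = set(changed)
    # Every file that appears in at least one test's coverage data.
    known_files = set().union(*coverage.values())
    # Be conservative: if the push touches a file we have no data for,
    # we cannot prove anything is safe to skip, so run every test.
    if not changed_files <= known_files:
        return set(coverage)
    # Otherwise run only the tests that exercised a changed file.
    return {test for test, files in coverage.items() if files & changed_files}


# Hypothetical data, for illustration only.
coverage = {
    "test_plugins": {"dom/plugins/a.cpp", "dom/base/shared.cpp"},
    "test_layout": {"layout/b.cpp", "dom/base/shared.cpp"},
}
print(select_tests(coverage, ["dom/plugins/a.cpp"]))
print(select_tests(coverage, ["build/new_file.py"]))
```

Even in this toy form the headaches show up: a new or never-covered file
forces a full run, and the coverage data itself has to be regenerated
often enough to stay accurate.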



>
> > Related to this, you need to remember one of the primary functions of
> try is
> > to validate changes before landing on inbound/autoland, and so reduce the
> > backout rate from sheriffs. Running subsets of tests will increase the
> > backout rate. I think that's probably ok, but we need to be aware of this
> > social/workflow impact as it's not just a technical decision.
>
> You'll note that a consequence of the proposal would be to *decrease*
> the backout rate.  To achieve this, we must be careful about two
> things:
>  1. Be conservative in what we decide to skip (better to run a
> trivially green job than skip an orange)
>  2. Provide tools that figure out what needs to run in try, to avoid
> the double-whammy described above
>
> #1 is why I built optimization around "skipping" instead of "running"
> -- the burden of proof should be on the task configuration to say when
> a job can be safely skipped, rather than trying to enumerate all the
> conditions in which the job should be run.  #2 is behind the "there is
> no try, just do"[*] behavior in the proposal.  But "will this push
> break things" is not the only use-case for try, so we need "try this"
> functionality (requesting specific jobs) for most of the other
> use-cases.
>

I don't think I believe your assertion. Given the way our code is
structured, I think there is no way for a machine in the typical case to run
a smaller subset of jobs than a human, and so the machine is going to have to
run more jobs than a human in most cases in order to reduce the backout
rate. That will ultimately increase the $$/machine time. Or the
machine will have to guess with a smaller subset, and have a higher backout
rate later.

As an example: typically I develop either on a Windows machine or a Linux
machine. Sometimes both. So often what I do is run the relevant tests
locally (dom/plugins and a few others). Then I run a try push on the other
platforms only (Mac, maybe Android, depending). I see many of our core
developers adopting similar strategies.

I hope you can prove me wrong, but I'm deeply skeptical that the tradeoffs
involved here are going to get us *both* reduced machine consumption and
more thorough testing, without a significantly larger scope.

--BDS

_______________________________________________
dev-builds mailing list
dev-builds@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-builds
