2017-05-31 9:26 GMT-04:00 Benjamin Smedberg <benja...@smedbergs.us>:
> Your ultimate goal is to save $$, and my ultimate goal is to reduce the time
> to results to make the developer cycle faster (let's measure round-trip
> times in minutes instead of hours). I don't believe that running subsets of
> the current jobs will solve either of those goals. Maybe it's a step along
> the path, but I don't see how that fits together yet.

We share the goal of getting the most appropriate results to
developers as quickly as possible with the resources we have
available. What we're discussing here is only a part of the work
required (other parts being hyperchunking, faster job startup, and
recompiling only when necessary, to name a few).

One of the longest-lived approaches to getting results efficiently has
been try syntax, and the admonition not to use `-p all`.  Figuring out
what syntax to use instead has been a pretty awful experience, and the
result suffers from both under-estimation (failing to run a job that
would be orange) and over-estimation (running unnecessary jobs).  It's
a double-whammy: time lost figuring out try syntax, then time lost
over a backout due to a missed job in try.

So the proposal addresses three issues:
 - frustrating and ineffective try user interface
 - high load and consequent long wait times
 - elevated backout rate

> Related to this, you need to remember one of the primary functions of try is
> to validate changes before landing on inbound/autoland, and so reduce the
> backout rate from sheriffs. Running subsets of tests will increase the
> backout rate. I think that's probably ok, but we need to be aware of this
> social/workflow impact as it's not just a technical decision.

You'll note that a consequence of the proposal would be to *decrease*
the backout rate.  To achieve this, we must be careful about two
things:
 1. Be conservative in what we decide to skip (better to run a
trivially green job than skip an orange)
 2. Provide tools that figure out what needs to run in try, to avoid
the double-whammy described above

#1 is why I built optimization around "skipping" instead of "running"
-- the burden of proof should be on the task configuration to say when
a job can be safely skipped, rather than trying to enumerate all the
conditions in which the job should be run.  #2 is behind the "there is
no try, just do"[*] behavior in the proposal.  But "will this push
break things" is not the only use-case for try, so we need "try this"
functionality (requesting specific jobs) for most of the other
use-cases.
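
To make the "skipping" vs. "running" distinction concrete, here is a
rough sketch of the idea in Python.  It is not the actual task-graph
code: the task names, the `skip_unless_changed` key, and the file
patterns are all made up for illustration.  The point is only that a
task with no skip rule always runs, and a task is dropped only when its
configuration positively says it is safe to do so.

```python
from fnmatch import fnmatch

# Hypothetical task configuration; names, keys, and patterns are
# illustrative assumptions, not real task definitions.
TASKS = {
    "build-linux64": {},  # no skip rule, so it always runs
    "docs-lint": {"skip_unless_changed": ["docs/*", "*.rst"]},
    "android-test": {"skip_unless_changed": ["mobile/android/*"]},
}


def can_skip(task, changed_files):
    """Return True only if the task's own config proves it is safe to skip."""
    patterns = task.get("skip_unless_changed")
    if patterns is None:
        # No skip rule declared: the burden of proof is not met, so run it.
        return False
    if changed_files is None:
        # Changed files unknown: be conservative and run the task.
        return False
    # Skippable only when no changed file matches any declared pattern.
    return not any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


def tasks_to_run(tasks, changed_files):
    """Names of the tasks that survive skip-based optimization."""
    return sorted(
        name for name, task in tasks.items()
        if not can_skip(task, changed_files)
    )
```

With this shape, a misconfigured or unconfigured task errs toward
running (a trivially green job) rather than toward being skipped (a
missed orange), which is the conservative direction described in #1.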

Dustin

[*] Maybe "just try it" is a better name..
_______________________________________________
dev-builds mailing list
dev-builds@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-builds