The perf team and the A-Team would like to test out a new policy: we want
to back out patches that cause significant Talos regressions on Windows
builds. We would like to get developers’ feedback before starting this
experiment.

Why are we doing this?

Essentially, we would like more Talos regressions to get fixed and Firefox
performance to improve. We want to test a 48-hour backout policy because we
noticed that patch authors often don't fix regressions quickly. If a
regression sits in the tree for days, it becomes difficult to back out, and
it becomes much more likely that the regression will end up riding the
trains to Release by default.

Note that we already have a policy of backing out regressions of 3% or more
if the patch author does not respond at all on the regression bug for 3
business days. See
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling

The new policy is more aggressive. We think a patch that regresses
performance significantly should be backed out quickly, and re-landed when
its performance is acceptable.

How will the backouts work?

The A-Team perf sheriffs will create a Talos regression bug as soon as a
regression is confirmed using Talos re-triggers. The patch author and
reviewer will be CC’ed, and if they don’t provide an explanation for why
the regression is acceptable, the patch will be backed out. The goal is to
back out unjustified regressions within 48 hours of landing. We’d like to
give the patch author about 24 hours to reply after the regression bug is
filed.
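
To make the timing concrete, here is a minimal sketch of the two deadlines described above. It assumes plain wall-clock hours with no adjustment for weekends or holidays, which this proposal does not address. The function and constant names are hypothetical and are not part of any existing sheriffing tool.

```python
from datetime import datetime, timedelta

# Assumption: both windows are counted in plain wall-clock hours.
REPLY_WINDOW = timedelta(hours=24)   # "about 24 hours" after the bug is filed
BACKOUT_GOAL = timedelta(hours=48)   # goal: back out within 48 hours of landing


def backout_timeline(landed_at, bug_filed_at):
    """Return the approximate deadlines for a confirmed regression.

    landed_at:    when the regressing patch landed
    bug_filed_at: when the sheriffs filed the Talos regression bug
    """
    return {
        # The author should explain the regression by this time.
        "reply_by": bug_filed_at + REPLY_WINDOW,
        # Target time for backing out an unjustified regression.
        "backout_goal": landed_at + BACKOUT_GOAL,
    }


# Illustrative dates only: the patch lands, and the bug is filed a day later.
example = backout_timeline(
    landed_at=datetime(2015, 8, 10, 9, 0),
    bug_filed_at=datetime(2015, 8, 11, 9, 0),
)
```

In this example both deadlines fall at the same time, 48 hours after landing, which is the case the proposal seems to have in mind: roughly a day to confirm the regression and file the bug, then roughly a day for the author to respond.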

The A-Team has been working hard on improving the tools for understanding
Talos regressions (e.g. Perfherder), and we think debugging a Talos
regression is a much less painful process these days. For example, there is
now a highly usable view for comparing a proposed patch against a baseline
revision at
https://treeherder.mozilla.org/perf.html#/comparechooser

Will all regressions get backed out?

No. Only regressions of 10% or more, on reliable Talos tests, on Windows,
will face automatic backouts during this trial period. Given historical
trends, we are anticipating about one backout per week.


List of reliable Talos tests: ts_paint, sessionrestore, tp5o_scroll,
tscrollx, tresize, TART, tsvgx, tp5o
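
The criteria above can be sketched as a simple check. This is only an illustration of the rule as stated in this message, not the sheriffs' actual tooling. The function name is hypothetical, and the platform match (any platform string starting with "win") is an assumption about how Windows builds are named.

```python
# Reliable Talos tests, as listed in this message.
RELIABLE_TESTS = {
    "ts_paint", "sessionrestore", "tp5o_scroll", "tscrollx",
    "tresize", "TART", "tsvgx", "tp5o",
}

# Regressions of 10% or more qualify during the trial period.
BACKOUT_THRESHOLD_PERCENT = 10.0


def is_backout_candidate(test_name, platform, regression_percent):
    """Return True if a confirmed regression meets the trial criteria.

    Assumption: Windows platforms are identified by a name that starts
    with "win" (case-insensitive); real platform names may differ.
    """
    on_windows = platform.lower().startswith("win")
    return (
        test_name in RELIABLE_TESTS
        and on_windows
        and regression_percent >= BACKOUT_THRESHOLD_PERCENT
    )
```

For example, a 12.5% tresize regression on a platform named "windows7-32" would qualify under this sketch, while the same regression on Linux, or a 9.9% regression on Windows, would not. Even a qualifying regression is only backed out if the author does not justify it on the bug.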

How are you testing this new policy?

First off, we want developers’ feedback before trialing this new policy.

I would like us to collect feedback and then start enforcing the new policy
sometime next week. You can talk to us on the newsgroups or on #perf. You
can also reach me directly (my IRC nick is “vladan”).

We’ll post our conclusions on the newsgroups after a couple of months of
enforcing the policy, and then we can all re-evaluate the backout policy
together.

Who will be the perf sheriffs?

Joel Maher, Vaibhav Agrawal
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
