The perf team and the A-Team would like to test out a new policy: we want to back out patches that cause significant Talos regressions on Windows builds. We would like to get developers’ feedback before starting this experiment.

Why are we doing this?
Essentially, we would like more Talos regressions to get fixed and Firefox performance to improve. We want to test a 48-hour backout policy because we noticed that patch authors often don't fix regressions quickly. If a regression sits in the tree for days, it becomes difficult to back out, and it becomes much more likely that the regression will end up riding the trains to Release by default.

Note that we already have a policy of backing out regressions of 3% or more if the patch author does not respond at all on the regression bug for 3 business days. See
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling

The new policy is more aggressive. We think a patch that regresses performance significantly should be backed out quickly, and re-landed when its performance is acceptable.

How will the backouts work?
The A-Team perf sheriffs will create a Talos regression bug as soon as a regression is confirmed using Talos re-triggers. The patch author and reviewer will be CC’ed, and if they don’t provide an explanation for why the regression is acceptable, the patch will be backed out. The goal is to back out unjustified regressions within 48 hours of landing. We’d like to give the patch author about 24 hours to reply after the regression bug is filed.

The A-Team has been working hard on improving the tools for understanding Talos regressions (e.g. Perfherder), and we think debugging a Talos regression is a much less painful process these days. For example, there is now a highly usable view for comparing a proposed patch against a baseline revision at
https://treeherder.mozilla.org/perf.html#/comparechooser

Will all regressions get backed out?
No. Only regressions of 10% or more, on reliable Talos tests, on Windows, will face automatic backouts during this trial period. Given historical trends, we are anticipating about one backout per week.
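
To make the criteria concrete, here is a minimal Python sketch of the rule as described above, using the list of reliable tests that follows. It is an illustration only, not the sheriffs' actual tooling: the function names, the `justified` flag, the "platform name starts with win" check, and the convention that a positive percentage means a slowdown are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Test names copied from the list of reliable Talos tests in this post.
RELIABLE_TESTS = {
    "ts_paint", "sessionrestore", "tp5o_scroll", "tscrollx",
    "tresize", "TART", "tsvgx", "tp5o",
}
BACKOUT_THRESHOLD_PCT = 10.0        # "regressions of 10% or more"
REPLY_WINDOW = timedelta(hours=24)  # "about 24 hours to reply"


def is_backout_candidate(test, platform, regression_pct, justified=False):
    """Return True if a confirmed regression meets the trial criteria.

    Hypothetical helper: `regression_pct` is assumed to be positive for a
    slowdown, and Windows platforms are assumed to have names that start
    with "win".
    """
    return (
        not justified  # author explained why the regression is acceptable
        and platform.lower().startswith("win")
        and test in RELIABLE_TESTS
        and regression_pct >= BACKOUT_THRESHOLD_PCT
    )


def reply_deadline(bug_filed_at):
    """Approximate time by which the patch author is expected to reply."""
    return bug_filed_at + REPLY_WINDOW
```

For example, under these assumptions a 12.5% tresize regression on a Windows platform would be a candidate, while the same regression on Linux, or one the author has justified on the bug, would not.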
List of reliable Talos tests: ts_paint, sessionrestore, tp5o_scroll, tscrollx, tresize, TART, tsvgx, tp5o

How are you testing this new policy?
First off, we want developers’ feedback before trialing this new policy. I would like us to collect feedback and then start enforcing the new policy sometime next week. You can talk to us on the newsgroups or on #perf. You can also reach me directly (my IRC nick is “vladan”).

We’ll post our conclusions on the newsgroups after a couple of months of enforcing the policy, and then we can all re-evaluate the backout policy together.

Who will be the perf sheriffs?
Joel Maher, Vaibhav Agrawal

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform