>> Also, I hope this manual override is not a pain to use. Pretty please? :)
>
> The hook attached to the bug requires that you include a short string token in your commit message. The token is generated as a function of time, your ldap name, and a local secret. Without specifying the token, the hook will reject your 2nd push, remind you that you can cancel your previous jobs, and give you the token as well as the time at which the token expires.
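(For concreteness, I imagine the token could be something as simple as an HMAC over the pusher's ldap name and a coarse time window, computed with the server-side secret. The sketch below is hypothetical -- the hook's actual scheme isn't shown here -- but it illustrates the idea:)

    import hashlib
    import hmac
    import time

    TOKEN_LIFETIME = 15 * 60  # seconds a token stays valid (assumed value)

    def make_token(ldap_name, secret, now=None):
        # secret is a bytes object held on the hg server.
        # Bucket time into coarse windows so the token stays stable for a
        # while and expires at a predictable moment.
        now = time.time() if now is None else now
        window = int(now // TOKEN_LIFETIME)
        msg = ("%s:%d" % (ldap_name, window)).encode()
        digest = hmac.new(secret, msg, hashlib.sha256).hexdigest()[:8]
        expires_at = (window + 1) * TOKEN_LIFETIME
        return digest, expires_at

    def commit_message_has_valid_token(message, ldap_name, secret):
        token, _ = make_token(ldap_name, secret)
        return token in message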
That sounds reasonable to me. If it's really a pain in practice, one can always script around it, which I think is a fair trade-off. (If I were to work around it in my git-push-to-try script, I'd require the user to confirm that they wanted the second push; I wouldn't make it automatic.) Thanks.

>>> Surface [the leaderboard of try abusers] on tbpl, clearly visible on the inbound pushes. Public shaming ftw.
>
> The intent here is definitely not public shaming. More like public awareness. I'm in no position to judge if all those pushes are using try effectively.

Yeah, that's the problem with public shaming or per-developer quotas, agreed.

> We're all trying to build the best system we can here. We've been publishing as much raw data as we can, as well as reports like wait time data, for ages. We're not trying to hide this stuff away.

I understand. My point is just that the data we currently have isn't what we actually want to measure. Wait times for individual parts of a try push don't tell the whole story. If Linux-64 wait times go down, what fraction of people get their full try results faster? (That is, how often is Linux-64 on the critical path for a try push?) I honestly don't know.

>> If we're going to hold anyone publicly accountable, I think it should be the teams which are responsible for ensuring we have enough resources to run builds and tests.
>
> At the same time, it's impossible to give any kind of SLA when the build/test load is unbounded.

I feel like this is over-simplifying the problem to the point of making it impossible to solve; that is, it's a straw man. The requirement for releng/IT is not "build a system which scales to arbitrary load." It's "build a system which scales to current load today and to expected load in the future, adjusting our expectations as time goes on."

I hope we all agree that by this metric, we're currently failing. The current infrastructure does not meet demand. (Indeed, demand is actually higher than the jobs we're currently running, because we would very much like to disable coalescing on m-i, but we can't do that for lack of capacity.)

All I'm saying is that we currently don't have the right public data to determine, after X amount of time has passed, whether we've made any progress in this respect. (This data would be critical even if we /were/ meeting current demand, because it would alert us to future increases in demand above capacity, which would help us avoid getting into the situation we're currently in, where we don't start working on a problem until it starts preventing many people from doing work.)

Whether we make our infra meet demand by decreasing demand (e.g. -p any, which it seems like we all agree is a good idea, or the hook asking for confirmation before pushing multiple try builds) or by increasing capacity is an entirely different question. I'm even open to debating who should ultimately be responsible for making our infra meet demand (although I do think that someone needs to own this). But based on my experience, step zero is measuring.
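To make "measuring" concrete, here's roughly the computation I have in mind: per push, take the time from the push to the last job finishing, then report the mean, the 90th percentile, and how often a given platform was the last to finish (i.e. on the critical path). This is only a sketch against an assumed record format (push_id, push_time, platform, job_end_time); the real data would have to come out of the scheduling database, and the field names here are invented.

    from collections import defaultdict

    def end_to_end_stats(job_records, platform_of_interest="linux64"):
        # job_records: iterable of dicts with the assumed keys
        # push_id, push_time, platform, job_end_time (epoch seconds).
        push_time = {}                 # push_id -> push timestamp
        finish = defaultdict(float)    # push_id -> latest job end time
        last_platform = {}             # push_id -> platform that finished last
        for rec in job_records:
            push_time[rec["push_id"]] = rec["push_time"]
            if rec["job_end_time"] > finish[rec["push_id"]]:
                finish[rec["push_id"]] = rec["job_end_time"]
                last_platform[rec["push_id"]] = rec["platform"]

        durations = sorted(finish[p] - push_time[p] for p in push_time)
        p90 = durations[int(0.9 * (len(durations) - 1))]
        critical = sum(1 for p in last_platform
                       if last_platform[p] == platform_of_interest)
        return {
            "mean_seconds": sum(durations) / len(durations),
            "p90_seconds": p90,
            "critical_path_fraction": critical / len(push_time),
        }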
>> We should have a public dashboard showing end-to-end tryserver times -- starting with a push, how long did it take for all the requested tests to complete? And we should surface not only the mean, but quantiles -- that is, how long were wait times for the 90th percentile of longest wait times?
>
> I can take another stab at this. However, I'm not sure that try is the best branch to do this on, since the best-case end-to-end time varies drastically depending on which platforms/tests were selected, and on whether the user opted to cancel or rebuild jobs later.

That could be. If the raw data consists of tuples of the form (trychooser params, end-to-end time) or (number of jobs requested, end-to-end time), we can play with it and figure out the best way to present the data.

>> We've seen that where we don't have tracking -- e.g. for how long it takes to push to try [1], or basically for anything else at Mozilla -- we often regress the metric we're interested in. You make what you measure. If we want consistently fast try pushes, it's hard to imagine how we'd get there without public data monitoring exactly the thing we're interested in.
>
> I'm sure IT would love some help in figuring out how to measure this.

There were lots of suggestions in the bug. For example, one could periodically push a nop patch queue to try. You could modify the trychooser syntax to support this, but one could also just include invalid trychooser syntax, since at the moment that results in no builds. (That's another one of my pet peeves, but this would let you call it a feature! :)

I got the impression that the bug was stalled because it's not a priority, not because nobody could figure it out. But perhaps I'm mistaken.

-Justin

>> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=691459