As part of a larger effort to improve the experience of debugging intermittent failures, I've been looking at reducing the time that common "try" workloads take for developers (so that, for example, retriggering a job to reproduce a failure can happen faster).

Of course, the common advice of "profile before you optimize" applies here. That is to say, we want to spend time optimizing what makes this particular task painful, rather than optimizing try in general.

To start with, I've been using a manually generated dump of the jobs posted to try on Treeherder over a 10-day span. You can see some preliminary results of my manipulating the data to get some rough information (the last job for each try push, and the total machine time for each job) here:

http://nbviewer.jupyter.org/url/people.mozilla.org/%7Ewlachance/try%20analysis.ipynb
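
For anyone who wants to try the same kind of aggregation without opening the notebook, here is a rough sketch of the two numbers mentioned above (last job per push, total machine time per job). It is not the notebook's code: the record layout and field names (push_id, job_name, start, end) and the sample values are made up for illustration and are not Treeherder's actual schema.

```python
from collections import defaultdict

# Hypothetical job records. Field names and values are assumptions made
# for this sketch; times are in seconds.
jobs = [
    {"push_id": 1, "job_name": "build-linux64", "start": 0, "end": 1800},
    {"push_id": 1, "job_name": "mochitest-1", "start": 1900, "end": 4300},
    {"push_id": 2, "job_name": "build-linux64", "start": 100, "end": 2000},
    {"push_id": 2, "job_name": "mochitest-1", "start": 2100, "end": 3000},
    {"push_id": 2, "job_name": "mochitest-1", "start": 3100, "end": 4100},
]


def last_job_per_push(jobs):
    """Return {push_id: the job record with the latest end time}."""
    last = {}
    for job in jobs:
        current = last.get(job["push_id"])
        if current is None or job["end"] > current["end"]:
            last[job["push_id"]] = job
    return last


def machine_time_per_job(jobs):
    """Return {job_name: total seconds spent across all runs of that job}."""
    totals = defaultdict(int)
    for job in jobs:
        totals[job["job_name"]] += job["end"] - job["start"]
    return dict(totals)


if __name__ == "__main__":
    for push_id, job in sorted(last_job_per_push(jobs).items()):
        print(push_id, job["job_name"], job["end"])
    print(machine_time_per_job(jobs))
```

The end time of the last job is only a proxy for how long a developer waits on a push, and summing durations per job name counts retriggers as extra machine time, which may or may not be what you want.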

Before I go much further, I'd love to know if anyone has done similar explorations in the recent past, and if there are any aggregation points other than Treeherder (Taskcluster? Buildbot?) which store this information. My working assumption is that Treeherder is the best place to analyze it (since it acts as a central point of aggregation), but I'm open to being proven wrong.

Also, accounts of specific try workloads of this type which are annoying/painful would be helpful. :) I think I have a rough idea of the particular type of try push I'm looking for (not pushed by release operations, at least one retrigger) but it would be great to get firsthand confirmation of that.
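
To make the "not pushed by release operations, at least one retrigger" criterion concrete, here is a rough sketch of how such pushes might be picked out of a job dump. Everything specific in it is an assumption: the field names, the example accounts, and especially the heuristic that a retrigger shows up as the same job name appearing more than once within a push.

```python
from collections import Counter, defaultdict

# Hypothetical set of release operations accounts (placeholder address).
RELEASE_ACCOUNTS = {"releng@example.com"}

# Hypothetical job records; field names are assumptions for this sketch.
sample_jobs = [
    {"push_id": 1, "author": "dev@example.com", "job_name": "mochitest-1"},
    {"push_id": 1, "author": "dev@example.com", "job_name": "mochitest-1"},
    {"push_id": 2, "author": "dev@example.com", "job_name": "mochitest-1"},
    {"push_id": 3, "author": "releng@example.com", "job_name": "build"},
    {"push_id": 3, "author": "releng@example.com", "job_name": "build"},
]


def pushes_of_interest(jobs, excluded_authors):
    """Return sorted push ids with a repeated job and a non-excluded author."""
    runs = defaultdict(Counter)  # push_id -> how many times each job ran
    authors = {}                 # push_id -> author of the push
    for job in jobs:
        runs[job["push_id"]][job["job_name"]] += 1
        authors[job["push_id"]] = job["author"]
    return sorted(
        push_id
        for push_id, counts in runs.items()
        if authors[push_id] not in excluded_authors
        and any(n > 1 for n in counts.values())
    )


if __name__ == "__main__":
    print(pushes_of_interest(sample_jobs, RELEASE_ACCOUNTS))
```

If the real data records retriggers explicitly, that would be a better signal than counting repeated job names.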

Will
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform