Thanks for putting this together! :)

I just have a question, and an idea which may or may not be doable...

On 01/24/2018 03:55 PM, Josh Bowman-Matthews wrote:
> Hi folks! I've recently noticed some inefficient usage of our CI
> resources, and I wanted to make sure that everybody is aware of the best
> practices.
> 
> #1: The Buildbot grid
> 
> http://build.servo.org/grid exists, and shows you all of the failures
> across all builders for a given revision. You can expand the view as
> wide as you want:
> 
> http://build.servo.org/grid?width=20
> 
> Similarly, if you prefer a vertical view instead:
> 
> http://build.servo.org/tgrid?length=10
> 
> This view still requires you to figure out the association between git
> revisions and PRs, unfortunately, so I've filed
> https://github.com/servo/servo/issues/19853 about creating something
> more useful.
> 
> http://build.servo.org/console also exists, but doesn't always show the
> information in the most efficient way possible.
> 
> 
> 
> #2: Try builds do not get automatically cancelled
> 
> Retrying a try build while an existing one is in progress means we end
> up using up our CI resources much less efficiently. We do not have the
> ability to retry individual failed builds or platforms, so please use
> the grid (see #1) to decide if retrying will actually tell you anything
> useful.
> 
> 
> #3: ARM, Android, and Windows builds do not run tests
> 
> Recently some of our builders have been having some problems that
> manifest as weird failures when performing basic git operations. Since
> homu reports these failures immediately, it can be tempting to just
> retry the whole operation. This consumes a lot of extra build power (see
> #2), and if a PR is not expected to affect the build on the failed
> machine in any way (since they don't run tests), please consider relying
> on the grid (see #1) rather than retrying until you get a success notice
> from homu.

Just to make sure: how should we deal with one of these failures when it
happens and prevents a PR from landing?

I suspect right now `retry` is the only thing we have, and as you
noticed, retrying right away is not great... Should we just wait until
the rest of the builds are over? Should we teach homu to not retrigger
runs that are still in progress?

Kind of related: it'd be great to have a way to report the status for
all the platforms on the PR without having to look at the grid, instead
of just getting the first one that fails... Maybe it would be doable to
extend the GitHub support for CI services (the same "Some checks were
not successful" UI that shows Appveyor / Travis / TaskCluster) to report
one entry per platform homu runs on, instead of just a single one?

Or maybe just a command that reports the status of the try build for
that PR could be good enough... Not sure :)

 -- Emilio

> 
> That's all! Thanks for being considerate in how you make use of our
> shared resources while we work on finding better solutions for these
> issues.
> 
> Cheers,
> Josh
> _______________________________________________
> dev-servo mailing list
> dev-servo@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-servo