Hi,

First, I'd like to introduce myself to the MXNet community. I’m Joe,
and I will be working with Chai and the AWS team to address some issues
around MXNet CI. One of our goals is to reduce the cost of running
MXNet CI. The task I’m working on now is this issue:


https://github.com/apache/incubator-mxnet/issues/17802


Proposal: Staggered Jenkins CI pipeline


Based on data collected from Jenkins, around 55% of the time when the
mxnet-validation CI build is triggered by a PR, either the sanity or
unix-cpu build fails. When either of these builds fails, it doesn’t make
sense to run the rest of the pipelines and use all those resources, since
we’ve already identified a build or unit test failure.
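
As a rough illustration of the potential saving, here is a minimal sketch of the expected cost per PR-triggered run. The cost units are hypothetical placeholders; only the 55% failure rate comes from the Jenkins data above. It also assumes a failed sanity or unix-cpu build means none of the other pipelines start.

```python
def expected_cost(gate_cost, rest_cost, gate_failure_rate, staggered):
    """Expected cost of one PR-triggered CI run, in arbitrary cost units."""
    if staggered:
        # The remaining pipelines run only when the sanity/unix-cpu builds pass.
        return gate_cost + (1.0 - gate_failure_rate) * rest_cost
    # Today every pipeline starts at once, so the full cost is always paid.
    return gate_cost + rest_cost


# Hypothetical cost units: 1 for sanity + unix-cpu, 10 for everything else.
# Only the 0.55 failure rate is taken from the Jenkins data.
current = expected_cost(1.0, 10.0, 0.55, staggered=False)
proposed = expected_cost(1.0, 10.0, 0.55, staggered=True)
print(f"current={current:.2f} proposed={proposed:.2f}")
```

Under these assumptions the saving is roughly 55% of the cost of the remaining pipelines, whatever their real cost turns out to be.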


We propose changing the MXNet Jenkins CI pipeline to require the
*sanity* and *unix-cpu* builds to complete and pass their tests
before starting the other build pipelines (centos-cpu/gpu, unix-gpu,
windows-cpu/gpu, etc.). Once the sanity and unix-cpu builds complete
successfully, the remaining build pipelines will be triggered and run in
parallel (as they currently do). The purpose of this change is to identify
faulty code or compatibility issues early and prevent further execution of
CI builds. This will increase the time required to test a PR, but will
prevent unnecessary builds from running.
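
The staggered flow described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the actual Jenkins configuration: `run_pipeline` is a hypothetical callable that returns True when a pipeline passes, the remaining pipeline list is abbreviated (the original list ends in "etc."), and running sanity and unix-cpu in parallel with each other is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Builds that must pass before anything else starts.
GATE_PIPELINES = ["sanity", "unix-cpu"]
# Abbreviated list of the remaining pipelines.
REMAINING_PIPELINES = [
    "centos-cpu", "centos-gpu", "unix-gpu", "windows-cpu", "windows-gpu",
]


def run_all(names, run_pipeline):
    """Run the named pipelines in parallel and return {name: passed}."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return dict(zip(names, pool.map(run_pipeline, names)))


def staggered_ci(run_pipeline):
    """Run the gate pipelines first; start the rest only if every gate passes."""
    results = run_all(GATE_PIPELINES, run_pipeline)
    if not all(results.values()):
        return results  # stop early: the remaining pipelines never start
    results.update(run_all(REMAINING_PIPELINES, run_pipeline))
    return results
```

For example, calling `staggered_ci(lambda name: name != "unix-cpu")` simulates a unix-cpu failure and returns results for the two gate builds only.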


Does anyone have any concerns with this change or suggestions?


Thanks.

Joe Evans

[email protected]
