Thanks for the discussions so far. I think everyone agrees on T1: it is good to have a single place that users and developers can look into. T2 and T3 are the ones worth discussing.

Let me try to summarize a few rationales behind the current CI pipeline.

### R0: Use Binary Tags for Reproducibility and Stabilization

During testing, the dependencies being used go beyond python dependencies (what a poetry lock file can capture) and include things like the BLAS version, NNPACK code, and the CUDA runtime.

A **Docker binary image** is still the only reliable source of truth if we want to reproduce the test runs exactly. This is why we opted to use binary tags and stage dependency images during test dependency updates, so developers can pick up these tags.
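To make the idea concrete, here is a sketch of how a developer would pick up a pinned binary tag to reproduce a CI run. The image name and tag below are illustrative assumptions (the real tags are defined in the CI configuration), not a statement of the actual current version:

```shell
# Pull the exact binary image CI used (tag value is illustrative;
# check the repo's CI configuration for the tag actually in use).
docker pull tlcpack/ci-gpu:v0.55

# Drop into the same environment CI tested in, so blas/nnpack/cuda
# versions match the recorded run exactly.
docker run --rm -it tlcpack/ci-gpu:v0.55 /bin/bash
```

Because the tag names an immutable binary artifact rather than a recipe, two developers pulling the same tag get bit-identical environments, which a lock file alone cannot guarantee.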

### R1: Manual Fixes are Unavoidable During Dependency Updates

Changes to APIs like tensorflow and pytorch will likely require upgrades of the test cases, which means manual fixes are unavoidable. This factor also creates a slight preference for building docker images separately, so that incremental changes can be made.

### R2: Simple Flow to Build the Docker Image

Different downstream users might prefer a different CI flow that builds the docker image directly from source. Having a simple flow, `docker/build.sh`, to do so is very useful for these cases.
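For reference, the flow is a one-liner from the repo root; the container name below is an assumption for illustration (the available names correspond to the Dockerfiles under `docker/`):

```shell
# Build a CI image locally from source instead of pulling a binary tag.
# "ci_gpu" is an illustrative container type; see docker/Dockerfile.* for
# the names actually available in the repo.
./docker/build.sh ci_gpu
```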

### Discussions

The above factors might lead to the conclusion that l1 is likely the simplest approach to take. It is also the plan that is immediately actionable.

While l2/l3 indeed capture more python dependencies via a lock, they do not necessarily capture all the dependencies needed (see R0). Additionally, l2/l3 might involve checking a lock file into the repo (which cannot really be reviewed, as we do not immediately know the impact of the change) that was produced from a file already in the repo, creating a cyclic-like dependency in the upgrade process (txt file to lock file).

The original constraint file (which can be approved by a human) and the binary tag (which can be verified by CI) together satisfy the two needs (readability and final pinning), and make R1 generally easier. Additionally, the set of python dependencies can be generated through `pip freeze` in the `ci-gpu` image (or the `ci-cpu` image for cpu-only packages), which means we still meet our original requirement of tracking CI deps.
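The `pip freeze` step could look something like the sketch below; the image tag and output file name are assumptions for illustration:

```shell
# Record the exact python packages installed in the pinned CI image.
# Tag and output filename are illustrative, not the project's actual names.
docker run --rm tlcpack/ci-gpu:v0.55 pip freeze > ci-gpu-frozen-deps.txt

# The resulting file can be diffed across image versions to see what
# changed between dependency updates.
```

This keeps the human-readable constraint file as the reviewed input, while the frozen list is a generated artifact derived from the binary image, so there is no cyclic dependency between the two.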





---
[Visit 
Topic](https://discuss.tvm.apache.org/t/rfc-python-dependencies-in-tvm-ci-containers/9011/6)
 to respond.
