Thanks @leandron for the proposal! I agree this will be a great help in monitoring the container rebuild process for problems and should reduce the headache typically involved with updating containers.
I think we could prototype this first using https://discuss.tvm.apache.org/t/ci-how-to-run-your-own-tlcpack-ci-and-proposing-future-improvements-to-https-ci-tlcpack-ai/10123 so we can iterate without impacting CI runtime, and then migrate it to the production TVM CI to run at night when CI load is lighter. Below I scope out a couple of ideas for future work which may help to motivate this project.

#### Future work: use autobuilt containers for production TVM CI

I think it would be interesting to implement this and then consider only allowing containers built by this process to be promoted to official `tlcpack/ci-*` containers. It's likely we would need some additional work on top of this to provide a flexible enough interface (e.g. build selected containers on-demand, likely gated to committers) to support this workflow. However, the benefit is that all containers would then be built from a known clean revision of TVM, so a reproducible build is more likely to occur. To be sure, this approach doesn't provide 100% reproducibility (the container build process contains a bunch of external dependencies, e.g. apt packages, LLVM, etc.), but it ensures those dependencies are documented and provides us a path to collaborate on future movement in that direction, should we so desire.

#### Future work: Build status dashboard

I think it would be great to also consider creating a concise status dashboard that shows a matrix of the build outcomes by container and date. This would make it easy to diagnose failures and bisect the range of PRs which may be suspect.

#### Future work: TVM Python dependencies

https://discuss.tvm.apache.org/t/rfc-python-dependencies-in-tvm-ci-containers/9011 proposed some efforts to capture the set of Python deps used in the CI and improve their consistency. With this process in place, we should be able to finally build the constraints list of x86_64 dependencies. This would allow us to ensure that Python packages in ci-cpu, ci-gpu, and ci-lint match.
This has been a point of confusion for me when debugging CI failures in the past.
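
As a rough illustration of the kind of consistency check the Python dependencies idea could enable, here is a minimal sketch. It is not existing TVM tooling. It assumes we can capture `pip freeze`-style output from each container (for example by running `python3 -m pip freeze` inside it); the helper names `parse_freeze` and `find_mismatches` and the package versions in the example are hypothetical.

```python
from typing import Dict


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse `pip freeze`-style lines ("name==version") into {name: version}."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not pinned with "==".
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(freezes: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Return packages that are pinned to different versions across containers.

    `freezes` maps a container name to its `pip freeze` output. A package that
    is simply absent from some containers is not reported; only conflicting
    versions among the containers that do install it are.
    """
    parsed = {container: parse_freeze(text) for container, text in freezes.items()}
    all_packages = set().union(*(pins.keys() for pins in parsed.values()))
    mismatches: Dict[str, Dict[str, str]] = {}
    for package in sorted(all_packages):
        versions = {c: pins[package] for c, pins in parsed.items() if package in pins}
        if len(set(versions.values())) > 1:
            mismatches[package] = versions
    return mismatches


if __name__ == "__main__":
    # Hypothetical freeze output; the versions are made up for illustration.
    example = {
        "ci-cpu": "numpy==1.19.5\nattrs==20.3.0\n",
        "ci-gpu": "numpy==1.20.1\nattrs==20.3.0\n",
        "ci-lint": "attrs==20.3.0\n",
    }
    for package, versions in find_mismatches(example).items():
        print(package, versions)
```

A nightly job could fail, or just flag the mismatch on the dashboard, whenever `find_mismatches` returns a non-empty result. Whether a package missing from one container should also count as a mismatch is a policy choice this sketch leaves open.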