[Apache TVM Discuss] [Development] Phasing out Legacy Components

tqchen via Apache TVM Discuss Sun, 15 Sep 2024 07:55:31 -0700


Over the past year, the community has worked hard to bring in and transition to 
a more flexible and productive flow for ML compilers. One lesson we learned is 
that it is hard to build a silver bullet for everything. Additionally, given 
the amount of time and energy contributed by community volunteers, it is hard 
to build and maintain a single compiler pipeline that aims to fit all purposes, 
in our case, all backends and use cases. The engineering complexity inevitably 
grows as we try to grow the combination of models and backends and aim to fit 
everything into a single pipeline.

However, this does not render ML compilers useless. In fact, ML compilers are
becoming increasingly important as new workloads, hardware primitives, and
vertical use cases arise. Instead, the development and continuous improvement
of vertical ML compilers should be part of the ML engineering process.
Additionally, by enabling such productive ML compiler development, we can afford
to bring up vertical-specific compiler optimizations for key use cases like
LLMs and image detection models.

With that goal in mind, we still need to answer the question of “what common
infrastructure should we provide that can be shared across vertical flows”.
The answer to this question has evolved since the project started. Over the
past year, the community has converged toward the following pattern:

![image|690x400](upload://fVuAv13rsFkirsyVxudIhom3f2Y.png)

- Every program is encapsulated by an IRModule, with python-first
printing/parsing support via TVMScript
- Optimizations and lowering are implemented as composable transformations on
the IRModule
- A universal runtime mechanism (through the tvm ffi) that naturally maps an
IRModule to a runnable component across different environments.

Throughout all these flows, TVMScript serves as a common tool to inspect and
communicate the intermediate steps. By adopting this common flow, different
optimizations and vertical compiler building can happen more organically.
Importantly, it also allows us to strengthen the core while **allowing
downstream projects to add necessary customizations** and still make use of the
existing pipelines when needed. Moving towards the lightweight flow also brings
extra benefits in terms of testing. Because most of the optimizations and
importers are tested via structural equality, we benefit from reduced test time
and more unit-level correctness checks.
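To make the pattern concrete, here is a minimal sketch in plain Python. It does not use the TVM API: `ToyModule`, `fuse_add_relu`, `remove_identity`, and `sequential` are hypothetical stand-ins invented for illustration. The idea it shows is the one described above: each pass maps a module to a new module, passes compose into a pipeline, and a test compares the result against an expected module by structure rather than by running it. In TVM itself the modules would typically be written in TVMScript and compared with `tvm.ir.assert_structural_equal`.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Tuple

# Toy stand-in for an IRModule (NOT the TVM class): an immutable collection
# of (function name, sequence of op names) pairs.
@dataclass(frozen=True)
class ToyModule:
    functions: Tuple[Tuple[str, Tuple[str, ...]], ...]

# A pass is any function from module to module.
Pass = Callable[[ToyModule], ToyModule]

def remove_identity(mod: ToyModule) -> ToyModule:
    """Hypothetical pass: drop every "identity" op."""
    return ToyModule(tuple(
        (name, tuple(op for op in ops if op != "identity"))
        for name, ops in mod.functions
    ))

def fuse_add_relu(mod: ToyModule) -> ToyModule:
    """Hypothetical pass: rewrite adjacent ("add", "relu") into "add_relu"."""
    def rewrite(ops: Tuple[str, ...]) -> Tuple[str, ...]:
        out = []
        i = 0
        while i < len(ops):
            if ops[i:i + 2] == ("add", "relu"):
                out.append("add_relu")
                i += 2
            else:
                out.append(ops[i])
                i += 1
        return tuple(out)
    return ToyModule(tuple((name, rewrite(ops)) for name, ops in mod.functions))

def sequential(*passes: Pass) -> Pass:
    """Compose passes left to right into a single module-to-module pass."""
    def run(mod: ToyModule) -> ToyModule:
        return reduce(lambda m, p: p(m), passes, mod)
    return run

# Unit-level test: build the input and the expected output, run the
# pipeline, and compare structurally. Nothing is compiled or executed.
before = ToyModule((("main", ("matmul", "identity", "add", "relu")),))
expected = ToyModule((("main", ("matmul", "add_relu")),))

pipeline = sequential(remove_identity, fuse_add_relu)
after = pipeline(before)
assert after == expected  # dataclass equality compares fields structurally
```

Because the check only compares two data structures, a test like this needs no target hardware and no end-to-end model run, which is where the savings in test time come from.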

Most new development now centers on the new modular flow. In the
meantime, we have kept the legacy components around for one year. We
have started to see challenges as some components fall out of maintenance due
to a lack of development. Additionally, because of the way some of the legacy
components are structured, many tests require integration testing (instead of
structural equality checks), taking up much CI time and resources.

This post calls for us to move away from legacy components towards the new
flow, specifically:

- Move away from relay toward relax as the graph IR
- Use TensorIR for tensor program scheduling over te/schedule
- te remains a useful tool to construct `tir.PrimFunc`, but not
necessarily for the scheduling part.
- Use dlight-style IRModule⇒IRModule transforms for rule-based scheduling
that is compatible with the modular flow.
- Use MetaSchedule for autotuning, over autotvm and autoschedule

We will encourage community contributions that center on the new flow,
including improving frontends and modular optimizations based on the new
approach. Importantly, these latest improvements will have less overhead for
testing and less technical coupling in general, as most of them can be covered
by structural equality tests using TVMScript and the IRModule⇒IRModule
mechanism, without introducing new mechanisms. Feel free to share thoughts.

As we gradually phase out the legacy components, they will remain available
through release branches and will continue to take maintenance patches. Coming
back to the context, the field of ML/AI is moving even faster, and we have gone
through several major changes in the recent wave of GenAI. These challenges are
unique, and call for ML projects to reinvent themselves to stay relevant, or
become irrelevant. After one more year of development and learning with the new
flow, it is the right time for us to start the move.

---
[Visit
Topic](https://discuss.tvm.apache.org/t/phasing-out-legacy-components/17703/1)
to respond.
