LLMs are fundamentally transforming the paradigm of ML deployment and
compilation. Simultaneously, the increasing complexity of ML optimization
pipelines has rendered many legacy components inadequate for meeting rapidly
evolving requirements.
On the other hand, the open-source community faces
That is right. In such a case, we will need to ensure downstream projects are
structured to depend on the same libtvm. So both projectA and projectB depend
on the same upstream TVM (via an include dependency), but also build new
optimization transformations on top, as sketched below.
That does mean we need to restructure
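To make this layout concrete, here is a minimal sketch, assuming upstream TVM's standard pass infrastructure; `ProjectAFusion` and its body are hypothetical and only illustrate where a downstream project's code would live:

```python
# Sketch: a downstream project ("projectA") layering its own optimization on
# top of upstream TVM without forking it. ProjectAFusion is a made-up name.
import tvm
from tvm import relax


@tvm.transform.module_pass(opt_level=0, name="ProjectAFusion")
class ProjectAFusion:
    """A project-specific pass registered with upstream TVM's pass infra."""

    def transform_module(
        self, mod: tvm.IRModule, ctx: tvm.transform.PassContext
    ) -> tvm.IRModule:
        # Combine upstream passes with project-specific rewrites; the IR stays
        # the shared upstream tvm.IRModule throughout.
        mod = relax.transform.FoldConstant()(mod)
        return mod
```

projectB could register an analogous pass against the same upstream TVM, and the two would interoperate because they operate on the same `tvm.IRModule`.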
@tqchen, thanks! This is exactly what we are expecting. However, last time I
tried to bring my own tuner into `mlc-llm`, I encountered an issue:
```python
import tvm  # upstream TVM

# relax_mod: a Relax IRModule coming out of the mlc-llm compilation flow
relax_mod = relax_transform(relax_mod)  # apply upstream Relax transformations

import welder  # third-party tuner built on its own modified TVM
relax_mod = welder.tune(relax_mod)
# something bad happened
```
Thanks @LeiWang1999, I think the main goal here would be to ensure that the IR
remains a common shared part.
Different projects can have their own transformations and leverage the
main code base. That would enable us to reuse different tuners and
transformations of the IR out of t
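To illustrate this shared-IR idea, here is a minimal sketch; `third_party_tune` is a hypothetical stand-in for an external tuner such as `welder.tune`, and only upstream APIs are assumed:

```python
# Sketch of composing upstream passes with an out-of-tree tuner over the shared
# IR. third_party_tune is hypothetical and stands in for e.g. welder.tune.
import tvm
from tvm import relax


def third_party_tune(mod: tvm.IRModule) -> tvm.IRModule:
    # Placeholder for an external tuner that only needs upstream Relax IR.
    return mod


def build_pipeline(mod: tvm.IRModule) -> tvm.IRModule:
    mod = relax.transform.LegalizeOps()(mod)  # upstream transformation
    mod = third_party_tune(mod)               # downstream tuner, same IR in/out
    return mod
```

Because each stage takes and returns an upstream `tvm.IRModule`, tuners and transformations from different projects can be mixed in a single pipeline without forking TVM.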
Completely agree with these perspectives. Another observation I have is that
developing projects on top of TVM is often not straightforward; they typically
require hacking the underlying TVM code. For example, in the Ladder project
(based on Welder), we added support for MFMA and HIP code generation.
Over the past year, the community has worked hard to bring in and transition to
a more flexible and productive flow for ML compilers. One lesson we learned is
that it is hard to build a silver bullet for everything. Additionally, given
the amount of time and energy contributed by community volunteers