[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-17 Thread Varun Nawathey via Apache TVM Discuss
I am not super familiar with the Unity direction, but keeping BYOC sounds like a good idea. I don't know if this is how it's supposed to be used, but I am using it as a "catch-all" way to extend TVM. I'm currently adding some custom OpenCL kernels for depthwise conv2d: the way that I am planning
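For context, a rough sketch of that kind of "catch-all" BYOC flow, using the Relay-style annotate/partition passes; the `my_opencl` codegen name, the depthwise check, and the omitted codegen registration are all hypothetical here, not the poster's actual setup:

```python
# Sketch: partition depthwise conv2d out to a hypothetical "my_opencl" BYOC codegen.
# The actual codegen/runtime registration for "my_opencl" is omitted.
import tvm
from tvm import relay

@tvm.ir.register_op_attr("nn.conv2d", "target.my_opencl")
def _conv2d_supported(expr):
    # only offload grouped (depthwise-style) conv2d in this sketch
    return int(expr.attrs.groups) > 1

x = relay.var("x", shape=(1, 32, 56, 56))
w = relay.var("w", shape=(32, 1, 3, 3))
y = relay.nn.conv2d(x, w, groups=32, channels=32, kernel_size=(3, 3))
mod = tvm.IRModule.from_expr(relay.Function([x, w], y))

mod = relay.transform.AnnotateTarget("my_opencl")(mod)
mod = relay.transform.MergeCompilerRegions()(mod)
mod = relay.transform.PartitionGraph()(mod)
print(mod)  # partitioned functions now carry Compiler="my_opencl"
```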

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-17 Thread Varun Nawathey via Apache TVM Discuss
as long as LLM workloads are still composed of tensor programs, then TVM just has to position itself as a more general tensor program compiler, more so than an ML compiler. The tensor expression and Ansor projects look perfectly suited for this.

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-17 Thread Varun Nawathey via Apache TVM Discuss
Thanks! I'm not familiar with the BitBLAS project. Please correct me if I am wrong: in the code you showed, the IRModule pass that retrieves the threadblock dimensions is [get_annotated_device_mod](https://github.com/microsoft/BitBLAS/blob/2f6d316be9f9d70f2845c2f319ac2f348d0cd6a6/bitblas/uti

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-16 Thread Lei Wang via Apache TVM Discuss
@varunnaw Good point, in my project we use this approach to retrieve attributes, including the dynamic shared memory size and block/grid information, which might be helpful to you. https://github.com/microsoft/BitBLAS/blob/main/bitblas/builder/wrapper/tir.py#L64-L80 ## Why is this important?
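For reference, a minimal sketch of that kind of attribute retrieval over a lowered TIR PrimFunc, assuming the launch dimensions are recorded as `thread_extent` attributes; this is a simplified illustration, not the BitBLAS code itself:

```python
# Sketch: collect grid/block extents from a lowered PrimFunc by visiting
# its "thread_extent" AttrStmt nodes (blockIdx.* / threadIdx.* bindings).
import tvm
from tvm import tir

def extract_launch_dims(prim_func: tir.PrimFunc) -> dict:
    dims = {}
    def visit(node):
        if isinstance(node, tir.AttrStmt) and node.attr_key == "thread_extent":
            # node.node is the bound IterVar; its thread_tag is e.g. "blockIdx.x"
            dims[node.node.thread_tag] = int(node.value)
    tir.stmt_functor.post_order_visit(prim_func.body, visit)
    return dims

# usage (assuming `device_mod` is an IRModule already lowered for a GPU target):
# for gv, func in device_mod.functions.items():
#     print(gv, extract_launch_dims(func))
```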

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-16 Thread Varun Nawathey via Apache TVM Discuss
One suggestion that I have for TVM is to add a cleaner exit from the stack. For example, for OpenCL/CUDA targets, what do I do if I just want the generated kernels? Note: there is a way to print the source for CL, but unfortunately I have not found a way to get the work group / threadblock s
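For what it's worth, a minimal sketch of dumping the generated device source from a built module, assuming the legacy TE flow still present at the time of this thread and a TVM build with the corresponding target enabled:

```python
# Sketch: build a trivial kernel and print the generated device source.
import tvm
from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
bx, tx = s[B].split(B.op.axis[0], factor=64)
s[B].bind(bx, te.thread_axis("blockIdx.x"))
s[B].bind(tx, te.thread_axis("threadIdx.x"))

lib = tvm.build(s, [A, B], target="opencl")  # or target="cuda"
print(lib.imported_modules[0].get_source())  # the generated kernel source
```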

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread Siyuan Feng via Apache TVM Discuss
LLMs are fundamentally transforming the paradigm of ML deployment and compilation. Simultaneously, the increasing complexity of ML optimization pipelines has rendered many legacy components inadequate for meeting rapidly evolving requirements. On the other hand, the open-source community face

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread tqchen via Apache TVM Discuss
that is right, in such a case, we will need to ensure downstream projects are structured to depend on the same libtvm. So both projectA and projectB depend on the same upstream TVM (via include dependency), but also build new optimization transformations on top. That does mean we need to restructu
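As a rough illustration of what "building new transformations on top" of a shared upstream libtvm can look like; the project and pass names here are hypothetical:

```python
# Sketch: a downstream project ("projectA") that only imports upstream tvm
# and layers its own IRModule pass on top, without forking the codebase.
import tvm
from tvm import relax

@tvm.transform.module_pass(opt_level=0, name="ProjectAOptimize")
class ProjectAOptimize:
    def transform_module(self, mod, ctx):
        # reuse upstream transformations, then apply project-specific rewrites
        mod = relax.transform.FoldConstant()(mod)
        # ... projectA-specific rewrites on the shared IR would go here ...
        return mod
```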

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread Lei Wang via Apache TVM Discuss
@tqchen, thanks! This is exactly what we are expecting. However, last time I tried to bring my own tuner into `mlc-llm`, I encountered an issue:

```python
import tvm  # upstream
relax_mod = relax_transform(relax_mod)
import welder
relax_mod = welder.tune(relax_mod)  # something bad happened
```

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread tqchen via Apache TVM Discuss
Thanks @LeiWang1999 , I think the main goal here would be to ensure that the IR remains the common shared part. Different projects can have their own defined transformations and leverage the main code-base. That would enable us to reuse different tuners and transformations of the IR out of t
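A small sketch of what that sharing can look like in practice: upstream relax passes composed with a hypothetical downstream tuner pass over the same IRModule (the `my_project` name is made up here):

```python
# Sketch: compose upstream relax passes with a downstream pass in one pipeline,
# all operating on the shared Relax/TIR IR.
import tvm
from tvm import relax

pipeline = tvm.transform.Sequential(
    [
        relax.transform.LegalizeOps(),   # upstream transformation
        relax.transform.FoldConstant(),  # upstream transformation
        # my_project.transform.Tune(),   # hypothetical downstream tuner pass
    ]
)
# relax_mod = pipeline(relax_mod)
```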

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread Lei Wang via Apache TVM Discuss
Completely agree with these perspectives. Another observation I have is that building projects on top of TVM is often not straightforward; it typically requires hacking the underlying TVM code. For example, in the Ladder project (based on Welder), we added support for MFMA and HIP code gener

[Apache TVM Discuss] [Development] Phasing out Legacy Components

2024-09-15 Thread tqchen via Apache TVM Discuss
Over the past year, the community has worked hard to bring in and transition to a more flexible and productive flow for ML compilers. One lesson we learned is that it is hard to build a silver bullet for everything. Additionally, given the amount of time and energy contributed by community vol