Hello TVM community,
I want to implement a custom NPU (accelerator) backend on top of the Relax IR in the current TVM main branch. The goal is to integrate my own codegen and runtime to support specialized quantized matrix operations. I would really appreciate guidance on the following points:

1. What is the best approach to registering and implementing a new target kind for a custom accelerator in the current Relax-based TVM?
2. How should pattern matching or annotation be implemented in Relax to identify and partition the subgraphs that the custom backend can handle?
3. What is the recommended way to implement the runtime module and its `GetFunction` interface so that the custom codegen can call my hardware-specific API?
4. How should arguments such as tensors be extracted and handled safely within the runtime module's packed-function interface?
5. Which build-system integration steps (e.g., CMake, source registration) are needed to enable the custom backend in TVM's build and Python APIs?

I have already read about the Relay BYOC pathway (https://tvm.apache.org/2020/07/15/how-to-bring-your-own-codegen-to-tvm) and would like to understand why that approach does not carry over to Relax, and what the key differences and new steps are for custom backend support in the Relax flow.

Any pointers, examples, or best-practice recommendations would be extremely helpful to get started. Thank you!

To make the questions concrete, below is what I have pieced together so far for each step, mostly from reading the existing contrib integrations; please correct anything I have wrong.
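For question 1, my assumption is that a new target kind is still registered in C++ with the `TVM_REGISTER_TARGET_KIND` macro, the same way the built-in kinds in `src/target/target_kind.cc` are. I am also unsure whether a target kind is strictly required for the BYOC path, or whether the registered codegen name alone is enough. Here `my_npu` and the `march` option are placeholders for my accelerator:

```cpp
// Sketch only: registering a "my_npu" target kind, modeled on the built-in
// kinds in src/target/target_kind.cc. Names and options are placeholders.
#include <tvm/target/target_kind.h>

namespace tvm {

TVM_REGISTER_TARGET_KIND("my_npu", kDLExtDev)  // kDLExtDev: generic extension device
    .add_attr_option<String>("march")          // hypothetical device-generation flag
    .set_default_keys({"my_npu"});

}  // namespace tvm
```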
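For question 2, I found `FuseOpsByPattern` and the dataflow pattern language. The sketch below uses what I believe is the C++ side of that API from `include/tvm/relax/dataflow_pattern.h` and `include/tvm/relax/transform.h`; I may well be holding it wrong, and I am not sure whether the Python `tvm.relax.dpl` route is the intended one instead. The `my_npu.matmul` pattern name and the op choice are placeholders:

```cpp
// Sketch only: build a pattern for a (quantized) matmul and partition matching
// subgraphs into composite functions tagged for the "my_npu" codegen. My
// understanding is that the "my_npu." prefix of the pattern name is what later
// routes the partition to the codegen registered as "relax.ext.my_npu".
#include <tvm/relax/dataflow_pattern.h>
#include <tvm/relax/transform.h>

namespace tvm {
namespace relax {

transform::Pass MakeMyNpuPartitionPass() {
  DFPattern lhs = Wildcard();
  DFPattern rhs = Wildcard();
  DFPattern matmul = IsOp("relax.matmul")(lhs, rhs);  // placeholder op

  Array<transform::FusionPattern> patterns{
      transform::FusionPattern("my_npu.matmul", matmul)};

  // annotate_codegen=true should wrap each match in a function carrying the
  // Codegen attribute (my reading of the FuseOpsByPattern docs).
  return transform::FuseOpsByPattern(patterns, /*bind_constants=*/false,
                                     /*annotate_codegen=*/true);
}

}  // namespace relax
}  // namespace tvm
```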
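For question 3, here is a minimal runtime module sketch, modeled on the contrib runtime modules under `src/runtime/contrib/` and assuming the `ModuleNode`/`PackedFunc` interface I see in `include/tvm/runtime/module.h`. `MyNpuLaunchMatmul` stands in for my hardware-specific API, and I have simplified the symbol handling to a single function name:

```cpp
// Sketch only: a runtime module whose GetFunction returns a closure that calls
// a hypothetical vendor API. Serialization (SaveToBinary/LoadFromBinary) and
// multi-symbol handling are omitted.
#include <tvm/runtime/module.h>
#include <tvm/runtime/packed_func.h>

#include <string>
#include <utility>

// Hypothetical hardware API; in reality this would come from my vendor SDK.
extern "C" void MyNpuLaunchMatmul(DLTensor* a, DLTensor* b, DLTensor* out);

namespace tvm {
namespace runtime {

class MyNpuModuleNode : public ModuleNode {
 public:
  explicit MyNpuModuleNode(std::string symbol) : symbol_(std::move(symbol)) {}

  const char* type_key() const final { return "my_npu"; }

  PackedFunc GetFunction(const String& name, const ObjectPtr<Object>& sptr_to_self) final {
    if (name != symbol_) return PackedFunc(nullptr);
    // Capture sptr_to_self so the module stays alive as long as the closure.
    return PackedFunc([sptr_to_self](TVMArgs args, TVMRetValue* rv) {
      DLTensor* a = args[0];
      DLTensor* b = args[1];
      DLTensor* out = args[2];
      MyNpuLaunchMatmul(a, b, out);
    });
  }

 private:
  std::string symbol_;
};

}  // namespace runtime
}  // namespace tvm
```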
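For question 4, this is how I currently validate arguments inside the packed function. I am assuming `ICHECK` is acceptable for argument validation in a contrib runtime; the 2-D int8 expectations are specific to my quantized matmul, not a TVM rule:

```cpp
// Sketch only: defensive extraction of DLTensor arguments inside the packed
// function body before handing them to the hardware API.
#include <tvm/runtime/logging.h>
#include <tvm/runtime/packed_func.h>

namespace {

void CheckMatmulArgs(tvm::runtime::TVMArgs args) {
  ICHECK_EQ(args.size(), 3) << "expected (a, b, out)";
  for (int i = 0; i < args.size(); ++i) {
    DLTensor* t = args[i];  // works for both NDArray and DLTensor* arguments
    ICHECK(t != nullptr) << "argument " << i << " is not a tensor";
    ICHECK_EQ(t->ndim, 2) << "my_npu matmul expects 2-D tensors";
    ICHECK_EQ(static_cast<int>(t->dtype.code), kDLInt) << "expected quantized int8 tensors";
    ICHECK_EQ(t->dtype.bits, 8);
  }
}

}  // namespace
```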
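For question 5, the source-registration half is where I am guessing the most: from reading the existing Relax contrib codegens, `RunCodegen` seems to look up a global function named `relax.ext.<backend>`, with roughly the signature below (very much an assumption on my part). On the CMake side I would guess a `cmake/modules/contrib/` module guarded by a `USE_MY_NPU` option that appends the new sources to the compiler and runtime source lists, following the existing contrib modules, but I would appreciate confirmation:

```cpp
// Sketch only: registering the codegen entry point that (I believe) RunCodegen
// dispatches to for functions whose Codegen attribute is "my_npu". The exact
// signature is my guess from the existing contrib integrations.
#include <tvm/relax/expr.h>
#include <tvm/runtime/module.h>
#include <tvm/runtime/registry.h>

namespace tvm {
namespace relax {

Array<runtime::Module> MyNpuCompiler(Array<Function> functions,
                                     Map<String, ObjectRef> target_options,
                                     Map<Constant, String> constant_names) {
  Array<runtime::Module> mods;
  for (const Function& func : functions) {
    // Lower each partitioned Relax function to my ISA here and wrap the
    // result in the runtime module from the earlier sketch.
  }
  return mods;
}

TVM_REGISTER_GLOBAL("relax.ext.my_npu").set_body_typed(MyNpuCompiler);

}  // namespace relax
}  // namespace tvm
```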
