Hello TVM community,

I want to implement a custom NPU (accelerator) backend on top of the Relax IR
in the current TVM main branch. The goal is to integrate my custom codegen and
runtime so that specialized quantized matrix operations can be offloaded to the
hardware.

I would really appreciate guidance on the following points:

1. What is the best approach to register and implement a new target kind for a
custom accelerator in the current Relax-based TVM?
2. How should I implement pattern matching or annotation in Relax to identify
and partition subgraphs compatible with the custom backend? (My current
attempt is sketched right after this list.)
3. What is the recommended way to implement the runtime module and its
`GetFunction` interface so that the custom codegen can call my
hardware-specific API?
4. How should arguments such as tensors be extracted and handled safely within
the runtime module's packed-function interface? (My Python-only prototype
follows below.)
5. What are the essential build-system integration steps (e.g., CMake options,
source registration) needed to enable the custom backend in TVM's build and
Python APIs? (My planned wiring and a registration check are sketched below.)

I have already read the Relay BYOC guide
(https://tvm.apache.org/2020/07/15/how-to-bring-your-own-codegen-to-tvm) and
would like to understand why that approach does not carry over directly to
Relax, and what the key differences and new steps for custom backend support
in the Relax flow are.
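
For reference, this is how I expect to drive the result end to end once
codegen succeeds (a sketch reusing `partition_for_my_npu` from above; `mod` is
my model's IRModule and the shapes are arbitrary):

```python
import numpy as np
import tvm
from tvm import relax

mod = partition_for_my_npu(mod)        # offload matmuls to "my_npu"
ex = relax.build(mod, target="llvm")   # host target for the residual graph
vm = relax.VirtualMachine(ex, tvm.cpu())

x = tvm.nd.array(np.random.randint(-8, 8, size=(32, 64)).astype("int8"))
w = tvm.nd.array(np.random.randint(-8, 8, size=(64, 16)).astype("int8"))
print(vm["main"](x, w))
```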

Any pointers, examples, or best-practice recommendations would be extremely
helpful for getting started.

Thank you!
