You might want to look into the BYOC (Bring Your Own Codegen) flow: [TVM Blog - How to Bring Your Own Codegen to TVM](https://tvm.apache.org/2020/07/15/how-to-bring-your-own-codegen-to-tvm)
It looks like a very good fit for your task. You will most likely need to do three things:

1. Define which subgraphs and nodes need to be mapped to your accelerator.
2. Interface the Relay graph-level IR with your code generation (you do not need to expose it to TVM).
3. Build an interface for runtime/execution through which TVM and your accelerator can exchange input and output tensors.
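
To make the three steps a bit more concrete, here is a toy model in plain Python. It is not TVM code and uses no TVM API: the operator set `ACCEL_OPS`, the `Region` class, and the `partition`/`run` helpers are all made up for illustration, and the "graph" is just a linear list of operator names. In TVM itself, as far as I know, step 1 is handled by the `AnnotateTarget`, `MergeCompilerRegions`, and `PartitionGraph` passes in `relay.transform`, which the blog post walks through.

```python
# Toy model of BYOC-style partitioning; nothing here is TVM API.
from dataclasses import dataclass

# Step 1 (hypothetical): operators the accelerator claims to support.
ACCEL_OPS = {"nn.conv2d", "nn.relu", "add"}


@dataclass
class Region:
    target: str  # "accel" or "tvm"
    ops: list    # operator names assigned to this region


def partition(ops):
    """Group consecutive ops with the same target into one region."""
    regions = []
    for op in ops:
        target = "accel" if op in ACCEL_OPS else "tvm"
        if regions and regions[-1].target == target:
            regions[-1].ops.append(op)
        else:
            regions.append(Region(target, [op]))
    return regions


def run(regions, tensor, executors):
    """Step 3: pass the tensor across each region boundary in order."""
    for region in regions:
        tensor = executors[region.target](region.ops, tensor)
    return tensor


trace = []


def make_executor(name):
    # Step 2 stand-in: a real backend would generate and run code for `ops`.
    def execute(ops, tensor):
        trace.append((name, tuple(ops)))
        return tensor  # placeholder: the tensor is passed through unchanged
    return execute


model = ["nn.conv2d", "nn.relu", "nn.softmax", "add"]
regions = partition(model)
executors = {"accel": make_executor("accel"), "tvm": make_executor("tvm")}
result = run(regions, [1.0, 2.0], executors)
```

Here `nn.softmax` is not in the supported set, so it stays on the TVM side and splits the accelerator work into two regions. Each boundary between regions is a point where tensors have to be handed over, which is what the runtime interface in step 3 has to cover.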