@Mousius thanks for this RFC and apologies for the long delay. I read this in conjunction with [[RFC] Arm® Ethos™-U Integration](https://discuss.tvm.apache.org/t/rfc-arm-ethos-u-integration/10504) to try to understand the initial application. I think that should be a sufficient example, but let me know if there are other use cases I should consider.
I discussed this with @jroesch and mbs-octoml at length a couple of days ago, and I'm documenting our discussion here. Overall:

- We agree there should be a way to leverage external codegen without recreating the entire compilation pipeline.
- We want to ensure that this work is compatible with the ongoing TECompiler refactor. Specifically, that refactor is now moving towards unifying Relay -> TIR lowering (and later unifying further down the pipeline) across the Graph, AOT, and VM executors.
- To that end, the case for a `relay_to_tir` hook and a `tir_to_runtime` hook seems clear. We'd like to clarify the interface of the `relay_to_tir` hook, and propose:

  ```
  relay_to_tir(const IRModule& ir_module, const relay::Function& function) -> (IRModule, GlobalVar)
  ```

  The contract is that TVM calls this interface with a read-only view of the IRModule containing `function`, plus the function to lower. The hook implementation should return an IRModule containing one or more functions that implement the lowered Relay function, plus a GlobalVar giving the symbol name of the "top-level" function of that operator (in case multiple TIR functions are created to implement the operator). At present, TVM keeps the returned IRModule separate from the remaining lowered code. In the future, as part of the TECompiler refactor, TVM will merge the returned IRModule with all other TIR functions, handling name conflicts.
- For the `tir_to_runtime` hook, we presume this will follow the existing `relay.ext.` interface, except that it will be specific to the target rather than to a compiler attribute marked on the Relay Function.
- In terms of user interface: in theory it should be possible to hand TVM an unannotated Relay function plus a Target that specifies the available CPUs/accelerators, and TVM should use its knowledge of schedules to assign functions to devices. Currently, we either specify a mostly homogeneous target or manually mark functions to be run externally.
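To make the proposed `relay_to_tir` contract a bit more concrete, here is a rough Python sketch of its shape. None of this is TVM API: `GlobalVar`, `Function`, and `Module` are hypothetical stand-ins that I made up for illustration, and the "lowering" just fabricates two placeholder functions. It only shows the three parts of the contract: a read-only view going in, a module with one or more functions coming out, and a symbol naming the top-level one.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, Mapping, Tuple


@dataclass(frozen=True)
class GlobalVar:
    """Hypothetical stand-in for a global symbol (not tvm.ir.GlobalVar)."""
    name_hint: str


@dataclass(frozen=True)
class Function:
    """Hypothetical stand-in for a Relay or TIR function: a name plus a body string."""
    name: str
    body: str


@dataclass
class Module:
    """Hypothetical stand-in for an IRModule: maps GlobalVar to Function."""
    functions: Dict[GlobalVar, Function] = field(default_factory=dict)

    def read_only_view(self) -> Mapping[GlobalVar, Function]:
        # The hook may inspect the enclosing module but must not mutate it.
        return MappingProxyType(self.functions)


def relay_to_tir(
    ir_module: Mapping[GlobalVar, Function], function: Function
) -> Tuple[Module, GlobalVar]:
    """Toy hook: 'lowers' one function into a helper plus a top-level entry."""
    helper = GlobalVar(function.name + "_helper")
    top = GlobalVar(function.name + "_main")
    lowered = Module({
        helper: Function(helper.name_hint, "tir helper for " + function.body),
        top: Function(top.name_hint, "tir entry calling " + helper.name_hint),
    })
    # Return the new module plus the symbol of the top-level function.
    return lowered, top


# Usage: the caller keeps each returned module separate, keyed by the
# function it lowered (mirroring the "kept separate at present" behaviour).
conv = Function("conv", "nn.conv2d")
relay_mod = Module({GlobalVar("conv"): conv})
lowered_mod, top_level = relay_to_tir(relay_mod.read_only_view(), conv)
per_function_modules = {conv.name: (lowered_mod, top_level)}
```

Returning the `GlobalVar` alongside the module is what lets the caller find the entry point without guessing at a naming convention, which matters once a hook emits more than one function.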
In the future, we're considering the following interface: either TVM assigns each function call to a target, or you override this by marking the call manually with a per-call-site or per-function attribute. In that case, the target contained in the attribute is not a composite Target, but a shorthand descriptor for one of the pieces of the overall Target. For example, the Target could be specified as `low_power_cpu: c -mcpu=cortex-m0; inference_cpu: c -mcpu=cortex-m7f`, and call sites could be assigned to either `low_power_cpu` or `inference_cpu`.

Does this sketch of a direction align with how you'd like to enable these target hooks?
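For what it's worth, here is a rough Python sketch of how the named-target shorthand could be read. This is not an existing TVM parser: `parse_named_targets` and `resolve_call_site_target` are hypothetical names, and the `name: target; name: target` grammar is only my reading of the example string above.

```python
from typing import Dict, Optional


def parse_named_targets(spec: str) -> Dict[str, str]:
    """Split 'name: target; name: target' into a {name: target string} dict."""
    targets: Dict[str, str] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        name, sep, target = entry.partition(":")
        name, target = name.strip(), target.strip()
        if not sep or not name or not target:
            raise ValueError(f"malformed target entry: {entry!r}")
        if name in targets:
            raise ValueError(f"duplicate target name: {name!r}")
        targets[name] = target
    return targets


def resolve_call_site_target(
    targets: Dict[str, str], annotation: Optional[str], assigned: str
) -> str:
    """A manual annotation, if present, overrides the automatically assigned name."""
    name = annotation if annotation is not None else assigned
    if name not in targets:
        raise KeyError(f"unknown target name: {name!r}")
    return targets[name]


targets = parse_named_targets(
    "low_power_cpu: c -mcpu=cortex-m0; inference_cpu: c -mcpu=cortex-m7f"
)
# No annotation: fall back to whatever the (hypothetical) automatic assignment chose.
auto_target = resolve_call_site_target(targets, None, "low_power_cpu")
# Annotated call site: the shorthand name wins over the automatic choice.
manual_target = resolve_call_site_target(targets, "inference_cpu", "low_power_cpu")
```

The point of the shorthand is that an attribute only needs to carry a short name such as `inference_cpu`, and the full target string is looked up from the overall Target.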