@Mousius thanks for this RFC and apologies for the long delay. I read this in
conjunction with [[RFC] Arm® Ethos™-U
Integration](https://discuss.tvm.apache.org/t/rfc-arm-ethos-u-integration/10504)
to try to understand the initial application. I think that should be a
sufficient example, but let me know if there are other use cases I should
consider.
I discussed this with @jroesch and mbs-octoml at length a couple days ago.
Documenting our discussion here.
Overall:
- We agree there should be a way to leverage external codegen without
recreating the entire compilation pipeline.
- We want to ensure that this work is compatible with the ongoing TECompiler
refactor. Specifically, that refactor is now going to move towards unifying
Relay -> TIR lowering (and later unifying further down the pipeline) across
the Graph, AOT, and VM executors.
- To that end, the case for a `relay_to_tir` hook and a `tir_to_runtime` hook
seems clear. We'd like to clarify the interface of the `relay_to_tir` hook, and
propose:
```
relay_to_tir(const IRModule& ir_module, const relay::Function& function) ->
(IRModule, GlobalVar)
```
The contract is that TVM calls this interface with a read-only view of the
IRModule containing `function`, plus the function in question to lower. The
hook implementation should return an IRModule containing one or more functions
implementing the lowered Relay function, plus a GlobalVar indicating the symbol
name of the "top-level" function of that operator (in case multiple TIR
functions are created to implement the operator).
At present, TVM keeps the returned IRModule separate from the remaining
lowered code. In the future, as part of the TECompiler refactor, TVM will merge
the returned IRModule in with all other TIR functions, handling name conflicts.
- For the `tir_to_runtime` hook, we presume this will follow the existing
`relay.ext.` interface, except that it will be specific to the target rather
than to a compiler attribute marked onto the Relay Function.
- In terms of user interface: theoretically it should be possible to hand TVM
an unannotated Relay function plus a Target which specifies the available
CPUs/accelerators, and TVM should leverage its knowledge of schedules to assign
functions to devices. Currently, we either specify a mostly homogeneous target
or manually mark functions to be run externally. In the future, we're
considering an interface where either TVM assigns each function call to a
target, or you override this and mark it manually using a per-call-site or
per-function attribute. In this case, the target contained in that attribute is
not a composite Target, but instead a shorthand descriptor for one of the
pieces of the overall Target. For example, Target could be specified as
`low_power_cpu: c -mcpu=cortex-m0; inference_cpu: c -mcpu=cortex-m7f`, and call
sites could be assigned to either `low_power_cpu` or `inference_cpu`. Does this
sketch of a direction align with how you'd like to enable these target hooks?
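
To make the flow above a bit more concrete, here is a rough Python sketch of how the pieces might fit together: named sub-targets parsed from the example string, a per-call-site override selecting one of them, a hook that returns a lowered module plus its top-level symbol, and a merge step that resolves name conflicts. Everything here is an assumption for illustration. `Module`, `GlobalVar`, `parse_named_targets`, `example_hook`, `merge`, and the `hooks` registry are stand-ins, not TVM classes or APIs, and function bodies are plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


# Stand-in types for the sketch; these are NOT the real TVM classes.
@dataclass(frozen=True)
class GlobalVar:
    name_hint: str


@dataclass
class Module:
    functions: Dict[GlobalVar, str] = field(default_factory=dict)


# Hypothetical hook type mirroring the proposed
# relay_to_tir(ir_module, function) -> (IRModule, GlobalVar).
Hook = Callable[[Module, str], Tuple[Module, GlobalVar]]


def parse_named_targets(spec: str) -> Dict[str, str]:
    """Split 'name: target; name: target' into {name: target}."""
    targets = {}
    for entry in spec.split(";"):
        if not entry.strip():
            continue
        name, _, target = entry.partition(":")
        targets[name.strip()] = target.strip()
    return targets


def example_hook(ir_module: Module, function: str) -> Tuple[Module, GlobalVar]:
    """Toy hook: 'lowers' one function into an entry point plus a helper.

    ir_module is treated as read-only and is not modified.
    """
    entry = GlobalVar(f"{function}_main")
    helper = GlobalVar(f"{function}_helper")
    lowered = Module({entry: f"tir({function})", helper: f"tir_helper({function})"})
    return lowered, entry


def merge(dest: Module, lowered: Module, entry: GlobalVar) -> GlobalVar:
    """Copy lowered functions into dest, renaming on conflict.

    Returns the (possibly renamed) top-level symbol. A real merge would also
    have to rewrite references to renamed functions; this toy does not.
    """
    taken = {gv.name_hint for gv in dest.functions}
    new_entry = entry
    for gv, body in lowered.functions.items():
        name, suffix = gv.name_hint, 0
        while name in taken:
            suffix += 1
            name = f"{gv.name_hint}_{suffix}"
        taken.add(name)
        renamed = GlobalVar(name)
        dest.functions[renamed] = body
        if gv == entry:
            new_entry = renamed
    return new_entry


targets = parse_named_targets(
    "low_power_cpu: c -mcpu=cortex-m0; inference_cpu: c -mcpu=cortex-m7f"
)
hooks: Dict[str, Hook] = {"inference_cpu": example_hook}  # hypothetical registry
call_site_target = "inference_cpu"  # manual per-call-site override

combined = Module()
lowered, entry = hooks[call_site_target](Module(), "conv2d")
entry = merge(combined, lowered, entry)

# Lowering the same function again exercises the name-conflict path.
lowered_again, entry_again = hooks[call_site_target](Module(), "conv2d")
entry_again = merge(combined, lowered_again, entry_again)
```

The point of returning the top-level symbol separately is visible in `merge`: once names can be changed during merging, the caller needs the returned `GlobalVar` rather than a name it guessed in advance.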