
I'd like the post-processing part of the model to be executed on the CPU
instead of the accelerator. Is there a method to tell TVM to 'stop'
partitioning for an external BYOC compiler at some specific Relay operators (e.g.,
the `Arg
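One possible direction, as a minimal sketch only: in the standard BYOC flow an operator is offloaded only when its annotation function returns True, so returning False for `argmax` keeps the post-processing on the CPU after partitioning. The compiler name `my_accel` below is hypothetical.

```
import tvm
from tvm import relay

# Hypothetical BYOC compiler name, for illustration only.
COMPILER = "my_accel"

@tvm.ir.register_op_attr("nn.conv2d", "target." + COMPILER)
def _conv2d_supported(expr):
    return True          # offload conv2d to the accelerator

@tvm.ir.register_op_attr("argmax", "target." + COMPILER)
def _argmax_supported(expr):
    return False         # keep argmax (post-processing) on the CPU

# Tiny graph: conv2d followed by an argmax post-processing step.
data = relay.var("data", shape=(1, 3, 8, 8))
weight = relay.var("weight", shape=(4, 3, 3, 3))
out = relay.nn.conv2d(data, weight, padding=(1, 1))
out = relay.argmax(out, axis=1)
mod = tvm.IRModule.from_expr(relay.Function([data, weight], out))

mod = relay.transform.AnnotateTarget(COMPILER)(mod)
mod = relay.transform.MergeCompilerRegions()(mod)
mod = relay.transform.PartitionGraph()(mod)
print(mod)  # conv2d ends up in an external function; argmax stays in "main"
```

Whether this is enough depends on how the accelerator's partitioning is set up (pattern tables behave a bit differently), so treat it as a starting point rather than a complete answer.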
[quote="sho, post:12, topic:11682"]
Like there might be some minor architectures (say, some minor
microcontrollers) that LLVM doesn’t support (so we have to develop an LLVM
backend ourselves to be able to emit executables for that minor architecture).
[/quote]
Yep--we intend to keep the `c` backend
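For concreteness, a minimal sketch (illustrative kernel and names only) of emitting plain C through the `c` backend, which is the path that needs no LLVM support for the target architecture:

```
import tvm
from tvm import te

# Trivial element-wise kernel, built with the pure-C backend instead of LLVM.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)

mod = tvm.build(s, [A, B], target="c", name="add_one")
print(mod.get_source())  # plain C source you can hand to your own toolchain
```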
The from_onnx tutorial hosted here:
https://tvm.apache.org/docs/how_to/compile_models/from_onnx.html
relies on a model that produces an error when loaded with the following onnxruntime code:
```
from tvm.contrib.download import download_testdata
import onnxruntime as ort

model_url = "".join(
    [
        "
```
For NLP transformer models, we sometimes share parameters between layers to
reduce the model size and runtime memory.
But after building the model with TVM, it somehow expands all the layers and
creates duplicate variable nodes, which enlarges the model size a lot. (It depends
on the number of layers.)
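To make the sharing pattern concrete, here is a minimal, hypothetical PyTorch sketch (not from the original post) where several layers reuse one `nn.Linear`; the concern above is that after import into TVM each reuse can become its own variable or constant:

```
import torch
import torch.nn as nn

class TiedBlocks(nn.Module):
    """Toy stand-in for a transformer whose layers share one weight matrix."""
    def __init__(self, dim=64, num_layers=4):
        super().__init__()
        shared = nn.Linear(dim, dim)                         # single parameter tensor
        self.layers = nn.ModuleList([shared] * num_layers)   # reused by every layer

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

model = TiedBlocks().eval()
# PyTorch stores the shared weight once: 2 unique parameters (weight + bias)
# even though there are 4 layers.
print(len(list(model.parameters())))
```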
Hi,
Can TVM support inference of the glow-tts model? If yes, is there any sample code or
a discussion link for it?
Thanks,
ltshan
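Not an authoritative answer, but the usual path for a PyTorch model such as glow-tts is to trace it to TorchScript and import it with `relay.frontend.from_pytorch`; whether glow-tts's dynamic control flow traces cleanly is not verified here. The module and input shape below are trivial stand-ins, not the real network:

```
import torch
import tvm
from tvm import relay

class StandIn(torch.nn.Module):
    """Placeholder for the real glow-tts network."""
    def forward(self, x):
        return torch.tanh(x)

model = StandIn().eval()
dummy_input = torch.randn(1, 80, 100)   # illustrative input shape only

scripted = torch.jit.trace(model, dummy_input)
mod, params = relay.frontend.from_pytorch(
    scripted, [("input", tuple(dummy_input.shape))]
)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
```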