Thanks for the good summarization. One concern I have for this case is
mainly about the coupling of the quantization part with the customized code
generator.
While the application scenario is certainly understandable, we will need to
resolve two questions as an overall goal of the project.
I think we are getting confused because of the overloaded term "quantization". To
be precise, maybe we can stick to the following terms (a sketch of this flow follows the list):
* *QNN Dialect* - The framework (TF/PyTorch/MXNet) performs quantization. The
Relay parser reads this pre-quantized model and creates a QNN-dialect graph.
QNN ops are later lowered to sequences of existing Relay operators.
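For concreteness, a minimal sketch of that QNN-dialect path (the scale and zero-point values here are illustrative assumptions, not from the original post):
```
import tvm
from tvm import relay

# Build a tiny QNN-dialect graph by hand: dequantize carries the
# scale/zero-point that the framework chose during quantization.
x = relay.var("x", shape=(1, 8), dtype="int8")
deq = relay.qnn.op.dequantize(
    x,
    input_scale=relay.const(0.05, "float32"),
    input_zero_point=relay.const(0, "int32"),
)
mod = tvm.IRModule.from_expr(relay.Function([x], deq))

# Lower the QNN dialect into existing Relay operators.
seq = tvm.transform.Sequential([
    relay.transform.InferType(),
    relay.qnn.transform.CanonicalizeOps(),
])
mod = seq(mod)
print(mod)
```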
@zhiics Thanks for your comment. Yes, I just use BYOC to specify which part
should be offloaded. The subgraph can be a black box for users.
There are two ways I have tried to prepare the package (a sketch of the first follows below).
1. Cross-compile locally and upload the built lib to the remote server.
[[code](https://github.com/ka
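For reference, a hedged sketch of option 1 using TVM's RPC utilities (the target triple, server address, and file names are my assumptions, not taken from the linked code):
```
import tvm
from tvm import relay, rpc

# A tiny stand-in model; the real subgraph comes from BYOC partitioning.
x = relay.var("x", shape=(1, 4), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))

# Cross-compile locally for the remote server's architecture.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm -mtriple=aarch64-linux-gnu")
lib.export_library("model.so", cc="aarch64-linux-gnu-g++")

# Upload the built lib to the remote RPC server and load it there.
remote = rpc.connect("192.168.1.10", 9090)
remote.upload("model.so")
rlib = remote.load_module("model.so")
```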
CCing some folks who might be interested @areusch @ziheng
I agree with @matt-arm that we should be hesitant to use BYOC as a catch-all
for everything we haven't implemented in TVM.
What would help me better understand the motivation for this change is an
example of a quantization flow that isn't easily expressible with TVM's
*internal* facilities.
@kazum Thanks for the effort. It is very interesting. It sounds like you only
need BYOC for annotation and partitioning, since you don't really have a
backend/library for it, right? I am wondering how you package the subgraphs;
do you manually prepare them? Thanks.
Closed #5947.
Thanks everyone for voting. The voting result has been sent out.
Dear TVM community,
I'm glad to announce the results of the vote.
This vote passes with 12 +1 votes (9 binding), no 0 votes, and no -1 votes.
+1 votes
* Tianqi Chen (binding)
* Masahiro Masuda (binding)
* Lianmin Zheng (binding)
* Jared Roesch (binding)
* Thierry Moreau (binding)
* Ziheng Jiang (binding)
LGTM. I think we can rename it to `get_calibration_data` or `get_profiling_data`
instead of `calibrate_partition_gaph`. I think calibration means more than
collecting i/o tensors (for quantization, it means choosing min/max such that
the quantized data representation is similar to the float32 data representation;
see the sketch below).
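To illustrate that distinction, a minimal min/max calibration sketch (the function name and the int8 range are my assumptions, not the API under discussion):
```
import numpy as np

def minmax_scale_zp(tensor, qmin=-128, qmax=127):
    # Choose scale/zero-point so the int8 grid spans the observed
    # float32 range; real calibrators aggregate over many batches.
    lo = min(float(tensor.min()), 0.0)  # keep 0.0 exactly representable
    hi = max(float(tensor.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

data = np.random.randn(1024).astype("float32")
scale, zp = minmax_scale_zp(data)
quantized = np.clip(np.round(data / scale) + zp, -128, 127).astype("int8")
```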
Hi,
+1 from me.
I checked:
- Incubating in name
- DISCLAIMER exists
- LICENSE and NOTICE are fine
- No unexpected binary files
- Checked PGP signatures
- Checked Checksums
- Code compiles and tests successfully run
Kind Regards,
Furkan KAMACI
What Relay expands this to is a memory copy, which I want to avoid; I want a
copy-less representation in TIR.
The following should really be a no-op, but it ends up copying everything:
```
import tensorflow as tf
import tvm
import tvm.relay

g = tf.Graph()
with g.as_default():
    # Shape/names are illustrative completions of the truncated snippet.
    u = tf.unstack(tf.placeholder(tf.float32, shape=(4, 8), name="x"))
    tf.stack(u, name="y")

mod, params = tvm.relay.frontend.from_tensorflow(g.as_graph_def(), outputs=["y"])
print(mod["main"])
```
The goal of this RFC is to offload subgraph inference from user devices to
high-performance edge servers. The initial code is available
[here](https://github.com/kazum/tvm/tree/remote_runtime); it implements
inference offloading based on BYOC.
# Motivation
The benefit of offloading inference
TensorArray is supported in Relay and TF TensorArray ops can be converted now.
Did you mean something more than these?
@matt-arm For each BYOC backend such as DNNL, we could define a transform
sequence (sketched below) so that we can have `mod = transform.partition("dnnl")(mod)`. However,
there are some issues that should be discussed further. For example, where should we
put those transform sequences (e.g., put them under `tvm.tr
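For context, the three passes such a helper would bundle already exist in Relay; only the `transform.partition` wrapper name is the proposal (a sketch under that assumption):
```
import tvm
from tvm import relay

def partition(target):
    # Annotate ops supported by `target`, merge adjacent regions,
    # then split the regions out as external functions.
    return tvm.transform.Sequential([
        relay.transform.AnnotateTarget(target),
        relay.transform.MergeCompilerRegions(),
        relay.transform.PartitionGraph(),
    ])

# usage, given a Relay IRModule `mod`:
# mod = partition("dnnl")(mod)
```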
It is necessary for many use cases (like AOT), and I believe @tqchen has some
ideas on this too.
Is there any effort to support tensor arrays in TIR? That would be something
to represent operations like `stack` or `unstack` from TF.
Let's say we want to write an op that concatenates a variable number
of tensors, but without actually copying any data. Instead, it would create
a tensor array that refers to the original tensors without copying them.
This looks reasonable to me; it's not something we require for Ethos-N, but I
can see why it may be desirable. I am noticing quite a bit of API creep around
BYOC, though. We never really settled on a way to encapsulate the partitioning
passes, and now we have another special pass that may or may not be needed.
Hello there. The idea is just the same as the existing IR pass described in
https://discuss.tvm.ai/t/discussion-new-ir-pass-proposal-combineparalleldense/3813
by @jonso. Many sequential network structures perform groups of matmul
operations on the same input tensor (see the sketch after this list), such as:
- gate projections on state
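A minimal sketch of the existing pass acting on two parallel dense ops that share one input (shapes and the lowered branch threshold are illustrative assumptions):
```
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 16))
w1 = relay.var("w1", shape=(32, 16))
w2 = relay.var("w2", shape=(32, 16))
out = relay.Tuple([relay.nn.dense(x, w1), relay.nn.dense(x, w2)])
mod = tvm.IRModule.from_expr(relay.Function([x, w1, w2], out))

seq = tvm.transform.Sequential([
    relay.transform.InferType(),
    # The default threshold is 3 branches; lower it so this demo triggers.
    relay.transform.CombineParallelDense(min_num_branches=2),
])
mod = seq(mod)
print(mod)  # the two dense calls are combined into one batched matmul
```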