also cc @anijain2305 @xyzhou
---
[Visit Topic](https://discuss.tvm.ai/t/cuda-fp16-example/6190/2) to respond.
In my case, these intermediate structs are strongly tied to our executor. They are plain structs, so they are much easier to work with than full-blown Relay IR. So for me they are not really overhead.
---
[Visit Topic](https://discuss.tvm.ai/t/external-codegen-constant-tensors-in-c-codegen/5890/25) to respond.
@masahi I see, thanks for sharing. I also thought about an approach like this.
It seems like a lot of additional overhead to create and maintain a whole new
IR for serialization. Since it seems to be the case that many external codegens
(not just TRT) will need to do something like this, I won…
@trevor-m Thanks for confirming. I can't talk about specifics, but let's just say any C++ serialization lib should be able to serialize/deserialize structs into a binary blob, and I am just using one of them.
Note that I'm not serializing the Relay subgraph as it is, but some structs that get con…
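As an illustration of this approach (not masahi's actual code, which is C++), flattening plain graph-descriptor structs into a binary blob might look like the following Python sketch; the `ConvNode` struct and its field layout are hypothetical.

```python
import struct
from dataclasses import dataclass

# Hypothetical plain struct describing one op in a partitioned subgraph.
@dataclass
class ConvNode:
    input_id: int
    output_id: int
    kernel_h: int
    kernel_w: int

FMT = "<4i"  # little-endian, four int32 fields per node

def serialize(nodes):
    """Flatten a list of ConvNode structs into a single binary blob."""
    blob = struct.pack("<i", len(nodes))  # node count header
    for n in nodes:
        blob += struct.pack(FMT, n.input_id, n.output_id, n.kernel_h, n.kernel_w)
    return blob

def deserialize(blob):
    """Recover the list of ConvNode structs from the blob."""
    (count,) = struct.unpack_from("<i", blob, 0)
    off, size = 4, struct.calcsize(FMT)
    return [ConvNode(*struct.unpack_from(FMT, blob, off + i * size))
            for i in range(count)]

nodes = [ConvNode(0, 1, 3, 3)]
assert deserialize(serialize(nodes)) == nodes
```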
This is a careless typo in the tutorial. We actually need `visit_constant`. A
PR to fix it is welcome.
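For reference, a minimal sketch of a custom pass that uses the correct `visit_constant` hook (the doubling rewrite is just an illustration, not the tutorial's exact pass):

```python
import numpy as np
import tvm
from tvm import relay

@relay.transform.function_pass(opt_level=1)
class DoubleConstants:
    """Example pass: rewrite every constant c into c * 2."""

    def transform_function(self, func, mod, ctx):
        class Rewriter(relay.ExprMutator):
            # The correct method name is visit_constant, not visit_const.
            def visit_constant(self, const):
                return relay.multiply(const, relay.const(2.0, const.data.dtype))

        return Rewriter().visit(func)

x = relay.var("x", shape=(2,), dtype="float32")
func = relay.Function([x], x + relay.const(np.ones(2, dtype="float32")))
mod = tvm.IRModule.from_expr(func)
print(DoubleConstants()(mod))  # the ones constant is now multiplied by 2
```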
---
[Visit Topic](https://discuss.tvm.ai/t/custom-pass-is-not-working-from-tutorial/5549/4) to respond.
@zhiics could you take a look at this question?
---
[Visit Topic](https://discuss.tvm.ai/t/custom-pass-is-not-working-from-tutorial/5549/3) to respond.
Hi @masahi, that is correct. I am using [TVM's native json serialization
API](https://github.com/neo-ai/tvm/blob/dev/include/tvm/node/serialization.h#L39-L48)
to serialize the Relay subgraphs during codegen and deserialize them in the runtime for my TRT integration.
I am curious what binary format…
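For context, that serialization API (`SaveJSON`/`LoadJSON` in `include/tvm/node/serialization.h`) is exposed in Python as `tvm.ir.save_json`/`tvm.ir.load_json`; a minimal round-trip sketch (the toy function is my own example, not the actual TRT subgraph):

```python
import tvm
from tvm import relay

# A toy Relay function standing in for a partitioned subgraph.
x = relay.var("x", shape=(1, 3, 224, 224), dtype="float32")
func = relay.Function([x], relay.nn.relu(x))

graph_json = tvm.ir.save_json(func)      # str, embeddable in a runtime module
restored = tvm.ir.load_json(graph_json)  # structurally equal copy

assert tvm.ir.structural_equal(func, restored)
```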
Thanks @masahi for chiming in. The CCompiler example is only used for demonstration purposes. We intentionally kept it simple in how it handles cases like constants. For real external codegen tools/compilers, you may have your own ways to handle the constant pool, and it is very backend-compiler depende…
I think the TensorRT integration by AWS works in a similar way. If I remember correctly, they use JSON instead of binary.
---
[Visit Topic](https://discuss.tvm.ai/t/external-codegen-constant-tensors-in-c-codegen/5890/20) to respond.
Ah, I understand now. We'll have a look at how viable that'll be for ACL.
Thanks for the suggestion!
---
[Visit Topic](https://discuss.tvm.ai/t/external-codegen-constant-tensors-in-c-codegen/5890/19) to respond.
"The executor" part, including API calls to DNNL, is defined in another lib
that is built outside of TVM, and linked to my TVM build. My TVM external
runtime passes binary or deserialized graph rep together with arguements from
TVM to that lib, and this lib knows how to execute the graph. The
How are you compiling? We could serialize the graph, but we'd then need to codegen the relevant ACL API calls on the remote and compile them into something that can be executed. We can't do that without a toolchain, though, and one can't be guaranteed.
---
hmm, for my use case, I simply serialize a Relay subgraph into some binary format, pass the binary to the runtime module and deserialize there. Then I can execute this graph with arguments I receive from TVM in whatever way I like, including offloading to dnnl. This week I integrated upstream chan…
@masahi Thanks for the suggestion, however since ACL is a C++ library, ideally we would want to be able to cross-compile our codegen before using it on the remote device. I don't think we can assume the remote device has its own toolchain to compile the codegen we receive.
---
I noticed the new code for autodiff at the TIR level. Most of the code is clear, but one implementation detail is confusing.
In `src/te/autodiff/jacobian.cc`, lines 182-270: why does the new ReduceNode need to return the original values? Does it mean that during the backward process, some forward computation…
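For context, the ReduceNode handling in `jacobian.cc` is what `tvm.te.gradient` drives. A minimal sketch (my own example, assuming a recent TVM with `te.gradient`) where the backward pass plausibly needs forward values is a max reduction, since the gradient depends on which element attained the max:

```python
import tvm
from tvm import te

# y[i] = max_k x[i, k]: the gradient w.r.t. x depends on which k "won",
# so the backward computation refers back to the forward reduction.
x = te.placeholder((4, 8), name="x", dtype="float32")
k = te.reduce_axis((0, 8), name="k")
y = te.compute((4,), lambda i: te.max(x[i, k], axis=k), name="y")

[dx] = te.gradient(y, [x])       # drives src/te/autodiff, incl. jacobian.cc
s = te.create_schedule(dx.op)
print(tvm.lower(s, [x, dx], simple_mode=True))
```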
@matt-arm Have you considered using a different codegen than CSource? To deal with large constants, I think a binary-serialization-based codegen is a good fit.
---
[Visit Topic](https://discuss.tvm.ai/t/external-codegen-constant-tensors-in-c-codegen/5890/14) to respond.
I'm not sure what you are asking. Whatever qconfig you quantize your Torch
model with, the converted Relay model is equivalent to the quantized Torch
model.
But due to the difference in numerics, the raw floating-point outputs of quantized Torch models and converted Relay models can be slightly different…
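To make the comparison concrete, here is a minimal sketch of quantizing with the default qconfig and converting via `relay.frontend.from_pytorch`; the model, input name, and shapes are my own toy example, not from the thread:

```python
import torch
from tvm import relay

class TinyNet(torch.nn.Module):
    """Toy quantizable model with explicit quant/dequant boundaries."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 entry
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 exit

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

inp = torch.randn(1, 3, 32, 32)
model = TinyNet().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(inp)                                    # calibration run
torch.quantization.convert(model, inplace=True)

script = torch.jit.trace(model, inp).eval()
mod, params = relay.frontend.from_pytorch(script, [("input", (1, 3, 32, 32))])
# The Relay output should match the quantized Torch output up to small
# floating-point differences caused by the different numerics.
```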
I've had a chance to look at this now and it seems like it's quite a fundamental issue with the C codegen, not just ACL. This will make a lot of compile-time optimisations impossible, as there's no reasonable way to handle large constant tensors in the codegen. This will be especially prevalent when…
I'm going to use the PyTorch frontend to parse a PyTorch model and quantize the model. However, it's not clear to me how I should set the PyTorch quantization API to get the same arithmetic results as Relay.
For example, if I set the [QConfig API](https://pytorch.org/docs/stable/quantization.html#…