Take this code for example:
import numpy as np
import tvm
from tvm.autotvm.tuner import XGBTuner
from tvm import relay, autotvm
import pytest
def test_dense_autotvm():
    target = tvm.target.cuda()
    batch, in_dim, out_dim = 16384, 768, 768
    data_shape =
I saw this comment in nn.softmax:
"This operator can be optimized away for inference."
For now, the BERT performance bottleneck is related to softmax.
What does this comment mean, and how can this op be optimized away?
The IR may look like this:
%1579 = fn (%p0218: Tensor[(128, 12, 128, 128), fl
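One possible reading of that comment (an assumption, not something confirmed in this thread): softmax is monotonic, so when only the predicted class is needed at inference time, a final softmax can be dropped without changing the argmax. That reasoning would not apply to a softmax whose output feeds further computation, such as attention scores. A minimal sketch in plain Python, with illustrative values only:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

# Illustrative logits, not taken from any real model.
logits = [1.5, -0.3, 4.2, 0.0]
probs = softmax(logits)

# The predicted class is the same with or without softmax.
same_prediction = argmax(logits) == argmax(probs)
```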
I want to find a way to automate single-op accuracy testing. I use another
approach: I extract the single-op IR definition, then run that single-op IR
function definition, without using TIR.
---
[Visit
Topic](https://discuss.tvm.apache.org/t/any-way-to-extract-tir-function-from-relay/11907/3)
to respond.
[[BUG]fix batch matmul not set attrs_type_key, when using tvm.parse.parse_expr
will raise error.](https://github.com/apache/tvm/pull/10209)
---
[Visit
Topic](https://discuss.tvm.apache.org/t/op-testing-single-op-ir-testing-batch-matmul/12049/4)
to respond.
OK, the op registry does have a problem.

The batch_matmul attrs do not set the attrs type.
---
[Visit
Topic](https://discuss.tvm.apache.org/t/op-testing-single-op-ir-testing-batch-matmul/12049/2)
to respond.
df_parsed = tvm.parser.parse_expr(
'''
fn (%p0527: Tensor[(16, 256, 256), float32], %p1361: Tensor[(16, 64, 256), float32]) -> Tensor[(16, 256, 64), float32] {
  nn.batch_matmul(%p0527, %p1361, transpose_b=True) /* ty=Tensor[(16, 256, 64), float32] */
}
''')
the code above
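For checking the accuracy of a single op like this one, a NumPy reference can serve as the expected value. The sketch below only computes that reference for the shapes in the snippet; `batch_matmul_ref` is a hypothetical helper name, and `actual` is a stand-in for whatever the compiled op returns, which this thread does not show.

```python
import numpy as np

def batch_matmul_ref(x, y, transpose_b=True):
    """NumPy reference: out[b] = x[b] @ y[b].T when transpose_b is True."""
    if transpose_b:
        y = np.transpose(y, (0, 2, 1))
    return np.matmul(x, y)

# Shapes taken from the Relay function signature above.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 256, 256)).astype("float32")
y = rng.standard_normal((16, 64, 256)).astype("float32")
expected = batch_matmul_ref(x, y)

# Stand-in for the output of the compiled op; replace with the real result.
actual = np.einsum("bik,bjk->bij", x, y)
np.testing.assert_allclose(actual, expected, rtol=1e-3, atol=1e-3)
```

The tolerances are a judgment call for float32 accumulation over 256 elements and may need adjusting per op.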
TVM leaves no information about the original network layers in the TVM graph,
so how can I compare dumped data with the original network layers?
For example, bert-large has 2000+ ops, but it is hard to figure out which op
corresponds to which original layer.
when you face accuracy problem, you dump the data and com
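Once two dumps are available, locating the first diverging tensor can be sketched as below. This assumes both dumps are already keyed by matching names, which is exactly the mapping the question says is hard to obtain; `first_mismatch` and the layer names are hypothetical.

```python
import numpy as np

def first_mismatch(reference, candidate, rtol=1e-3, atol=1e-3):
    """Return (name, max_abs_error) of the first tensor that differs, or None.

    Both arguments map a tensor name to an array. Iteration follows the
    insertion order of `reference`, assumed here to be execution order.
    """
    for name, ref in reference.items():
        got = candidate.get(name)
        if got is None or got.shape != ref.shape:
            return name, float("inf")
        if not np.allclose(got, ref, rtol=rtol, atol=atol):
            return name, float(np.max(np.abs(got - ref)))
    return None

# Toy dumps with made-up names and values.
reference = {"layer0": np.ones((2, 3)), "layer1": np.full((2, 3), 2.0)}
candidate = {"layer0": np.ones((2, 3)), "layer1": np.full((2, 3), 2.5)}
report = first_mismatch(reference, candidate)
```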
> All the tensors will be saved as binary bytes in serialized format. The
> result binary bytes can be loaded by the API “load_params”.
How can we obtain the binary bytes in serialized format?
Does "./_tvmdbg_device_CPU_0/output_tensors.params" contain the outputs of all layers?
then we get data, how to
Thanks, could you give me a concrete example?
---
[Visit
Topic](https://discuss.tvm.apache.org/t/what-if-the-result-is-not-correct/11858/3)
to respond.
You are receiving this because you enabled mailing list mode.
@tqchen Is there any method to debug which intermediate layer outputs are not
correct?
I searched the whole forum but couldn't find an answer.
---
[Visit
Topic](https://discuss.tvm.apache.org/t/what-if-the-result-is-not-correct/11858/1)
to respond.
How can I dump this graph?
---
[Visit
Topic](https://discuss.tvm.apache.org/t/how-to-extract-tvm-module/2167/22) to
respond.
from pytorch_pretrained_bert import BertForMaskedLM
import torch
def main(args):
    bert_model_origin = BertForMaskedLM.from_pretrained("bert-large-uncased")
    example_tensor = torch.randint(0, 100, (1, 256))
    model_int8 = torch.quantization.quantize_dynamic(bert_m