@kazum I can give it a try. How can I extract the related code? Or do I have to
clone the whole repo and build it?
---
[Visit Topic](https://discuss.tvm.ai/t/op-equal-is-not-supported/6303/3) to
respond.
[quote="anijain2305, post:36, topic:6256, full:true"]
Yes, that seems plausible. Please note that one might also make FP32 schedule
better by working on low-level optimizations :) So, it is relative.
[/quote]
Can I define a new schedule to optimize performance and get the same speed as
QNNPACK?
Yes, that seems plausible. Please note that one might also make FP32 schedule
better by working on low-level optimizations :) So, it is relative.
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/36)
to respond.
[quote="anijain2305, post:34, topic:6256, full:true"]
Yeah, the work by AliOS is not available yet. They worked a lot on very
low-level optimizations. Over time, this work will hopefully be upstreamed. For
now, on master, QNNPACK is faster.
[/quote]
You also said **For rasp3 and rasp4, we saw
Yeah, the work by AliOS is not available yet. They worked a lot on very
low-level optimizations. Over time, this work will hopefully be upstreamed. For
now, on master, QNNPACK is faster.
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/3
[quote="kindlehe, post:19, topic:6256, full:true"]
@anijain2305
How much speedup does INT8 give compared to FP32 on rasp4? 1.5×?
I saw some speedup results
[here](https://github.com/tvmai/meetup-slides/tree/master/tvm-meetup-shanghai-Nov-16-2019)
saying that tvm is about 1.3× (=2.08/1.60) at mobilene
[quote="anijain2305, post:31, topic:6256, full:true"]
Yes, that's the selling point of TVM.
The TVM community works together on these TVM schedules. As more people get
interested in quantization, we can add more TVM schedules, e.g. for the avx2
machine you are talking about. We don't want to fully r
Yes, that's the selling point of TVM.
The TVM community works together on these TVM schedules. As more people get
interested in quantization, we can add more TVM schedules, e.g. for the avx2
machine you are talking about. We don't want to fully rely on FBGEMM or QNNPACK,
because it might cause conf
[quote="anijain2305, post:27, topic:6256, full:true"]
For rasp3 and rasp4, we saw 1.3x - 1.5x performance speedup going from FP32 to
Int8.
The link comparing QNNPACK and TVM is not upstreamed yet. If I understand
correctly, it will be some time before the authors of that work are able to
m
[quote="masahi, post:28, topic:6256, full:true"]
[quote="kindlehe, post:26, topic:6256"]
Will tvm consider integrating FBGEMM to do the same heavy lifting in the
future, as pytorch has done, to get the same high speedup on avx2 devices?
[/quote]
No. We should rather improve our avx2 schedule
[quote="kindlehe, post:26, topic:6256"]
Will tvm consider integrating FBGEMM to do the same heavy lifting in the
future, as pytorch has done, to get the same high speedup on avx2 devices?
[/quote]
No. We should rather improve our avx2 schedule to match FBGEMM performance.
---
For rasp3 and rasp4, we saw 1.3x - 1.5x performance speedup going from FP32 to
Int8.
The link comparing QNNPACK and TVM is not upstreamed yet. If I understand
correctly, it will be some time before the authors of that work are able to
upstream it. There are some differences in unde
[quote="masahi, post:25, topic:6256"]
https://github.com/pytorch/FBGEMM
[/quote]
Will tvm consider integrating FBGEMM to do the same heavy lifting in the
future, as pytorch has done, to get the same high speedup on avx2 devices?
---
Yes, it is incredible. Quantized Torch uses FBGEMM
https://github.com/pytorch/FBGEMM to do the heavy lifting. They JIT-generate
asm. I have no idea how their quantized convolution is implemented. You can
take a look at their code.
---
@masahi
I wonder why pytorch can run so fast. Is it because pytorch uses int8 on the
same macbook pro, or some other speed-up technique?
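For context, quantized torchvision models do run int8 kernels on x86 through FBGEMM; below is a minimal sketch of how that is usually enabled. The prequantized resnet18 is just an illustration, not the benchmark script used in this thread.

```python
# Minimal sketch (illustrative, not the thread's benchmark script): on x86,
# quantized PyTorch models dispatch their int8 kernels through FBGEMM.
import torch
import torchvision

torch.backends.quantized.engine = "fbgemm"  # x86 int8 backend

# torchvision ships prequantized variants of the models discussed here.
model = torchvision.models.quantization.resnet18(pretrained=True, quantize=True)
model.eval()

inp = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    out = model(inp)  # runs the int8 kernels
```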
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/24)
to respond.
Yes, the int16 thing is intended. See
https://github.com/apache/incubator-tvm/pull/4307. @anijain2305 can give more
details.
Int8 is only enabled for AVX512.
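For readers following along, here is a hedged sketch of how that target choice plays out when compiling; the function name and the `mod`/`params` arguments are placeholders, assumed to come from a Relay frontend.

```python
# Hedged sketch: the -mcpu part of the llvm target string is what decides
# whether the AVX512 int8 schedules are used; on plain AVX2 machines the
# int16 lowering from the PR above is taken instead.
from tvm import relay

def build_for_x86(mod, params, has_avx512):
    target = "llvm -mcpu=cascadelake" if has_avx512 else "llvm -mcpu=core-avx2"
    # relay.build picks the x86 schedules that match the -mcpu features
    return relay.build(mod, target=target, params=params)
```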
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/23)
to respond.
The speed is tested on 2 cores for tvm and 1 core for torch,
so tvm@mobilenet-v3 is faster than torch@mobilenet-v3.
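A minimal sketch of pinning both runtimes to the same thread count so the comparison is apples to apples (the value 2 is just the example from this post):

```python
# Hedged sketch: give TVM and PyTorch the same number of threads before timing.
import os

os.environ["TVM_NUM_THREADS"] = "2"  # read by TVM's runtime thread pool

import torch
torch.set_num_threads(2)             # intra-op threads used by PyTorch
```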
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/22)
to respond.
@masahi @anijain2305
I am not very sure whether INT8 is used in `perf_bench`, because I see this log:
```
autotvm:Cannot find config for target=llvm -mcpu=core-avx2,
workload=('dense_nopack.x86', ('TENSOR', (1, 1280), 'int16'), ('TENSOR', (1000,
1280), 'int16'), None, 'int32'). A fallback configuration is used, which may
bring great performance regression.
```
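One way to double-check which dtypes actually reach the compiled workloads is to extract the autotvm tasks and print them; a hedged sketch, assuming `mod` and `params` come from `relay.frontend.from_pytorch` on the quantized model.

```python
# Hedged sketch: print the autotvm workloads to see whether int8 or int16
# tensors are used for the conv/dense ops.
from tvm import autotvm, relay

def list_workloads(mod, params, target="llvm -mcpu=core-avx2"):
    tasks = autotvm.task.extract_from_program(
        mod["main"], target=target, params=params,
        ops=(relay.op.get("nn.conv2d"), relay.op.get("nn.dense")),
    )
    for t in tasks:
        # each ('TENSOR', shape, dtype) argument shows the dtype in use
        print(t.name, t.args)
```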
@masahi I set `os.environ["TVM_NUM_THREADS"] = str(2)`, but it does not help
the speed.
I also watched the cpu% of `tvm_model.module.time_evaluator` and `pt_model(inp)`
with the `top` command; the cpu% stays <= 100%, which may mean that both tvm and
torch only use one thread for inference.
Here is the
How much speedup does INT8 give compared to FP32 on rasp4? 1.5×?
I saw some speedup results
[here](https://github.com/tvmai/meetup-slides/tree/master/tvm-meetup-shanghai-Nov-16-2019)
saying that tvm is about 1.3× (=2.08/1.60) faster at mobilenet-v2@rasp 3b+AARCH64
than QNNPACK.
They reported apparent speed
I have some notes here, but they are a bit dated, and some things were specific
to my custom branch. But it may give you some hints
https://docs.google.com/document/d/1NTcjdmtW00Nnn7SyCDYUA-UXXhEDyGHz650qXsHxXvc/edit?usp=sharing
---
[Visit Topic](https://discuss.tvm.ai/t/tvm-on-windows-se
Hi, sorry for the late reply.
I did try to build from source, but I don't actually know whether it worked.
Could you give me step-by-step instructions from scratch on what to do from the
beginning?
I think I'll just uninstall everything and do it again.
---
@zchuang11 I've added the support in my git repo
https://github.com/kazum/tvm/tree/mx_equal. Can you give it a try?
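For anyone else trying the branch, a hedged sketch of how to exercise the new converter once TVM is rebuilt from it; the model object and input shape are placeholders.

```python
# Hedged sketch: run the MXNet/Gluon frontend on the model to check whether
# the `equal` op now converts without raising "op not supported".
from tvm import relay

def try_convert(gluon_model, input_shape=(1, 3, 512, 512)):
    shape_dict = {"data": input_shape}  # input name is an assumption
    mod, params = relay.frontend.from_mxnet(gluon_model, shape_dict)
    return mod, params
```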
---
[Visit Topic](https://discuss.tvm.ai/t/op-equal-is-not-supported/6303/2) to
respond.
Hey all! New to TVM and looking to get started. A question, as I can't find it
explained in the documentation.
On this page:
https://docs.tvm.ai/tutorials/relay_quick_start.html
A bit goes into optimizations, but it's not explained:
```
Users can specify the optimization level of the compilation
```
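For what it's worth, the setting referred to there is the `opt_level` passed when building. A minimal sketch (recent TVM uses `tvm.transform.PassContext`, older releases used `relay.build_config`), with `mod` and `params` taken to be whatever the tutorial constructed:

```python
# Hedged sketch: higher opt_level values enable more Relay passes
# (operator fusion, constant folding, layout transformation, ...).
import tvm
from tvm import relay

def compile_at(mod, params, level=3):
    with tvm.transform.PassContext(opt_level=level):
        return relay.build(mod, target="llvm", params=params)
```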
@kindlehe TVM might not be optimized for the target 'llvm -mcpu=core-avx2'. I
would suggest running it on CascadeLake. You would see a major benefit.
For rasp4, if you are comparing FP32 vs Int8, yes, I have seen performance
improvements. However, if you compare PyTorch (backed by QNNPACK) int8 vs TV
Most of the operators I mentioned are removed when you freeze the graph. For
the *IteratorV2* and *IteratorGetNext* I think those are preprocessing steps
that you can move out of the model and add them back when you do inference.
Look for loops that feed data in or do some pre-processing.
Hop
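A hedged sketch of the freezing step being described (TF 1.x-style API via `tf.compat.v1`; the output node name is a placeholder, not taken from the NCF model):

```python
# Hedged sketch: freezing folds variables into constants and drops the
# checkpoint machinery (SaveV2 / RestoreV2 / Assign); the Iterator* input
# ops still need to be replaced by a plain placeholder input.
import tensorflow as tf

def freeze(sess, output_names, path="frozen_model.pb"):
    graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_names
    )
    with tf.io.gfile.GFile(path, "wb") as f:
        f.write(graph_def.SerializeToString())
    return graph_def
```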
No, but I think @anijain2305 has done such a comparison on rasp4.
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/16)
to respond.
Thanks very much!
I will check TVM_NUM_THREADS tomorrow morning.
Have you ever compared the TVM speed of FP32 and INT8 on an android arm cpu? Do
you think tvm@INT8 will be faster than tvm@FP32 on an android device?
---
Hmm, I don't know why TVM is faster on mobilenet v3. Maybe because this is a
newer model that the Torch team hasn't optimized for. But please make sure you
are setting the `TVM_NUM_THREADS` env var correctly (it should be the number of
physical cores).
The numbers seem consistent with what I've seen in
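A small hedged sketch of one way to set that (psutil is an extra dependency, not mentioned in the thread):

```python
# Hedged sketch: use the physical core count for TVM_NUM_THREADS, and set it
# before the first TVM inference so the runtime thread pool picks it up.
import os
import psutil

os.environ["TVM_NUM_THREADS"] = str(psutil.cpu_count(logical=False))
```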
Here is the speed comparison of the quantized pytorch model and the converted
tvm model on a macbook pro.
I have no idea why tvm is faster than torch for mobilenet-v3, but slower for
resnet-18, resnet-50 and mobilenet-v2.
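For reference, a hedged sketch of the kind of timing loop behind numbers like these, using the `time_evaluator` mentioned earlier in the thread (`tvm_model`, `ctx`, `pt_model`, and `inp` are assumed to exist already):

```python
# Hedged sketch: time the TVM graph module and the PyTorch model the same way.
import time
import numpy as np
import torch

def bench(tvm_model, ctx, pt_model, inp, repeat=100):
    ftimer = tvm_model.module.time_evaluator("run", ctx, number=1, repeat=repeat)
    tvm_ms = np.mean(np.array(ftimer().results)) * 1000

    with torch.no_grad():
        start = time.time()
        for _ in range(repeat):
            pt_model(inp)
    pt_ms = (time.time() - start) / repeat * 1000
    return tvm_ms, pt_ms
```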

---
This problem was solved by rebuilding TVM correctly.
---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/12)
to respond.
Hello,
As you said: 'For example, ‘IteratorV2’, ‘IteratorGetNext’, ‘SaveV2’,
‘RestoreV2’, ‘Assign’, and ‘Assert’. I know that those operators can be
avoided by changing the model.'
I am training an NCF model using the [TensorFlow model ncf
code](https://github.com/tensorflow/models/tree/r1.12.
I am testing the mxnet_gluon_model 'center_net_resnet18_v1b_voc' and hitting the
following error:
```Python
Traceback (most recent call last):
  File "arm64_centernet_rpc.py", line 282, in
    graph, lib, params = build(target, target_host)
  File "arm64_centernet_rpc.py", line 144, in build
    mod,
```
@Arctanxy Have you solved the problem? I met the exact same problem with the
latest 0.7dev1 version.
---
[Visit
Topic](https://discuss.tvm.ai/t/tvmerror-check-failed-type-code-kdlfloat-8-vs-2-expected-float-but-get-object/5680/4)
to respond.
I see. So it doesn't seem to be possible at the moment. Thank you anyway.
---
[Visit
Topic](https://discuss.tvm.ai/t/autotvm-graph-tuner-running-graph-tuner-without-autotvm/6286/4)
to respond.