[TVM Discuss] [Questions] Is there any speed comparison of quantization on cpu

2020-04-08 Thread kindlehe via TVM Discuss
I revised the input name in [imagenet_test.py](https://github.com/Edgecortix-Inc/pytorch_quantization/blob/master/tvm_qnn_evaluation/imagenet_test.py) as follows: [screenshot of the change] But I get the following error when executing the resnet18 model: [error screenshot

[TVM Discuss] [Questions] Testonnx: very simple code, can't figure out the error

2020-04-08 Thread 张晨晨 via TVM Discuss
```
%3 = nn.batch_norm(%2, %mnasnet0_stage1_conv0_batchnorm0_gamma, %mnasnet0_stage1_conv0_batchnorm0_beta, %mnasnet0_stage1_conv0_batchnorm0_running_mean, %mnasnet0_stage1_conv0_batchnorm0_running_var, epsilon=1e-05f);
%4 = %3.0;
%5 = nn.prelu(%4, %mnasnet0_stage1_conv0_prelu0_alpha) in par
```

[TVM Discuss] [Questions] Testonnx: very simple code, can't figure out the error

2020-04-08 Thread 张晨晨 via TVM Discuss
tvm 0.6, onnx 1.6.0, python 3.5, llvm 4.0. First, here is a version that runs correctly, which should show that my environment is fine. It uses the model from from_onnx.py, super_resolution_0.2.onnx:

```python
import onnx
import numpy as np
import tvm
import tvm.relay as relay

onnx_model = onnx.load('super_resolution_0.2.onnx')
target = tvm.target.create('llvm')
input_name = '1'
```
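The script presumably continues along the lines of the from_onnx tutorial (my reconstruction, not the poster's exact code; the input shape is the tutorial's):

```python
# Hypothetical continuation following the standard from_onnx flow:
shape_dict = {input_name: (1, 1, 224, 224)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
with relay.build_config(opt_level=1):
    executor = relay.build_module.create_executor('graph', mod, tvm.cpu(0), target)
```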

[TVM Discuss] [Questions] [autoTVM][Graph tuner] Running graph tuner without autoTVM

2020-04-08 Thread Yao Wang via TVM Discuss
Currently the log files in tophub store only the best schedule for each workload. The idea of the graph tuner is to select a schedule from the top-k (usually 20-30) best schedules for a workload, so that we can minimize data layout transformation overhead. Thus we want to do autotuning first.
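To make the ordering concrete, a rough sketch of that first autotuning step (names are placeholders: `mod`, `params`, and `target` assumed to come from a frontend import), with the log callback keeping every measured config per workload for the graph tuner to choose from:

```python
from tvm import autotvm, relay

# `mod`, `params`, `target` assumed from e.g. relay.frontend.from_mxnet(...)
tasks = autotvm.task.extract_from_program(
    mod["main"], target=target, params=params,
    ops=(relay.op.get("nn.conv2d"),))
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=10))
for task in tasks:
    tuner = autotvm.tuner.XGBTuner(task)
    tuner.tune(n_trial=1000,
               measure_option=measure_option,
               # the log keeps all measured configs, not just the best one
               callbacks=[autotvm.callback.log_to_file("kernel_tuning.log")])
```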

[TVM Discuss] [Questions] How to deploy TVM model on windows [ Inference ]

2020-04-08 Thread ml_learning_all_days via TVM Discuss
Hi, did you complete the task of deploying a model on Windows using TVM? --- [Visit Topic](https://discuss.tvm.ai/t/how-to-deploy-tvm-model-on-windows-inference/2807/3) to respond.

[TVM Discuss] [Questions] Matrix Inversion

2020-04-08 Thread tmp via TVM Discuss
New to TVM, still getting used to the best ways to express algorithms in `te.compute`: is there a better way to express a 2x3 matrix inverse than writing a long cascading `te.compute(...te.if_then_else((i,j) == (0,0), te.if_then_else(...` or using hybrid script?
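For concreteness, the cascading style for a 2x2 inverse looks like this (my own illustration of the pattern the question describes, `te` API circa TVM 0.7):

```python
import tvm
from tvm import te

A = te.placeholder((2, 2), name="A", dtype="float32")
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]  # inlined into the compute body
# Adjugate formula: inv([[a, b], [c, d]]) = [[d, -b], [-c, a]] / det
Inv = te.compute(
    (2, 2),
    lambda i, j: te.if_then_else(
        te.all(i == 0, j == 0), A[1, 1] / det,
        te.if_then_else(
            te.all(i == 0, j == 1), -A[0, 1] / det,
            te.if_then_else(
                te.all(i == 1, j == 0), -A[1, 0] / det,
                A[0, 0] / det))),
    name="Inv")
s = te.create_schedule(Inv.op)
print(tvm.lower(s, [A, Inv], simple_mode=True))
```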

[TVM Discuss] [Questions] [autoTVM][Graph tuner] Running graph tuner without autoTVM

2020-04-08 Thread Jeremiah Morrill via TVM Discuss
Hopefully someone can correct me if I'm wrong, but I believe the tophub logs are downloaded any time you run relay's `build(...)`. So I believe the answer to your question is: just don't run autoTVM and go right to building.

```python
with relay.build_config(opt_level=4):
    graph, lib, params = relay.build(mod, target, params=params)
```

[TVM Discuss] [Questions] CUDA FP16 example

2020-04-08 Thread jonso via TVM Discuss
Thanks a lot. I've been playing around with this on a BERT model, but I'm hitting some issues when calling `relay.build` with opt level 3. The target is `cuda`. The error message looks like this:

```
unresolved intrinsic sqrt with return type float16x4
```

It comes from `codegen_c.cc`. Does t
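For context, a minimal sketch of the kind of graph that can trigger this (my construction, not the BERT model; whether the schedule actually vectorizes into `float16x4` depends on the chosen schedule):

```python
import tvm
from tvm import relay

# sqrt over a float16 tensor; a vectorized cuda schedule can ask the C
# codegen for a float16x4 sqrt it has no intrinsic lowering rule for.
x = relay.var("x", shape=(1, 768), dtype="float16")
func = relay.Function([x], relay.sqrt(x))
mod = tvm.IRModule.from_expr(func)

with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(mod, target="cuda")
```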

[TVM Discuss] [Questions] [autoTVM][Graph tuner] Running graph tuner with fallback configuration

2020-04-08 Thread giovannib via TVM Discuss
Hi. I am trying to use the graph tuner module on CPU. Normally, you would run autoTVM, pick the best configuration, and then instantiate the graph tuner:

```python
executor = Tuner(graph, input_dict, records, target_op, target)
```

In many cases, [the default schedules already provided good performance](
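For reference, the concrete `Tuner` classes are `DPTuner` and `PBQPTuner`; a sketch along the lines of the x86 graph-tuner tutorial (the input shape, log file names, and `mod` are my placeholders):

```python
from tvm import relay
from tvm.autotvm.graph_tuner import DPTuner

# `mod` assumed from a frontend import; "kernel_tuning.log" from a prior
# autoTVM run over the conv2d workloads.
input_dict = {"data": (1, 3, 224, 224)}
target_op = [relay.op.get("nn.conv2d")]
executor = DPTuner(mod["main"], input_dict, "kernel_tuning.log", target_op, "llvm")
executor.benchmark_layout_transform(min_exec_num=2000)
executor.run()
executor.write_opt_sch2record_file("graph_opt.log")
```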

[TVM Discuss] [Questions] Is there any speed comparison of quantization on cpu

2020-04-08 Thread kindlehe via TVM Discuss
Yes, thanks again for your reply. I just verified [tutorial_eager.py](https://github.com/Edgecortix-Inc/pytorch_quantization/blob/master/tutorial_eager.py) with torch-nightly (v1.6) on a MacBook Pro, and got the same 2-4x speed-up as the [static_quantization_tutorial](https://pytorch.org/tutorials/advanced

[TVM Discuss] [Questions] Is there any speed comparison of quantization on cpu

2020-04-08 Thread kindlehe via TVM Discuss
Yes, thanks again for your reply. --- [Visit Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/9) to respond.

[TVM Discuss] [Questions] Scan on a non-zero axis

2020-04-08 Thread Krzysztof Parzyszek via TVM Discuss
I ended up not using `scan`. I used an `extern` tensor instead and wrote the generating function myself. --- [Visit Topic](https://discuss.tvm.ai/t/scan-on-a-non-zero-axis/5996/3) to respond.

[TVM Discuss] [Questions] Is there any speed comparison of quantization on cpu

2020-04-08 Thread masahi via TVM Discuss
1. I don't have experience using QAT in Torch. I think post-training quantization is easier to work with; in any case, it should be the first thing you try. If you need extra accuracy, QAT may help.
2. Yes. See https://docs.tvm.ai/tutorials/frontend/deploy_qua
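For the post-training route, a minimal eager-mode sketch (my condensation of the PyTorch static quantization tutorial; the model choice and calibration data are placeholders):

```python
import torch
import torchvision

model = torchvision.models.quantization.resnet18(pretrained=True).eval()
model.fuse_model()                                    # fuse conv+bn+relu
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")  # x86 backend
torch.quantization.prepare(model, inplace=True)
model(torch.randn(1, 3, 224, 224))                    # calibrate: use real data
torch.quantization.convert(model, inplace=True)

# The quantized model can then be traced for TVM's PyTorch frontend:
script_module = torch.jit.trace(model, torch.randn(1, 3, 224, 224)).eval()
```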

[TVM Discuss] [Questions] How to get the workgroup size in OpenCL codegen?

2020-04-08 Thread Sunzj via TVM Discuss
The GPU vendor recommends "Tell the Compiler the Work-Group Size" via

```
__attribute__((reqd_work_group_size(X,Y,Z)))
```

in the OpenCL kernel function. I can't get the work-group size in codegen. Could you share how to get it?
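Not a complete answer, but a sketch of where that size lives on the TVM side (my illustration, `te` API circa 0.7): the extents bound to `threadIdx.*` in the schedule become `thread_extent` attributes in the lowered IR, which is the information the OpenCL codegen sees when it binds thread indices.

```python
import tvm
from tvm import te

n = 1024
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
bx, tx = s[B].split(B.op.axis[0], factor=64)  # 64 becomes the work-group size
s[B].bind(bx, te.thread_axis("blockIdx.x"))
s[B].bind(tx, te.thread_axis("threadIdx.x"))
# The lowered IR carries `attr thread_extent = 64` on threadIdx.x; a
# reqd_work_group_size attribute could be derived from these extents.
print(tvm.lower(s, [A, B], simple_mode=True))
```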

[TVM Discuss] [Questions] How do you test the percentage of time spent on several CUDA kernels

2020-04-08 Thread ading via TVM Discuss
Hello! I wrote an op composed of four CUDA kernels, and now I want to optimize it, so I need to know the time ratio of the four kernels. I tried nvprof but was unable to use it due to permission issues. Is there a similar profiling function in TVM? My current test code is as follows: mod
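One option that avoids nvprof (a suggestion, assuming `graph`, `lib`, and `params` come from `relay.build` and `input_data` is a prepared array): the debug graph runtime reports per-node execution time, which gives a per-kernel breakdown for the fused ops.

```python
import tvm
from tvm.contrib.debugger import debug_runtime

# Drop-in replacement for graph_runtime.create with per-node profiling.
m = debug_runtime.create(graph, lib, tvm.gpu(0))
m.set_input("data", input_data)
m.set_input(**params)
m.run()  # prints a table of time spent in each node/kernel
```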