Thank you very much for your reply.

As I said before, I followed this tutorial to deploy TVM: 
https://tvm.apache.org/docs/deploy/cpp_deploy.html. I first export the function 
built by tvm.build as a library, then load and call it from C++.

Following your suggestion, I set the CPU affinity as follows before calling 
tvm::runtime::Module::LoadFromFile:
> tvm::runtime::threading::ThreadGroup::AffinityMode mode = 
> static_cast<tvm::runtime::threading::ThreadGroup::AffinityMode>(static_cast<int>(-1));
> tvm::runtime::ThreadPool::ThreadLocal()->UpdateWorkerConfiguration(mode, 4);

The frequency of each of my CPU cores is shown below:
>     index: 7  freqs: 3130000
>     index: 4  freqs: 2544000
>     index: 5  freqs: 2544000
>     index: 6  freqs: 2544000
>     index: 0  freqs: 2045000
>     index: 1  freqs: 2045000
>     index: 2  freqs: 2045000
>     index: 3  freqs: 2045000
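
To make the core layout easier to reason about, here is a minimal sketch that ranks cores by the frequencies listed above, fastest first. `RankCoresByFreq` is a hypothetical helper written only for illustration. It is not part of TVM, and it assumes that the reported frequency is a reasonable proxy for telling big cores from little ones.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper (not a TVM API): takes (core index, frequency) pairs
// and returns the core indices ordered from fastest to slowest.
// Cores with equal frequency are ordered by ascending index.
std::vector<int> RankCoresByFreq(std::vector<std::pair<int, long>> cores) {
  std::sort(cores.begin(), cores.end(),
            [](const std::pair<int, long>& a, const std::pair<int, long>& b) {
              if (a.second != b.second) return a.second > b.second;
              return a.first < b.first;
            });
  std::vector<int> order;
  order.reserve(cores.size());
  for (const auto& core : cores) order.push_back(core.first);
  return order;
}
```

With the values above, the ranking is 7, 4, 5, 6, 0, 1, 2, 3, so the four slowest cores are 0 to 3.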

Then I unset TVM_NUM_THREADS and ran the test many times. Compared with 
before (TVM_NUM_THREADS=1), performance is indeed better. However, the run 
time fluctuates widely: for 256 * 256 * 256, the shortest run takes 1745 us 
and the longest takes 10971 us.
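
For reference, this is roughly how such min/max figures can be collected. It is a minimal sketch, not my exact benchmark code: `TimingStats`, `Summarize`, and `TimeRuns` are hypothetical names, and `fn` stands in for the call into the loaded TVM function. Warm-up runs are excluded so that one-time costs such as thread pool start-up do not distort the statistics.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <vector>

// Summary of a set of timing samples, in microseconds.
struct TimingStats {
  double min_us;
  double median_us;
  double max_us;
};

// Hypothetical helper: computes min, median, and max of the samples.
// Returns all zeros when no samples are given.
TimingStats Summarize(std::vector<double> samples_us) {
  if (samples_us.empty()) return {0.0, 0.0, 0.0};
  std::sort(samples_us.begin(), samples_us.end());
  const std::size_t n = samples_us.size();
  const double median = (n % 2 == 1)
                            ? samples_us[n / 2]
                            : (samples_us[n / 2 - 1] + samples_us[n / 2]) / 2.0;
  return {samples_us.front(), median, samples_us.back()};
}

// Hypothetical helper: calls fn `warmup` times untimed, then `runs` times
// timed with a monotonic clock, and summarizes the timed runs.
TimingStats TimeRuns(const std::function<void()>& fn, int warmup, int runs) {
  for (int i = 0; i < warmup; ++i) fn();
  std::vector<double> samples_us;
  for (int i = 0; i < runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    samples_us.push_back(
        std::chrono::duration<double, std::micro>(stop - start).count());
  }
  return Summarize(samples_us);
}
```

Reporting the median next to the minimum and maximum helps show whether the slow runs are rare outliers or the typical case.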




