Hi, I am trying to figure out why Ansor performs better than AutoTVM on GEMM 
workloads. After comparing Ansor's generated schedule with the 
'dense_large_batch' template in TOPI for CUDA code generation, I found that the 
main difference is the tiling strategy on the output stage. Ansor applies a 
more complicated tiling pattern and also tiles the output's cache stage 
('out.local') in the same way. Why does Ansor apply the same tiling pattern to 
the cache stage (it seems to be generated by the 'FollowTiling' function)? I 
have never seen this scheduling pattern in TVM's templates. Thanks a lot for 
any explanation! :smiley:

![follow tiling|690x290](upload://uApAKIjAQ4w7iYC7aJjLedfF4TS.png)
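
To make the question more concrete, here is a minimal pure-Python sketch of what I understand by the cache stage "following" the output tiling. It is not Ansor's or TVM's code: `tiled_gemm`, `tile_i`, and `tile_j` are hypothetical names, and it uses only one level of tiling instead of the multi-level split in the generated schedule. The point is just that the `out.local` tile is computed with the same split factors as the output tile it is written back to.

```python
def tiled_gemm(A, B, tile_i, tile_j):
    """Toy GEMM where the cache stage reuses the output stage's tile sizes.

    Hypothetical illustration only; it does not call into TVM or Ansor.
    """
    n, k, m = len(A), len(B), len(B[0])
    # Assumption: tile sizes divide the output shape evenly.
    assert n % tile_i == 0 and m % tile_j == 0

    C = [[0] * m for _ in range(n)]
    for io in range(n // tile_i):          # outer loops of the output stage
        for jo in range(m // tile_j):
            # Cache stage ("out.local"): same tile shape as the output tile.
            local = [[0] * tile_j for _ in range(tile_i)]
            for kk in range(k):            # reduction axis
                for ii in range(tile_i):   # inner loops follow the output split
                    for ji in range(tile_j):
                        a = A[io * tile_i + ii][kk]
                        b = B[kk][jo * tile_j + ji]
                        local[ii][ji] += a * b
            # Output stage: write the cached tile back with the same split.
            for ii in range(tile_i):
                for ji in range(tile_j):
                    C[io * tile_i + ii][jo * tile_j + ji] = local[ii][ji]
    return C
```

In this sketch each `local` tile lines up exactly with one output tile, so the write-back is a plain copy. My guess is that this alignment is what the follow-style tiling is meant to guarantee, but I would like to confirm that.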




