What was the final speed comparison with PyTorch here?

So it turned out that specializing with d1 and d2 as constants was the key part, with no need for AutoTVM?




