What do you mean exactly by "imperfect loop tiling"?

On the first issue, tensorization essentially lets us inline high-performance 
code that implements the inner-loop body of a matrix-matrix or matrix-vector 
multiplication. This is very useful when targeting special hardware intrinsics, 
such as performing an AVX-512-based GEMV, invoking an accelerator's tensor core 
ISA, or performing neat tricks like bit-serial operations with vectorized 
popcount on ARM CPUs.
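
To make the idea concrete, here is a minimal pure-Python sketch of what "inlining an inner-loop body" means. It does not use the TVM API. `gemv_intrin` is a hypothetical stand-in for a hardware GEMV intrinsic, and the tile size and function names are assumptions chosen for illustration only.

```python
def gemv_intrin(a_tile, b_col):
    """Hypothetical stand-in for a hardware GEMV intrinsic.

    Computes (tile x k) @ (k,) -> (tile,). On real hardware this would be
    a single AVX-512 or accelerator instruction sequence.
    """
    return [sum(x * y for x, y in zip(row, b_col)) for row in a_tile]


def matmul_scalar(A, B):
    """Reference matmul with a plain three-level scalar loop nest."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for kk in range(k):
                C[i][j] += A[i][kk] * B[kk][j]
    return C


def matmul_tensorized(A, B, tile=4):
    """Same computation, with the inner (ii, kk) loops replaced by one
    call to the intrinsic per (outer row tile, column) pair."""
    n, k, m = len(A), len(B), len(B[0])
    # Assumption: the row count divides evenly by the tile size,
    # so every tile is full.
    assert n % tile == 0, "this sketch only handles evenly divisible sizes"
    C = [[0] * m for _ in range(n)]
    for io in range(0, n, tile):          # outer loop over row tiles
        a_tile = A[io:io + tile]
        for j in range(m):
            b_col = [B[kk][j] for kk in range(k)]
            out = gemv_intrin(a_tile, b_col)   # the "tensorized" body
            for ii in range(tile):
                C[io + ii][j] = out[ii]
    return C
```

The two functions produce the same result; the only difference is that the tensorized version hands a whole tile of work to one opaque call instead of spelling it out as scalar loops.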




