The CPI (cycles per instruction) is a little high. One possible reason is that we generate too many redundant instructions, so tensorizing the core of the GEMM may be one solution. Since you already perform better than oneDNN, you could compute the CPU efficiency you are achieving, i.e. measured throughput as a fraction of the theoretical peak (60%, 70%, ...). If you have already reached something like 98% efficiency, there is very little room left to improve.
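As a rough sketch of that efficiency calculation: a GEMM of shape M x N x K performs about 2*M*N*K floating-point operations, and the peak can be estimated as cores x clock x FLOPs per cycle per core. Everything numeric below is an assumption for illustration, not a measurement from this thread: the function names are hypothetical, and `flops_per_cycle=64` assumes fp32 on a core with two AVX-512 FMA units (2 units x 16 lanes x 2 FLOPs). Substitute the values for your own CPU and your own measured time.

```python
def gemm_flops(m, n, k):
    """Floating-point operations in an (m x k) @ (k x n) GEMM: one multiply and one add per term."""
    return 2.0 * m * n * k


def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS. flops_per_cycle is per core and depends on the ISA (assumed, check your CPU)."""
    return cores * ghz * flops_per_cycle


def efficiency(m, n, k, seconds, cores, ghz, flops_per_cycle):
    """Achieved GFLOPS divided by theoretical peak GFLOPS."""
    achieved = gemm_flops(m, n, k) / seconds / 1e9
    return achieved / peak_gflops(cores, ghz, flops_per_cycle)


# Hypothetical example: 1024^3 GEMM taking 20 ms on one 2.5 GHz core.
eff = efficiency(1024, 1024, 1024, seconds=0.02, cores=1, ghz=2.5, flops_per_cycle=64)
print(f"efficiency: {eff:.1%}")
```

Note that the sustained clock under heavy vector load is often lower than the nominal frequency, so the true peak may be lower than this estimate and the real efficiency somewhat higher.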