Did you mean LOG_BLOCK=4 or just BLOCK=4?

If LOG_BLOCK=4, then BLOCK_IN and BLOCK_OUT are both 2^4 = 16. A single GEMM 
instruction therefore performs 16x16 multiply-accumulate (MAC) operations, 
that is 256 MACs per GEMM instruction. By my calculation, counting each MAC 
as one operation:

256 MACs/cycle * 0.142 GHz = 36.352 GOps/s

Note that in the current Chisel implementation, a single GEMM takes 4 
cycles/stages to complete. We might see a performance regression because of that.
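
To make the arithmetic above easy to redo for other configurations, here is a minimal Python sketch. The function name `peak_gops` and the `cycles_per_gemm` parameter are my own, not part of VTA. It assumes a batch size of 1 and counts each MAC as one operation, as in the calculation above. Setting `cycles_per_gemm=4` models a hypothetical worst case where the 4 cycles/stages are not overlapped at all; if the stages are pipelined, the real figure should sit closer to the peak.

```python
def peak_gops(log_block: int, freq_ghz: float, cycles_per_gemm: int = 1) -> float:
    """Estimate GEMM throughput in GOps/s (one MAC counted as one op).

    Assumptions (not taken from the VTA source): batch size of 1, and
    cycles_per_gemm is the number of cycles between issued GEMMs.
    """
    block = 1 << log_block            # BLOCK_IN == BLOCK_OUT == 2**LOG_BLOCK
    macs_per_gemm = block * block     # e.g. 16 x 16 = 256 MACs
    return macs_per_gemm * freq_ghz / cycles_per_gemm


# Peak figure for LOG_BLOCK=4 at 0.142 GHz (142 MHz)
print(f"peak:          {peak_gops(4, 0.142):.3f} GOps/s")

# Hypothetical worst case: one GEMM every 4 cycles, no overlap
print(f"non-pipelined: {peak_gops(4, 0.142, cycles_per_gemm=4):.3f} GOps/s")
```

If you count a MAC as two operations (one multiply plus one add), double these numbers.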
