fhahn added a comment.

In D99433#2661919 <https://reviews.llvm.org/D99433#2661919>, 
@everton.constantino wrote:

> @fhahn Ok, I see what you mean now. This sounds like a doable path and might 
> be able to cover architectures with specialized matrix multiplication 
> instructions as well.
>
> Just to see if I understand correctly: I can add a matrix_add intrinsic, do a 
> traversal looking for matrix_multiply, and fuse both, changing 
> `LowerMatrixMultiplyFused` to support pre-loading the accumulator. Is that 
> correct?

Yes, that sounds like a good path forward! I think at the moment, adding a 
matrix_mul_add intrinsic may be a bit premature, as we can just match & lower 
directly in place, as we already do in `LowerMatrixMultiplyFused`. Once we add 
more and more such transforms, it may really help to have additional intrinsics 
(or we could just create our own dummy declarations which are only used during 
the matrix lowering, to avoid adding too many intrinsics). But for now I think 
we can move along faster without adding a new intrinsic.
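
To illustrate what "pre-loading the accumulator" means, here is a minimal 
standalone sketch. It does not use any LLVM APIs and is not the code in 
`LowerMatrixMultiplyFused`; the names `Matrix`, `multiplyThenAdd` and 
`multiplyAddFused` are made up for this example, and the column-major layout 
is an assumption. It only shows that starting the accumulator at C instead of 
zero gives the same result as a separate add after the multiply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat matrix type for this sketch (not an LLVM type).
// Assumed column-major: element (row, col) lives at index col * rows + row.
using Matrix = std::vector<double>;

// Unfused form: T = A * B with a zero-initialized accumulator,
// followed by a separate element-wise add of C.
// A is M x K, B is K x N, C is M x N.
Matrix multiplyThenAdd(const Matrix &A, const Matrix &B, const Matrix &C,
                       std::size_t M, std::size_t K, std::size_t N) {
  Matrix T(M * N, 0.0);
  for (std::size_t j = 0; j < N; ++j)
    for (std::size_t k = 0; k < K; ++k)
      for (std::size_t i = 0; i < M; ++i)
        T[j * M + i] += A[k * M + i] * B[j * K + k];
  for (std::size_t idx = 0; idx < M * N; ++idx)
    T[idx] += C[idx];
  return T;
}

// Fused form: the accumulator is pre-loaded with C, so the separate
// add (and the temporary holding A * B) disappears.
Matrix multiplyAddFused(const Matrix &A, const Matrix &B, const Matrix &C,
                        std::size_t M, std::size_t K, std::size_t N) {
  Matrix R = C; // pre-load the accumulator
  for (std::size_t j = 0; j < N; ++j)
    for (std::size_t k = 0; k < K; ++k)
      for (std::size_t i = 0; i < M; ++i)
        R[j * M + i] += A[k * M + i] * B[j * K + k];
  return R;
}
```

Note that the two forms add C at different points in the summation, so with 
floating-point values the results can differ in rounding; they agree exactly 
for inputs such as small integers.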


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99433/new/

https://reviews.llvm.org/D99433
