fhahn added a comment.

In D99433#2661919 <https://reviews.llvm.org/D99433#2661919>, @everton.constantino wrote:

> @fhahn Ok I see what you mean now, this sounds like a doable path and might
> be able to cover architectures with specialized matrix multiplication
> instructions as well.
>
> Just to see if I understand correctly I can add a matrix_add intrinsic, do a
> traversal looking for matrix_multiply and fuse both changing
> `LowerMatrixMultiplyFused` to support pre-loading the accumulator. Is that
> correct?

Yes, that sounds like a good path forward! I think at the moment, adding a matrix_mul_add intrinsic may be a bit premature, as we can just match & lower directly in place, as we already do in `LowerMatrixMultiplyFused`. Once we add more and more such transforms, it may really help to have additional intrinsics (or we could just create our own dummy declarations which are only used during the matrix lowering, to avoid adding too many intrinsics). But for now I think we can move along faster without adding a new intrinsic.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99433/new/

https://reviews.llvm.org/D99433
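
For illustration only, the effect of fusing an add into a multiply by pre-loading the accumulator can be sketched in plain C++. This is not LLVM code and does not reflect how `LowerMatrixMultiplyFused` is implemented; the `Matrix` type and both function names are hypothetical, and column-major storage is an assumption made for the sketch. The unfused version starts each accumulator at zero and adds `C` in a second pass, while the fused version starts each accumulator at the matching element of `C`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major matrix, for illustration only (not an LLVM type).
struct Matrix {
  std::size_t rows, cols;
  std::vector<double> data;

  Matrix(std::size_t r, std::size_t c, double init = 0.0)
      : rows(r), cols(c), data(r * c, init) {}

  double &at(std::size_t r, std::size_t c) { return data[c * rows + r]; }
  double at(std::size_t r, std::size_t c) const { return data[c * rows + r]; }
};

// Unfused: compute T = A * B with a zero-initialized accumulator,
// then add C in a separate pass over the result.
Matrix multiplyThenAdd(const Matrix &A, const Matrix &B, const Matrix &C) {
  assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
  Matrix T(A.rows, B.cols);
  for (std::size_t j = 0; j < B.cols; ++j)
    for (std::size_t i = 0; i < A.rows; ++i) {
      double acc = 0.0;
      for (std::size_t k = 0; k < A.cols; ++k)
        acc += A.at(i, k) * B.at(k, j);
      T.at(i, j) = acc;
    }
  Matrix R(A.rows, B.cols);
  for (std::size_t j = 0; j < B.cols; ++j)
    for (std::size_t i = 0; i < A.rows; ++i)
      R.at(i, j) = T.at(i, j) + C.at(i, j);
  return R;
}

// Fused: pre-load the accumulator from C, so no temporary matrix
// and no second pass are needed.
Matrix multiplyAddFused(const Matrix &A, const Matrix &B, const Matrix &C) {
  assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
  Matrix R(A.rows, B.cols);
  for (std::size_t j = 0; j < B.cols; ++j)
    for (std::size_t i = 0; i < A.rows; ++i) {
      double acc = C.at(i, j); // accumulator starts at C instead of zero
      for (std::size_t k = 0; k < A.cols; ++k)
        acc += A.at(i, k) * B.at(k, j);
      R.at(i, j) = acc;
    }
  return R;
}
```

Note that the two versions sum the terms in a different order, so with floating-point inputs the results may differ in rounding; they agree exactly when all intermediate values are exactly representable, for example with small integers.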