everton.constantino added a comment.

@fhahn OK, I see what you mean now. This sounds like a doable path, and it
might also cover architectures with specialized matrix multiplication
instructions.

Just to check that I understand correctly: I can add a `matrix_add` intrinsic,
do a traversal looking for `matrix_multiply`, and fuse the two, changing
`LowerMatrixMultiplyFused` to support pre-loading the accumulator. Is that
correct?
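
To make the question concrete, here is a rough sketch of the traversal I have in mind. It is a toy model, not LLVM code: `Node`, `Op`, `countUses` and `fuseMultiplyAdd` are hypothetical names, the "IR" is just a vector of nodes whose operands are indices into that vector, and the single-use check on the multiply is my own assumption about when fusing would be safe.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for IR instructions (hypothetical, not LLVM's API).
enum class Op { Load, MatrixMultiply, MatrixAdd, FusedMultiplyAdd };

struct Node {
  Op op;
  std::vector<int> operands; // indices of other nodes in the same vector
};

// Count how many times node `id` appears as an operand.
static int countUses(const std::vector<Node> &nodes, int id) {
  int uses = 0;
  for (const Node &n : nodes)
    for (int o : n.operands)
      if (o == id)
        ++uses;
  return uses;
}

// Rewrite add(multiply(A, B), C) into fused(A, B, C), where C plays the
// role of the pre-loaded accumulator. Returns the number of fusions.
int fuseMultiplyAdd(std::vector<Node> &nodes) {
  int fused = 0;
  for (Node &n : nodes) {
    if (n.op != Op::MatrixAdd)
      continue;
    assert(n.operands.size() == 2 && "matrix add takes two operands");
    for (int i = 0; i < 2; ++i) {
      int mulId = n.operands[i];
      // Only fuse when the multiply has no other users (assumption).
      if (nodes[mulId].op != Op::MatrixMultiply ||
          countUses(nodes, mulId) != 1)
        continue;
      int lhs = nodes[mulId].operands[0];
      int rhs = nodes[mulId].operands[1];
      int acc = n.operands[1 - i];
      n = Node{Op::FusedMultiplyAdd, {lhs, rhs, acc}};
      ++fused;
      break;
    }
  }
  return fused;
}
```

In this sketch the original multiply node is left behind as dead code; a real pass would have to erase it and would do the actual tiling inside the lowering.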


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99433/new/

https://reviews.llvm.org/D99433

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
