http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
Bug #: 51119
Summary: MATMUL slow for large matrices
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: libfortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: j...@gcc.gnu.org

Compared to ATLAS BLAS on an AMD 10h processor, MATMUL on square matrices with n > 256 is around a factor of 8 slower. While I don't think it's worth spending the time on target-specific parameters and/or asm-coded inner kernel as high-performance BLAS implementations do, I suspect that a little effort towards cache blocking could improve things.
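
For illustration only, here is a minimal sketch in C of what cache blocking means for a square matrix product. It is not the libgfortran implementation and not a proposed patch. The function name matmul_blocked and the block-size parameter bs are hypothetical, the matrices are assumed to be dense n x n in column-major (Fortran) order, and no tuned block size is implied.

```c
#include <assert.h>
#include <stddef.h>

/* Smaller of two sizes; used to clip a tile at the matrix edge. */
static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/*
 * Hypothetical sketch: C = A * B for dense n x n matrices stored in
 * column-major order, so element (i, j) lives at index i + j * n.
 * bs is an assumed tile edge length; a useful value depends on the
 * cache sizes of the target and is not specified here.
 */
void matmul_blocked(size_t n, size_t bs,
                    const double *a, const double *b, double *c)
{
    assert(bs > 0);

    /* Start from a zeroed result so tiles can accumulate into it. */
    for (size_t idx = 0; idx < n * n; idx++)
        c[idx] = 0.0;

    /* Outer three loops walk over tiles of C, of the shared dimension,
       and of the rows, so each tile of A and B is reused while it is
       still likely to be in cache. */
    for (size_t jj = 0; jj < n; jj += bs) {
        size_t jmax = min_sz(jj + bs, n);
        for (size_t kk = 0; kk < n; kk += bs) {
            size_t kmax = min_sz(kk + bs, n);
            for (size_t ii = 0; ii < n; ii += bs) {
                size_t imax = min_sz(ii + bs, n);

                /* Inner three loops: plain product on one tile.
                   The innermost loop runs down a column, which is
                   the contiguous direction in column-major storage. */
                for (size_t j = jj; j < jmax; j++) {
                    for (size_t k = kk; k < kmax; k++) {
                        double bkj = b[k + j * n];
                        for (size_t i = ii; i < imax; i++)
                            c[i + j * n] += a[i + k * n] * bkj;
                    }
                }
            }
        }
    }
}
```

The result is the same as the plain triple loop; only the order of memory accesses changes. Whether this narrows the gap to ATLAS, and by how much, would have to be measured.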