http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119
Bug #: 51119
Summary: MATMUL slow for large matrices
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: libfortran
AssignedTo: [email protected]
ReportedBy: [email protected]
On an AMD 10h processor, MATMUL on square n x n matrices with n > 256 is
roughly 8 times slower than ATLAS BLAS.
While I don't think it's worth spending the time on target-specific parameters
and/or an asm-coded inner kernel, as high-performance BLAS implementations do,
I suspect that a little effort towards cache blocking could improve things.
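
To illustrate what cache blocking means here, a minimal sketch in C follows. It
is not the libgfortran code; the function name matmul_blocked and the tile
size BLOCK = 64 are assumptions made up for this example, and the tile size
has not been tuned or benchmarked. It assumes square n x n matrices of doubles
in column-major (Fortran) order. The idea is to work on BLOCK x BLOCK tiles so
that the pieces of A, B and C being touched stay in cache while they are
reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge length; a real value would depend on cache size. */
enum { BLOCK = 64 };

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/*
 * Computes c = a * b for n-by-n matrices of doubles stored in
 * column-major order, i.e. element (i, j) lives at index i + j * n.
 * The three outer loops step over tiles; the three inner loops do an
 * ordinary multiply restricted to one tile of each matrix.
 */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));

    /* Start from a zeroed result, since tiles accumulate into c. */
    for (size_t idx = 0; idx < n * n; idx++)
        c[idx] = 0.0;

    for (size_t jj = 0; jj < n; jj += BLOCK) {
        size_t jmax = min_size(jj + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t kmax = min_size(kk + BLOCK, n);
            for (size_t ii = 0; ii < n; ii += BLOCK) {
                size_t imax = min_size(ii + BLOCK, n);
                for (size_t j = jj; j < jmax; j++) {
                    for (size_t k = kk; k < kmax; k++) {
                        double bkj = b[k + j * n];
                        /* Innermost loop walks down a column: unit stride. */
                        for (size_t i = ii; i < imax; i++)
                            c[i + j * n] += a[i + k * n] * bkj;
                    }
                }
            }
        }
    }
}
```

The innermost loop runs over i so that accesses to A and C are contiguous in
column-major storage. The min_size calls handle matrix sizes that are not a
multiple of BLOCK.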