https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68600
--- Comment #3 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Created attachment 36868
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36868&action=edit
Modified benchmark (really this time)

Hi Dominique,

I think you are seeing the effects of inefficiencies of assumed-shape
arrays.

If you want to use matmul on very small matrix sizes, it is best to
use fixed-size explicit arrays.

Below are the results of the modified benchmark (some changes to keep
the optimizer honest, such as a call to a dummy subroutine) on my
rather dated home box:

Size   Loops          Matmul   dgemm   Matmul             Matmul
              fixed explicit          assumed  variable explicit
================================================================
   2  200000          11.948   0.072    0.142              0.411
   4  200000           1.711   0.417    0.534              0.861
   8  200000           2.314   0.953    0.858              1.076
  16  200000           1.745   1.276    0.918              1.000
  32  200000           1.459   1.456    1.371              1.436
  64   30757           1.501   1.440    1.360              1.393
 128    3829           1.586   1.544    1.557              1.529
 256     477           1.531   1.519    1.544              1.507
 512      59           1.315   1.290    1.263              1.231
1024       7           1.110   1.081    1.069              1.053
2048       1           1.095   1.086    1.081              1.058
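
For readers who do not have the attachment at hand, here is a minimal
sketch of the three dummy-argument declaration styles that the matmul
columns distinguish. It is not the attached benchmark: the module name
mm_variants, the subroutine names and the size nfix = 4 are assumptions
made up for illustration, and no timing is done.

```fortran
! Hypothetical sketch, not the attached benchmark: three ways of
! declaring the array arguments that are handed to matmul.
module mm_variants
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nfix = 4      ! assumed compile-time size
contains
  ! Fixed-size explicit shape: the bounds are compile-time constants.
  subroutine mm_fixed(a, b, c)
    real(dp), intent(in)  :: a(nfix,nfix), b(nfix,nfix)
    real(dp), intent(out) :: c(nfix,nfix)
    c = matmul(a, b)
  end subroutine mm_fixed

  ! Variable explicit shape: the bounds come from the dummy argument n.
  subroutine mm_explicit(n, a, b, c)
    integer, intent(in)   :: n
    real(dp), intent(in)  :: a(n,n), b(n,n)
    real(dp), intent(out) :: c(n,n)
    c = matmul(a, b)
  end subroutine mm_explicit

  ! Assumed shape: the bounds are taken from the actual argument at
  ! run time.
  subroutine mm_assumed(a, b, c)
    real(dp), intent(in)  :: a(:,:), b(:,:)
    real(dp), intent(out) :: c(:,:)
    c = matmul(a, b)
  end subroutine mm_assumed
end module mm_variants

program demo
  use mm_variants
  implicit none
  real(dp) :: a(nfix,nfix), b(nfix,nfix)
  real(dp) :: c1(nfix,nfix), c2(nfix,nfix), c3(nfix,nfix)
  call random_number(a)
  call random_number(b)
  call mm_fixed(a, b, c1)
  call mm_explicit(nfix, a, b, c2)
  call mm_assumed(a, b, c3)
  ! The three variants should agree; this prints the largest
  ! differences between their results.
  print *, maxval(abs(c1 - c2)), maxval(abs(c1 - c3))
end program demo
```

Only mm_fixed gives the compiler the matrix size at compile time, which
is presumably why that column stands out at sizes 2 to 8 in the table.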