Hello world,

there was still one piece missing for the new matmul library version. To make sure that users (usually) benefit, we need to call the library by default above a certain size limit. The attached patch does that with a limit of 30, which seems reasonable given a few benchmarks.
Some test cases had to be changed to scan the optimized tree dump instead of the original one, because the original dump still had some if (0) statements in it.

Regression-tested. OK for trunk?

Regards

	Thomas

2017-02-25  Thomas Koenig  <tkoe...@gcc.gnu.org>

	PR fortran/51119
	* options.c (gfc_post_options): Set default limit for matmul
	inlining to 30.
	* invoke.texi: Document change.

2017-02-25  Thomas Koenig  <tkoe...@gcc.gnu.org>

	PR fortran/51119
	* gfortran.dg/inline_matmul_1.f90: Scan optimized dump instead
	of original.
	* gfortran.dg/inline_matmul_11.f90: Likewise.
	* gfortran.dg/inline_matmul_9.f90: Likewise.
	* gfortran.dg/matmul_13.f90: New test.
	* gfortran.dg/matmul_14.f90: New test.

Index: fortran/invoke.texi
===================================================================
--- fortran/invoke.texi	(Revision 245564)
+++ fortran/invoke.texi	(Arbeitskopie)
@@ -1630,7 +1630,7 @@ square, the size comparison is performed using the
 the dimensions of the argument and result matrices.
 
 The default value for @var{n} is the value specified for
-@code{-fblas-matmul-limit} if this option is specified, or unlimitited
+@code{-fblas-matmul-limit} if this option is specified, or 30
 otherwise.
 
 @item -frecursive
Index: fortran/options.c
===================================================================
--- fortran/options.c	(Revision 245564)
+++ fortran/options.c	(Arbeitskopie)
@@ -388,10 +388,16 @@ gfc_post_options (const char **pfilename)
   if (!flag_automatic)
     flag_max_stack_var_size = 0;
 
-  /* If we call BLAS directly, only inline up to the BLAS limit. */
+  /* If the user did not specify an inline matmul limit, inline up to the BLAS
+     limit or up to 30 if no external BLAS is specified. */
 
-  if (flag_external_blas && flag_inline_matmul_limit < 0)
-    flag_inline_matmul_limit = flag_blas_matmul_limit;
+  if (flag_inline_matmul_limit < 0)
+    {
+      if (flag_external_blas)
+        flag_inline_matmul_limit = flag_blas_matmul_limit;
+      else
+        flag_inline_matmul_limit = 30;
+    }
 
   /* Optimization implies front end optimization, unless the user
      specified it directly. */
Index: testsuite/gfortran.dg/inline_matmul_1.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_1.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_1.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-options "-ffrontend-optimize -fdump-tree-original -Wrealloc-lhs" }
+! { dg-options "-ffrontend-optimize -fdump-tree-optimized -Wrealloc-lhs" }
 ! PR 37131 - check basic functionality of inlined matmul, making
 ! sure that the library is not called, with and without reallocation.
 
@@ -149,4 +149,4 @@ program main
 
 end program main
 
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }
Index: testsuite/gfortran.dg/inline_matmul_11.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_11.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_11.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-additional-options "-ffrontend-optimize -fdump-tree-original" }
+! { dg-additional-options "-ffrontend-optimize -fdump-tree-optimized" }
 ! PR fortran/66176 - inline conjg for matml.
 program main
   complex, dimension(3,2) :: a
@@ -29,4 +29,4 @@ program main
   c = matmul(conjg(a), b)
   if (any(conjg(c) /= res2)) call abort
 end program main
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }
Index: testsuite/gfortran.dg/inline_matmul_9.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_9.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_9.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-options "-ffrontend-optimize -fdump-tree-original" }
+! { dg-options "-ffrontend-optimize -fdump-tree-optimized" }
 ! PR 66041 - this used to ICE with an incomplete fix for the PR.
 program main
   implicit none
@@ -21,4 +21,4 @@ program main
   if (any (c2-reshape([248., -749.],shape(c2)) /= 0.)) call abort
 
 end program main
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }

! { dg-do compile }
! { dg-options "-O3 -fdump-tree-optimized" }
! Check that the default limit of 30 for inlining matmul applies.
program main
  integer, parameter :: n = 31
  real, dimension(n,n) :: a, b, c
  call random_number(a)
  call random_number(b)
  c = matmul(a,b)
  print *,sum(c)
end program main
! { dg-final { scan-tree-dump-times "_gfortran_matmul_r4" 1 "optimized" } }

! { dg-do compile }
! { dg-options "-O3 -fdump-tree-optimized" }
! Check that the default limit of 30 for inlining matmul applies.
program main
  integer, parameter :: n = 30
  real, dimension(n,n) :: a, b, c
  call random_number(a)
  call random_number(b)
  c = matmul(a,b)
  print *,sum(c)
end program main
! { dg-final { scan-tree-dump-times "_gfortran_matmul_r4" 0 "optimized" } }
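
For anyone who wants to check the default selection without building the compiler, the logic the patch adds to gfc_post_options can be sketched as a stand-alone C function. This is only an illustration: the function name resolve_inline_matmul_limit and its parameters are hypothetical and not part of GCC; the parameters stand in for flag_inline_matmul_limit, flag_external_blas and flag_blas_matmul_limit, and a negative inline limit is assumed to mean that the user did not specify one.

```c
/* Hypothetical stand-alone helper mirroring the patched logic in
   gfc_post_options; it is not part of GCC.

   inline_limit   stands in for flag_inline_matmul_limit; a negative
                  value is assumed to mean "not given by the user".
   external_blas  stands in for flag_external_blas.
   blas_limit     stands in for flag_blas_matmul_limit.  */
int
resolve_inline_matmul_limit (int inline_limit, int external_blas,
                             int blas_limit)
{
  /* An explicit user setting always wins.  */
  if (inline_limit >= 0)
    return inline_limit;

  /* Otherwise inline up to the BLAS limit if an external BLAS is
     used, or up to 30 if it is not.  */
  return external_blas ? blas_limit : 30;
}
```

With no user setting and no external BLAS, the sketch returns 30, which matches the two new test cases: a 30x30 matmul is expected to be inlined, while a 31x31 one is expected to call the library.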