https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565
--- Comment #7 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Tue, Aug 09, 2022 at 05:17:57PM +0000, quanhua.liu at noaa dot gov wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565
>
> --- Comment #5 from Quanhua Liu <quanhua.liu at noaa dot gov> ---
> Hi Richard,
>
> Using -fexternal-blas for gfortran v10.3.0 is much slower than
> the method 2:
>    BB = transpose(B)
>    C = matmul(A, BB)
>
> How about on your machine?
>
>
>
> > If you are doing a problem of this size or larger, you want to use the
> > -fexternal-blas option and link in OpenBLAS.

I wrote "and link in OpenBLAS".

> > I added timing code and replicated the loop to both in one go.
> >
> > % gfcx -o z -O3 -march=native a.f90 && ./z
> >    1.16500998       1615.08594
> >    5.32258606       1615.08020
> > % gfcx -o z -O3 -march=native a.f90 -fexternal-blas -lopenblas && ./z
> >    2.44668889       1615.08301
> >    1.99379802       1615.08301

Method 1 is faster with OpenBLAS.
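
For readers without the attachment: a.f90 itself is not reproduced in this
comment, so the following is only a minimal sketch of a comparable timing
test, not the original program. It assumes that "method 1" is the transpose
written inside the matmul call and takes "method 2" from comment #5. The
matrix size n, the program name and the variable names are all hypothetical.

```fortran
! Hypothetical stand-in for a.f90, which is not shown in this thread.
program matmul_timing
   use iso_fortran_env, only: int64
   implicit none
   integer, parameter :: n = 1000      ! assumed problem size
   real, allocatable :: a(:,:), b(:,:), bb(:,:), c(:,:)
   integer(int64) :: t0, t1, rate

   allocate(a(n,n), b(n,n), bb(n,n), c(n,n))
   call random_number(a)
   call random_number(b)

   ! Assumed "method 1": transpose written inside the matmul call.
   call system_clock(t0, rate)
   c = matmul(a, transpose(b))
   call system_clock(t1)
   ! Print elapsed seconds and one element so the product is actually used.
   print *, real(t1 - t0) / real(rate), c(n,n)

   ! "Method 2" from comment #5: explicit transposed temporary.
   call system_clock(t0)
   bb = transpose(b)
   c = matmul(a, bb)
   call system_clock(t1)
   print *, real(t1 - t0) / real(rate), c(n,n)
end program matmul_timing
```

Building this once with plain "gfortran -O3 -march=native" and once with
"-fexternal-blas -lopenblas" appended should give a comparison of the same
shape as the one quoted above. The actual timings will depend on the machine
and on the OpenBLAS build.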