https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565

--- Comment #7 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Tue, Aug 09, 2022 at 05:17:57PM +0000, quanhua.liu at noaa dot gov wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565
> 
> --- Comment #5 from Quanhua Liu <quanhua.liu at noaa dot gov> ---
> Hi Richard,
> 
> Using -fexternal-blas with gfortran v10.3.0 is much slower than
> method 2:
>    BB = transpose(B)
>    C = matmul(A, BB)
> 
> How about on your machine?
> 
> >
> > If you are doing a problem of this size or larger, you want to use the
> > -fexternal-blas option and link in OpenBLAS.


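I wrote "and link in OpenBLAS".  The -fexternal-blas option only
makes gfortran emit calls to BLAS routines; the speed depends on
the BLAS library you link against.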

> > I added timing code and replicated the loop to both in one go.
> >
> > % gfcx -o z -O3 -march=native a.f90 && ./z
> >     1.16500998       1615.08594
> >     5.32258606       1615.08020


> > % gfcx -o z -O3 -march=native a.f90 -fexternal-blas -lopenblas && ./z
> >     2.44668889       1615.08301
> >     1.99379802       1615.08301

Method 1 is faster with OpenBLAS (5.32 drops to 1.99 in the
timings above).
