[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

mjr19 at cam dot ac.uk via Gcc-bugs Thu, 18 Apr 2024 10:58:43 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767


--- Comment #4 from mjr19 at cam dot ac.uk ---
An issue which I suspect is related is shown by:

subroutine zradd(c,n)
  integer :: i,n
  complex(kind(1d0)) :: c(*)

  do i=1,n
     c(i)=c(i)+1d0
  enddo
end subroutine

Compiled with gfortran-14 and -O3 -mavx2, the generated code all looks very
sensible.

If one adds -ffast-math, it looks a lot less sensible, and takes over 70%
longer to run. I think the compiler has changed strategy. Without -ffast-math
it promotes 1d0 to (1d0,0d0) and adds that (which one might argue a strict
interpretation of the Fortran standard requires, though I am not certain that
it does). With -ffast-math it collects all the real parts in a vector, adds
1d0 to them, and avoids adding 0d0 to the imaginary parts. Unsurprisingly the
gain from halving the number of additions is more than offset by the required
vperms and vshufs.
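
For illustration, the two strategies can be written out at source level. This
is only a sketch: the names zradd_variants, zradd_promote and zradd_realonly
are hypothetical and not part of the report, and the compiler is free to
generate the same code for both loops, so this shows the intent rather than
what gfortran emits. The second routine uses the Fortran 2008 %re part
designator. The two forms can differ only in the sign of a zero imaginary
part (-0d0 + 0d0 gives +0d0), which is presumably why the real-only form
appears only once -ffast-math allows signed zeros to be ignored.

```fortran
! Sketch only: source-level analogues of the two strategies described above.
! All names here are hypothetical and chosen for illustration.
module zradd_variants
  implicit none
  integer, parameter :: dp = kind(1d0)
contains

  ! Promote 1d0 to (1d0,0d0) and add both parts: the real and imaginary
  ! parts stay interleaved, so this maps onto plain contiguous vector adds.
  subroutine zradd_promote(c, n)
    integer, intent(in) :: n
    complex(dp), intent(inout) :: c(*)
    integer :: i
    do i = 1, n
       c(i) = c(i) + cmplx(1d0, 0d0, kind=dp)
    enddo
  end subroutine zradd_promote

  ! Update only the real parts: half the additions, but the real parts sit
  ! at stride 2 in memory, so vectorising needs shuffles/permutes.
  subroutine zradd_realonly(c, n)
    integer, intent(in) :: n
    complex(dp), intent(inout) :: c(*)
    integer :: i
    do i = 1, n
       c(i)%re = c(i)%re + 1d0
    enddo
  end subroutine zradd_realonly

end module zradd_variants

program demo
  use zradd_variants
  implicit none
  complex(dp) :: a(4), b(4)
  integer :: i

  ! Test data with nonzero imaginary parts, so signed zeros play no role.
  a = [(cmplx(real(i, dp), -real(i, dp), kind=dp), i = 1, 4)]
  b = a

  call zradd_promote(a, 4)
  call zradd_realonly(b, 4)

  ! Both variants must agree on this data.
  if (any(a /= b)) error stop "variants disagree"
  print *, a
end program demo
```

Timing the two routines separately, with and without -ffast-math, would be
one way to check whether the slowdown follows the real-only form.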

Ideally -ffast-math would have noticed that adding 0d0 to the imaginary part is
not necessary, but then concluded that doing so anyway was faster than any
alternative method, and kept the addition.
