[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

roger at nextmovesoftware dot com via Gcc-bugs Thu, 18 Apr 2024 11:01:46 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767


--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
Another interesting (simpler) case of -ffast-math pessimization is:
void foo(_Complex double *c)
{
    for (int i=0; i<16; i++)
      c[i] += __builtin_complex(1.0,0.0);
}

Again without -ffast-math we vectorize consecutive additions, but with
-ffast-math we (not so) cleverly avoid every second addition by producing
significantly larger code that shuffles the real/imaginary parts around.

This even suggests a missed-optimization for:
void bar(_Complex double *c, double x)
{
    for (int i=0; i<16; i++)
      c[i] += x;
}

which may be more efficiently implemented (when safe) by:
void bar(_Complex double *c, double x)
{
    for (int i=0; i<16; i++)
      c[i] += __builtin_complex(x,0.0);
}

i.e. insert/interleave a no-op zero addition, to simplify the vectorization.
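
The "(when safe)" caveat can be checked on small inputs. The following is only a sketch: the names bar_scalar, bar_padded and check_bar_forms are made up for illustration, and it assumes GCC (or another compiler providing __builtin_complex) with default floating-point flags. The two forms give numerically equal results, but adding +0.0 to an imaginary part of -0.0 produces +0.0, so the padded form is a true no-op only when signed zeros can be ignored (as -ffast-math permits).

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

#define N 16

/* Original form: add a real scalar to each complex element. */
void bar_scalar(_Complex double *c, double x)
{
    for (int i = 0; i < N; i++)
        c[i] += x;
}

/* Padded form: add (x, 0.0), so real and imaginary lanes both see an add. */
void bar_padded(_Complex double *c, double x)
{
    for (int i = 0; i < N; i++)
        c[i] += __builtin_complex(x, 0.0);
}

/* Hypothetical self-check, not part of the bug report. */
void check_bar_forms(void)
{
    _Complex double a[N], b[N];

    /* Start every element with an imaginary part of negative zero. */
    for (int i = 0; i < N; i++)
        a[i] = b[i] = __builtin_complex((double)i, -0.0);

    bar_scalar(a, 2.5);
    bar_padded(b, 2.5);

    for (int i = 0; i < N; i++) {
        /* Numerically equal: == does not distinguish -0.0 from +0.0. */
        assert(creal(a[i]) == creal(b[i]));
        assert(cimag(a[i]) == cimag(b[i]));
        /* -0.0 + 0.0 is +0.0 in the default rounding mode, so the padded
           form has cleared the sign of the imaginary zero. */
        assert(!signbit(cimag(b[i])));
    }
}
```

Calling check_bar_forms() from a test harness should pass silently when built without -ffast-math.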

The existence of a suitable identity operation (+0, *1.0, &~0, |0, ^0) can be
used to avoid shuffling/permuting values/lanes out of vectors, when it's
possible for the vector operation to leave the other values unchanged.
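
As a rough scalar illustration of the same idea (all names here are hypothetical, and plain arrays stand in for vector lanes), an update that touches only some lanes can be rewritten as a uniform operation over all lanes, with the identity element placed in the lanes that must stay unchanged:

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Selective update: only even lanes are modified, which in vector code
   would mean picking those lanes out of the vector. */
void xor_even(uint32_t v[LANES], uint32_t k)
{
    for (int i = 0; i < LANES; i += 2)
        v[i] ^= k;
}

/* Uniform update: every lane is XORed; odd lanes use the identity 0. */
void xor_padded(uint32_t v[LANES], uint32_t k)
{
    const uint32_t operand[LANES] = { k, 0, k, 0 };
    for (int i = 0; i < LANES; i++)
        v[i] ^= operand[i];
}

/* Same idea for AND, whose identity is all-ones (~0). */
void and_padded(uint32_t v[LANES], uint32_t k)
{
    const uint32_t operand[LANES] = { k, UINT32_MAX, k, UINT32_MAX };
    for (int i = 0; i < LANES; i++)
        v[i] &= operand[i];
}

/* Hypothetical self-check: both XOR forms agree, and the padded AND
   leaves the odd lanes untouched. */
void check_identity_lanes(void)
{
    uint32_t a[LANES] = { 1, 2, 3, 4 };
    uint32_t b[LANES] = { 1, 2, 3, 4 };

    xor_even(a, 0xffu);
    xor_padded(b, 0xffu);
    for (int i = 0; i < LANES; i++)
        assert(a[i] == b[i]);

    and_padded(b, 0x0fu);
    assert(b[1] == 2 && b[3] == 4);
}
```

The integer identities are exact; the floating-point ones (+0, *1.0) carry the usual caveats around signed zeros and NaNs.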
