https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122887

            Bug ID: 122887
           Summary: C double vs double complex vectorisation differences
           Product: gcc
           Version: 14.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mjr19 at cam dot ac.uk
  Target Milestone: ---

I was comparing the compilation of code using C99 double complex with code
explicitly manipulating the individual components.

void foo(double *a, double *b, double cr, double ci, int n){
  int i;
  for(i=0;i<n;i++){
    b[2*i]=a[2*i]*cr-a[2*i+1]*ci;
    b[2*i+1]=a[2*i]*ci+a[2*i+1]*cr;
  }
}

is the real version, and, when compiled with

gcc-15.1 -O3 -march=core-avx2

it uses the full length of the ymm registers, and performs a run-time check for
overlap of a and b (I think). Very good.

In contrast

#include <complex.h>
void foo(double complex *a, double complex *b, double complex c, int n){
  int i;
  for(i=0;i<n;i++)
    b[i]=a[i]*c;
}

does not vectorise beyond the xmm registers, even if "#pragma omp simd" is
added, and is significantly slower to run.

I must admit to having little experience of complex in C99, but am I wrong to
expect these two short examples to be effectively identical?
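
As a sanity check on that expectation, here is a minimal sketch that runs both
formulations on the same finite input and reports the largest difference
between the results. It assumes the component loop is meant to visit each of
the n complex elements once. The names scale_real, scale_cplx, max_abs_diff and
the size N are hypothetical, chosen only so that both versions can live in one
file. It relies on the C99 guarantee that double complex has the same layout as
an array of two doubles.

```c
#include <complex.h>
#include <string.h>

enum { N = 16 };  /* number of complex elements; arbitrary for this sketch */

/* Component form: n complex values stored as interleaved (re, im) doubles. */
static void scale_real(const double *a, double *b, double cr, double ci, int n)
{
    for (int i = 0; i < n; i++) {
        b[2*i]     = a[2*i] * cr - a[2*i + 1] * ci;
        b[2*i + 1] = a[2*i] * ci + a[2*i + 1] * cr;
    }
}

/* C99 complex form of the same scaling. */
static void scale_cplx(const double complex *a, double complex *b,
                       double complex c, int n)
{
    for (int i = 0; i < n; i++)
        b[i] = a[i] * c;
}

/* Largest absolute difference between the two results (finite inputs only). */
double max_abs_diff(void)
{
    double a[2 * N], b_real[2 * N], b_from_cplx[2 * N];
    double complex az[N], bz[N];
    const double cr = 1.25, ci = -0.75;

    /* Small values that are exactly representable, so rounding is not an issue. */
    for (int i = 0; i < 2 * N; i++)
        a[i] = 0.5 * i - 3.0;

    /* double complex is laid out as double[2], so a byte copy is sufficient. */
    memcpy(az, a, sizeof a);

    scale_real(a, b_real, cr, ci, N);
    scale_cplx(az, bz, cr + ci * I, N);
    memcpy(b_from_cplx, bz, sizeof bz);

    double worst = 0.0;
    for (int i = 0; i < 2 * N; i++) {
        double d = b_real[i] - b_from_cplx[i];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

This only covers finite inputs. By default GCC's complex multiply also checks
for a NaN + I*NaN result and attempts to recover from it, which the component
form does not do; the documented -fcx-limited-range option (implied by
-ffast-math) disables that check. Whether that check is what limits the
vectorisation here is not something this sketch establishes.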
