https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122887
Bug ID: 122887
Summary: C double vs double complex vectorisation differences
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: mjr19 at cam dot ac.uk
Target Milestone: ---
I was comparing the compilation of code using C99 double complex with code
explicitly manipulating the individual components.
void foo(double *a, double *b, double cr, double ci, int n){
  int i;
  for(i=0;i<2*n;i+=2){
    b[2*i]=a[2*i]*cr-a[2*i+1]*ci;
    b[2*i+1]=a[2*i]*ci+a[2*i+1]*cr;
  }
}
is the real version, and, when compiled with
gcc-15.1 -O3 -march=core-avx2
it uses the full length of the ymm registers, and performs a run-time check for
overlap of a and b (I think). Very good.
In contrast
#include <complex.h>
void foo(double complex *a, double complex *b, double complex c, int n){
  int i;
  for(i=0;i<n;i++)
    b[i]=a[i]*c;
}
does not vectorise beyond the xmm registers, even if "#pragma omp simd" is
added, and is significantly slower to run.
I must admit to having little experience of complex in C99, but am I wrong to
expect these two short examples to be effectively identical?
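
A minimal sketch of how the expected equivalence could be checked numerically. The names mul_complex, mul_pairs, max_abs_diff and MAX_N are hypothetical and not part of the report. The component-wise routine here steps over one interleaved (re, im) pair per iteration, which is an assumption of this sketch. It compares results for finite inputs only and shows nothing about how either loop gets vectorised.

```c
#include <assert.h>
#include <complex.h>

#define MAX_N 64  /* hypothetical upper bound for the self-check buffers */

/* Complex loop as in the report. The pragma only takes effect when the
   compiler is given -fopenmp-simd (or -fopenmp); otherwise it is ignored. */
void mul_complex(double complex *a, double complex *b, double complex c, int n)
{
    int i;
#pragma omp simd
    for (i = 0; i < n; i++)
        b[i] = a[i] * c;
}

/* Hypothetical component-wise reference over n interleaved (re, im) pairs. */
void mul_pairs(const double *a, double *b, double cr, double ci, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        b[2*i]     = a[2*i] * cr - a[2*i + 1] * ci;
        b[2*i + 1] = a[2*i] * ci + a[2*i + 1] * cr;
    }
}

/* Run both routines on the same finite input and return the largest
   absolute difference between corresponding components. */
double max_abs_diff(int n)
{
    double complex a[MAX_N], b[MAX_N];
    double bp[2 * MAX_N];
    const double *bc = (const double *)b;  /* view results as (re, im) pairs */
    double cr = 0.5, ci = -1.25, worst = 0.0;
    int i;

    assert(n >= 0 && n <= MAX_N);
    for (i = 0; i < n; i++)
        a[i] = (1.0 + i) + (0.25 * i - 2.0) * I;

    mul_complex(a, b, cr + ci * I, n);
    /* C99 gives double complex the same layout as double[2]: real, then imaginary. */
    mul_pairs((const double *)a, bp, cr, ci, n);

    for (i = 0; i < 2 * n; i++) {
        double d = bc[i] - bp[i];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Calling max_abs_diff(16) and comparing the result against a small tolerance (1e-9, say) is enough to see whether the two formulations agree on ordinary data. Inputs containing NaN or infinity are deliberately left out, since the C99 complex multiply may treat those differently from the plain component-wise arithmetic.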