https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98563
Bug ID: 98563
Summary: regression: vectorization fails while it worked on gcc 9 and earlier
Product: gcc
Version: 10.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: nathanael.schaeffer at gmail dot com
Target Milestone: ---
I have found what seems to be a regression.
The following code is not vectorized to 256-bit AVX (ymm registers) when
compiled with -fopenmp-simd, although it is fully vectorized without that option.
Here is the resulting code with different options, using gcc 10.1:
-O3 -fopenmp-simd => xmm
-O3 => ymm
-O3 -fopenmp-simd -fno-signed-zeros => ymm
gcc 9 and earlier always vectorize to the full width (ymm).
#include <complex>

typedef std::complex<double> cplx;

void test(cplx* __restrict__ a, const cplx* b, double c, int N)
{
    #pragma omp simd
    for (int i=0; i<8*N; i++) {
        a[i] = c*(a[i]-b[i]);
    }
}
See the result on godbolt: https://godbolt.org/z/9ThqKE
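To confirm that the three builds listed above compute the same values (only the vector width should differ), a small check along these lines could be used. This is a minimal sketch, not part of the original test case: check_test is a hypothetical helper, the input values are arbitrary small numbers chosen so that every result is exact in double precision, and the kernel is repeated only so that the block compiles on its own.

```cpp
#include <complex>
#include <vector>

typedef std::complex<double> cplx;

// Kernel from the report, repeated so this sketch is self-contained.
void test(cplx* __restrict__ a, const cplx* b, double c, int N)
{
    #pragma omp simd
    for (int i=0; i<8*N; i++) {
        a[i] = c*(a[i]-b[i]);
    }
}

// Hypothetical helper (assumption, not in the report): runs test() on
// 8*N elements and compares each result with a component-wise reference.
bool check_test(int N)
{
    const int n = 8*N;
    const double c = 2.0;
    std::vector<cplx> a(n), b(n), expected(n);
    for (int i = 0; i < n; i++) {
        a[i] = cplx(i, -i);
        b[i] = cplx(1.0, 0.5*i);
        // Reference: subtract, then scale real and imaginary parts separately.
        expected[i] = cplx(c*(a[i].real()-b[i].real()),
                           c*(a[i].imag()-b[i].imag()));
    }
    test(a.data(), b.data(), c, N);
    for (int i = 0; i < n; i++) {
        if (a[i] != expected[i]) return false;
    }
    return true;
}
```

Calling check_test(N) from a main() and building it once with each option set from the list above would show whether the outputs agree. It says nothing about which registers are used; that still has to be read from the assembly.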
Also, I noticed that no AVX-512 code is generated for this loop. Is this
intended? Is there an option to enable AVX-512 vectorization?