https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107451
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Interestingly the following variant of the testcase falls back to
VMAT_ELEMENTWISE but has the same problem there, fixed up by later
folding.  It will segfault when using -O2 -mavx2 -fno-vect-cost-model
-fdisable-tree-vrp2 -fdisable-tree-forwprop4, which then keeps the bogus

  _62 = *ivtmp_64;
  _61 = MEM[(const double *)ivtmp_64 + 8B];
  ivtmp_60 = ivtmp_64 + _65;
  _59 = *ivtmp_60;
  _58 = MEM[(const double *)ivtmp_60 + 8B];
  ivtmp_57 = ivtmp_60 + _65;
  vect_cst__56 = {_62, _61, _59, _58};
  vect__4.7_55 = VEC_PERM_EXPR <vect_cst__56, vect_cst__56, { 0, 1, 0, 1 }>;

That problem should be present even before the r11-6434 change.  In fact
this segfaults on the GCC 10 branch with just -O2 -ftree-loop-vectorize
-mavx2, generating the same load/permute as trunk for the reduction (so
there's some half-way "fix" on the later branches).  Also broken with
GCC 9.5.

static void __attribute__((noipa))
setdot(int n, const double *x, int inc_x, const double *y,
       double * __restrict dot)
{
  int i, ix = 0;

  for(i = 0; i < n; i++) {
    dot[i*4+0] = x[ix]   * y[ix] ;
    dot[i*4+1] = x[ix+1] * y[ix+1] ;
    dot[i*4+2] = x[ix]   * y[ix+1] ;
    dot[i*4+3] = x[ix+1] * y[ix] ;
    ix += inc_x ;
  }
}

int main(void)
{
  double x[2] = {0, 0}, y[2] = {0, 0};
  double dot[4];
  setdot(1, x, 4096*4096, y, dot);
  return 0;
}
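
To make it easier to see why the quoted GIMPLE faults, here is a small C sketch that models only the addresses involved; it does not reproduce the vectorizer and performs no out-of-bounds access itself. It assumes that _65 is the byte stride inc_x * sizeof(double), which the dump does not state explicitly. The helper names (perm_sel, loaded_offsets, lane_is_used, dead_oob_loads) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Lane selector taken from the quoted VEC_PERM_EXPR: { 0, 1, 0, 1 }.
   Indices 0..3 pick lanes of the first operand, vect_cst__56. */
static const int perm_sel[4] = { 0, 1, 0, 1 };

/* Element offsets from x read by the four scalar loads that feed
   vect_cst__56 = {_62, _61, _59, _58}.
   ASSUMPTION: _65 is the byte stride inc_x * sizeof(double), so the
   second pair of loads sits inc_x elements past the first pair. */
static void loaded_offsets(long long inc_x, long long off[4])
{
    off[0] = 0;          /* _62 = *ivtmp_64 */
    off[1] = 1;          /* _61 = MEM[ivtmp_64 + 8B] */
    off[2] = inc_x;      /* _59 = *ivtmp_60, ivtmp_60 = ivtmp_64 + _65 */
    off[3] = inc_x + 1;  /* _58 = MEM[ivtmp_60 + 8B] */
}

/* Returns 1 if any result lane of the permute reads source lane `lane`. */
static int lane_is_used(int lane)
{
    assert(lane >= 0 && lane < 4);
    for (size_t i = 0; i < 4; i++)
        if (perm_sel[i] == lane)
            return 1;
    return 0;
}

/* Counts loads that fall outside x[0..len-1] even though the permute
   discards the loaded value. */
static int dead_oob_loads(long long inc_x, long long len)
{
    long long off[4];
    int n = 0;

    loaded_offsets(inc_x, off);
    for (int lane = 0; lane < 4; lane++)
        if ((off[lane] < 0 || off[lane] >= len) && !lane_is_used(lane))
            n++;
    return n;
}
```

Under that assumption, dead_oob_loads(4096LL * 4096, 2) models the testcase: lanes 2 and 3 are loaded from roughly 128 MiB past the two-element x array, yet the { 0, 1, 0, 1 } selector never reads them, so the loads that trap contribute nothing to the result.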