https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120927
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase that segfaults at runtime with
-O3 -mavx512bw -mavx512vl --param vect-partial-vector-usage=1

#include <vector>

std::vector<double> quadrature_points;
double weights[5];
static double __attribute__((aligned(64))) wts[]{2., 2., 2., 2., 5.};

void __attribute__((noipa)) foo(unsigned n)
{
  for (unsigned i = 0; i < n; ++i)
    quadrature_points[i] = weights[i] = wts[i];
}

int main()
{
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  foo (5);
}

or alternatively the C testcase

static const double a[] = { 1., 2., 3., 4., 5. };

void __attribute__((noipa))
foo (double *b, double *bp, double c, int n)
{
  for (int i = 0; i < n; ++i)
    b[i] = bp[i] = a[i] * c;
}

int main()
{
  double b[5], bp[5];
  foo (b, bp, 3., 5);
}

The reason is we run into

static bool
vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
{
...
  else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
           /* ??? When peeling for gaps but not alignment, we could
              try to check whether the (variable) niters is known to be
              VF * N + 1.  That's something of a niche case though.  */
           || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
           || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&const_vf)
           || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
                < (unsigned) exact_log2 (const_vf))
               /* In case of versioning, check if the maximum number of
                  iterations is greater than th.  If they are identical,
                  the epilogue is unnecessary.  */
               && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
                   || ((unsigned HOST_WIDE_INT) max_niter
                       /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
                          but that's only computed later based on our result.
                          The following is the most conservative
                          approximation.  */
                       > (std::max ((unsigned HOST_WIDE_INT) th,
                                    const_vf) / const_vf) * const_vf))))
    return true;

  return false;

which decides that peeling or partial vectors are _not_ necessary as we
are versioning for aliasing and max_niter (== 5) > 8.  But we use
LOOP_VINFO_COST_MODEL_THRESHOLD which isn't even computed yet.  Also the
code uses > rather than ==, so it's wrong, at least for the partial vector
case.  OTOH we should never even consider an epilogue with a greater VF
than its main loop.
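
To make the arithmetic concrete, here is a minimal standalone sketch of only the last disjunct of that condition. It is a model for illustration, not GCC code: the function name epilogue_check_model and its plain-integer parameters are hypothetical stand-ins for the tree_ctz (), exact_log2 (), LOOP_REQUIRES_VERSIONING () and th values used in the real function.

```c
#include <stdbool.h>

/* Hypothetical model of the final disjunct quoted above.  All inputs
   are plain integers supplied by the caller; in GCC they come from
   the loop_vec_info.  */
static bool
epilogue_check_model (unsigned ctz_niters,       /* models tree_ctz (niters) */
                      unsigned log2_vf,          /* models exact_log2 (const_vf) */
                      bool requires_versioning,  /* models LOOP_REQUIRES_VERSIONING */
                      unsigned long long max_niter,
                      unsigned long long th,
                      unsigned long long const_vf)
{
  /* std::max (th, const_vf) rounded down to a multiple of const_vf.  */
  unsigned long long m = th > const_vf ? th : const_vf;
  unsigned long long bound = (m / const_vf) * const_vf;

  /* Same shape as the quoted condition: niters is not known to be a
     multiple of VF, and either there is no versioning or max_niter
     exceeds the bound.  */
  return ctz_niters < log2_vf
         && (!requires_versioning || max_niter > bound);
}
```

Assuming const_vf == 8, a variable niters (so a trailing-zero count of 0) and th no larger than 8, the call epilogue_check_model (0, 3, true, 5, 0, 8) evaluates 5 > 8 and returns false. That matches the "peeling or partial vectors not necessary" decision described above, even though 5 iterations cannot fill one 8-lane vector.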