https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120927

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase that segfaults at runtime with -O3 -mavx512bw -mavx512vl --param
vect-partial-vector-usage=1

#include <vector>

std::vector<double> quadrature_points;
double weights[5];
static double  __attribute__((aligned(64))) wts[]{2., 2., 2., 2., 5.};
void __attribute__((noipa)) foo(unsigned n)
{
  for (unsigned i = 0; i < n; ++i)
    quadrature_points[i] = weights[i] = wts[i];
}
int main()
{
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  quadrature_points.push_back (0.);
  foo (5);
}


or alternatively the C testcase

static const double a[] = { 1., 2., 3., 4., 5. };

void __attribute__((noipa))
foo (double *b, double *bp, double c, int n)
{
  for (int i = 0; i < n; ++i)
    b[i] = bp[i] = a[i] * c;
}

int main()
{
  double b[5], bp[5];
  foo (b, bp, 3., 5);
}


The reason is we run into

static bool
vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
{
...
  else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
      /* ??? When peeling for gaps but not alignment, we could
         try to check whether the (variable) niters is known to be
         VF * N + 1.  That's something of a niche case though.  */
      || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
      || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&const_vf)
      || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
           < (unsigned) exact_log2 (const_vf))
          /* In case of versioning, check if the maximum number of
             iterations is greater than th.  If they are identical,
             the epilogue is unnecessary.  */
          && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
              || ((unsigned HOST_WIDE_INT) max_niter
                  /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
                     but that's only computed later based on our result.
                     The following is the most conservative approximation.  */
                  > (std::max ((unsigned HOST_WIDE_INT) th,
                               const_vf) / const_vf) * const_vf))))
    return true;

  return false;

which decides that peeling or partial vectors are _not_ necessary: we are
versioning for aliasing, and max_niter (== 5) > 8 is false.

But we use LOOP_VINFO_COST_MODEL_THRESHOLD which isn't even computed yet.
Also the code uses > rather than ==, so it's wrong, at least for
the partial vector case.

OTOH we should never even consider an epilogue with a greater VF than its
main loop.
