https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120396

            Bug ID: 120396
           Summary: unprofitable SLP vectorization, leaves scalar parts
                    live
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

A variant of PR 109892.

static double muladd(double x, double y, double z)
{
    return x * y + z;
}
double g(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = muladd(x[0], x[0], r0);
        r1 = muladd(x[1], x[1], r1);
        x[0] = r0;
        x[1] = r1;
    }
    return r0 + r1;
}

The SLP-vectorized loop at -O2 -mfma (or plain -O2 on AArch64) does strictly
more work than a scalar loop.
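
To compare the generated code, the testcase can be wrapped in a small self-check like the sketch below. This is an illustration only, not part of the original report: the helper name check_small_case and the input values are hypothetical, chosen so that every intermediate result is an exactly representable integer and the outcome does not depend on whether the multiply-add is fused.

```c
#include <assert.h>

/* Testcase from the report, unchanged. */
static double muladd(double x, double y, double z)
{
    return x * y + z;
}

double g(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = muladd(x[0], x[0], r0);
        r1 = muladd(x[1], x[1], r1);
        x[0] = r0;
        x[1] = r1;
    }
    return r0 + r1;
}

/* Hypothetical helper (not in the report): runs g on two pairs of
   small integers and checks the running sums of squares it stores. */
int check_small_case(void)
{
    double x[4] = { 1.0, 2.0, 3.0, 4.0 };
    double r = g(x, 2);

    /* First pair:  r0 = 1*1 + 0 = 1,  r1 = 2*2 + 0 = 4.
       Second pair: r0 = 3*3 + 1 = 10, r1 = 4*4 + 4 = 20. */
    assert(x[0] == 1.0 && x[1] == 4.0);
    assert(x[2] == 10.0 && x[3] == 20.0);

    return r == 30.0;
}
```

Building this once with -O2 -mfma and once with -O2 -mfma -fno-tree-slp-vectorize, then comparing the assembly for g (for example from -S output), should show the difference between the SLP-vectorized loop and the scalar one.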
