https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98837

            Bug ID: 98837
           Summary: SLP discovery does not consider all lane permutes
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

While we SLP vectorize

int a[1024], b[1024], c[1024];

void foo ()
{
  for (int i = 0; i < 1024; i += 4)
    {
      c[i] = a[i] + b[i];
      c[i+1] = a[i+1] + b[i+1];
      c[i+2] = a[i+2] * b[i+2];
      c[i+3] = a[i+3] * b[i+3];
    }
}

by splitting the SLP group into two, the very similar

int a[1024], b[1024], c[1024];

void foo ()
{
  for (int i = 0; i < 1024; i += 4)
    {
      c[i] = a[i] + b[i];
      c[i+1] = a[i+1] * b[i+1];
      c[i+2] = a[i+2] + b[i+2];
      c[i+3] = a[i+3] * b[i+3];
    }
}

is not SLPed because we do not consider splitting the group into
non-adjacent sets.  The same applies to basic-block SLP when
you make the data type double (so we don't need unrolling);
of course, we simply fall back to a scalar build then.
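
To illustrate what "splitting the group into non-adjacent sets" would
mean for the second testcase, here is a hand-written sketch.  It is not
what the vectorizer emits; the function names (foo_interleaved,
foo_split) and the pointer parameters are hypothetical and only there
so the two variants can be compared.  Lanes {0, 2} form the "+" set and
lanes {1, 3} form the "*" set, and the results are then permuted back
into their original positions.

```c
#define N 1024

/* Scalar reference: the same per-lane operations as the second
   testcase, with the arrays passed in (an assumption for testing). */
void foo_interleaved (const int *a, const int *b, int *c)
{
  for (int i = 0; i < N; i += 4)
    {
      c[i] = a[i] + b[i];
      c[i+1] = a[i+1] * b[i+1];
      c[i+2] = a[i+2] + b[i+2];
      c[i+3] = a[i+3] * b[i+3];
    }
}

/* Hand-split variant: gather the non-adjacent lanes {0, 2} and {1, 3}
   into two uniform two-lane groups, then scatter the results back. */
void foo_split (const int *a, const int *b, int *c)
{
  for (int i = 0; i < N; i += 4)
    {
      /* "+" group: lanes 0 and 2.  */
      int add_lanes[2] = { a[i] + b[i], a[i+2] + b[i+2] };
      /* "*" group: lanes 1 and 3.  */
      int mul_lanes[2] = { a[i+1] * b[i+1], a[i+3] * b[i+3] };

      /* Lane permute back to the original store order.  */
      c[i] = add_lanes[0];
      c[i+1] = mul_lanes[0];
      c[i+2] = add_lanes[1];
      c[i+3] = mul_lanes[1];
    }
}
```

Each two-lane group now applies a single operation, which is the
property the first testcase gets for free from its adjacent split.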
