http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56764



             Bug #: 56764
           Summary: vect_prune_runtime_alias_test_list not smart enough
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org





__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3, float *z1, float *z2,
     float *z3)
{
  int i;
  for (i = 0; i < 32; i++)
    {
      z1[i] = -y1 * x[0][i];
      z2[i] = -y2 * x[1][i];
      z3[i] = -y3 * x[2][i];
    }
}

float x[6][32] __attribute__((aligned (32)));

int
main ()
{
  int i;
  for (i = 0; i < 32; i++)
    {
      x[0][i] = i;
      x[1][i] = 7 * i;
      x[2][i] = -5.5 * i;
    }
  for (i = 0; i < 100000000; i++)
    foo (&x[0], 12.5, 0.5, -1.5, &x[3][0], &x[4][0], &x[5][0]);
  return 0;
}



This isn't vectorized on x86_64-linux with -O3 -mavx, because there are too
many versioning checks for alias.  We vectorize it only with
--param vect-max-version-for-alias-checks=12.  But I don't see why we'd need
to emit that many checks for versioning.  Instead of the 12 alias checks we
emit now (9 for each zN against each of &x[0][0], &x[1][0] and &x[2][0], plus
3 for overlap among z1, z2 and z3), we could emit just 6: keep the 3 overlap
checks between z1, z2 and z3, and for each zN merge the zN vs. &x[0][0],
zN vs. &x[1][0] and zN vs. &x[2][0] tests into one test comparing the zN[0]
through zN[31] range with the &x[0][0] through &x[2][31] range.

Similarly, if we wanted to do a runtime check for alignment (not the case on
x86_64 apparently), we could test only the alignment of &x[0][0], because it
is provably the same alignment as &x[1][0] and &x[2][0].
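
To make the proposed check count concrete, here is a rough C sketch of what
the merged runtime guard could look like if written by hand.  This is only an
illustration, not what the vectorizer emits: ranges_overlap, foo_no_alias and
misalignment are hypothetical names, and the sketch assumes byte ranges are
compared as integers via uintptr_t.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): true if the byte ranges
   [a, a + a_len) and [b, b + b_len) overlap.  */
static bool
ranges_overlap (const void *a, size_t a_len, const void *b, size_t b_len)
{
  uintptr_t a_lo = (uintptr_t) a, b_lo = (uintptr_t) b;
  return a_lo < b_lo + b_len && b_lo < a_lo + a_len;
}

/* Hypothetical guard for a vectorized version of foo's loop:
   6 range checks instead of 12.  */
static bool
foo_no_alias (float x[3][32], const float *z1, const float *z2,
              const float *z3)
{
  const size_t z_len = 32 * sizeof (float);      /* zN[0] through zN[31] */
  const size_t x_len = 3 * 32 * sizeof (float);  /* x[0][0] through x[2][31] */

  /* 3 checks: the stores must not overlap each other.  */
  return !ranges_overlap (z1, z_len, z2, z_len)
         && !ranges_overlap (z1, z_len, z3, z_len)
         && !ranges_overlap (z2, z_len, z3, z_len)
         /* 3 merged checks: each store range against all three rows of x.  */
         && !ranges_overlap (z1, z_len, x, x_len)
         && !ranges_overlap (z2, z_len, x, x_len)
         && !ranges_overlap (z3, z_len, x, x_len);
}

/* Hypothetical helper: byte offset of p past the previous ALIGN boundary.
   Rows of x are 32 * sizeof (float) bytes apart, so when that stride is a
   multiple of ALIGN, testing &x[0][0] alone covers every row.  */
static size_t
misalignment (const void *p, size_t align)
{
  return (size_t) ((uintptr_t) p % align);
}
```

With the call from main above, foo_no_alias (&x[0], &x[3][0], &x[4][0],
&x[5][0]) would pass all 6 checks, since rows 3 to 5 are disjoint from rows
0 to 2 and from each other.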
