[Bug tree-optimization/51179] poor vectorization on interlagos.

jakub at gcc dot gnu.org Tue, 22 Nov 2011 09:13:49 -0800

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179


Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-11-22 17:13:26 UTC ---
Your testcase doesn't resemble the original; the inner for loops need to
reset their iteration variables.
double C[10][4], B[10][10], A[10][4];

void
test (void)
{
  int i = 0, j = 0, l = 0;

  for (l = 0; l < 10; l++)
    for (j = 0; j < 10; j += 2)
      for (i = 0; i < 4; i++)
        {
          C[j + 0][i] = C[j + 0][i] + A[l][i] * B[j + 0][l];
          C[j + 1][i] = C[j + 1][i] + A[l][i] * B[j + 1][l];
        }
}

This is IMHO just a matter of whether Graphite can -floop-interchange this or not.
If you manually swap the l and j for lines, the generated code looks better,
though for some reason we then unroll even the l loop, which increases register
pressure too much.
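
For reference, the manual swap of the l and j for lines might look like the
sketch below. The name test_interchanged is made up for illustration and is not
from the bug report. It assumes the swap simply makes j the outer loop and l the
middle one, with the loop body left untouched.

```c
double C[10][4], B[10][10], A[10][4];

/* Hypothetical hand-interchanged variant of test (): the j loop is now
   outermost and the l loop sits in the middle.  The loop body is unchanged.  */
void
test_interchanged (void)
{
  int i, j, l;

  for (j = 0; j < 10; j += 2)
    for (l = 0; l < 10; l++)
      for (i = 0; i < 4; i++)
        {
          C[j + 0][i] = C[j + 0][i] + A[l][i] * B[j + 0][l];
          C[j + 1][i] = C[j + 1][i] + A[l][i] * B[j + 1][l];
        }
}
```

With this order, the two rows C[j + 0] and C[j + 1] stay the same across the
whole l loop, so in principle they can be kept in registers instead of being
reloaded and stored on every l iteration. Each C[j][i] still accumulates its
terms in ascending l order, so the results should match the original loop nest.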
