[Bug tree-optimization/86174] Poor vectorization/register allocation with omp simd, FMA

rguenth at gcc dot gnu.org Mon, 18 Jun 2018 00:57:01 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86174


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |alias, missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-06-18
                 CC|                            |rguenth at gcc dot gnu.org
          Component|c                           |tree-optimization
             Blocks|                            |53947
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  There are two things in the way.  First, we transform the

        #pragma omp simd
        for (int kk=0; kk<Sk; kk++) {
          c[(i+ii)*p+k+kk] = C[ii][kk];
        }

loop to memcpy (as a hack, we could simply avoid that for force_vectorize
loops).
Second, if we avoid that, for example with -fno-tree-loop-distribute-patterns,
we still fail to elide the stores to C[].  That happens because unrolling
doesn't preserve restrict info, and when vectorization makes C addressable it
doesn't update the restrict info to reflect that C doesn't alias with anything.
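
For illustration, a minimal sketch of the kind of kernel being discussed.
This is not the reporter's testcase: the tile sizes SI and SK, the function
name tile_kernel, and the row-major layout of a, b and c are assumptions;
only the store-back loop is taken from the snippet quoted above.  Building it
with something like "gcc -O3 -march=native -fopenmp-simd", with and without
-fno-tree-loop-distribute-patterns, and comparing the assembly should show
whether the store-back becomes a memcpy call and whether the stores to C[]
survive.

```c
#include <assert.h>

/* Hypothetical tile sizes; the report does not state the real ones. */
enum { SI = 2, SK = 8 };

/* Computes one SI x SK tile of c = a * b, where a is m x n, b is n x p and
   c is m x p, all row-major.  (i, k) is the top-left corner of the tile. */
void tile_kernel(const double *restrict a, const double *restrict b,
                 double *restrict c, int n, int p, int i, int k)
{
    assert(n >= 0 && k + SK <= p);

    /* Local accumulator tile.  Ideally this never touches memory and
       lives entirely in vector registers. */
    double C[SI][SK] = {{0}};

    for (int j = 0; j < n; j++)
        for (int ii = 0; ii < SI; ii++) {
            #pragma omp simd
            for (int kk = 0; kk < SK; kk++)
                C[ii][kk] += a[(i + ii) * n + j] * b[j * p + k + kk];
        }

    /* Store-back loop from the comment.  Loop distribution's pattern
       matching can replace the inner loop with memcpy. */
    for (int ii = 0; ii < SI; ii++) {
        #pragma omp simd
        for (int kk = 0; kk < SK; kk++) {
            c[(i + ii) * p + k + kk] = C[ii][kk];
        }
    }
}
```

Without -fopenmp-simd the pragmas are ignored and the code still computes the
same result, so the sketch can be used to compare both configurations.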

We also do not have a late enough scalarization pass that would elide
the array; we'd have to rely on LIM/DSE (loop invariant motion and dead
store elimination) here.
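
As a rough picture of what eliding the array would amount to, here is a
hand-scalarized sketch in which each row of the tile lives in a named vector
variable instead of in C[].  The 2 x 4 tile shape, the names, and the use of
the GCC vector extension are assumptions made for illustration; this is not
what any GCC pass emits verbatim.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension: four doubles in one 32-byte vector. */
typedef double v4d __attribute__((vector_size(32)));

/* Same tile computation as a 2 x 4 kernel, but with the accumulator array
   replaced by two vector variables that can stay in registers. */
void tile_kernel_scalarized(const double *restrict a,
                            const double *restrict b,
                            double *restrict c, int n, int p, int i, int k)
{
    assert(n >= 0 && k + 4 <= p);

    v4d acc0 = {0, 0, 0, 0};
    v4d acc1 = {0, 0, 0, 0};

    for (int j = 0; j < n; j++) {
        v4d bv;
        memcpy(&bv, &b[j * p + k], sizeof bv);  /* unaligned vector load */
        acc0 += a[(i + 0) * n + j] * bv;        /* scalar is broadcast */
        acc1 += a[(i + 1) * n + j] * bv;
    }

    /* One store per row; no intermediate array is written or re-read. */
    memcpy(&c[(i + 0) * p + k], &acc0, sizeof acc0);
    memcpy(&c[(i + 1) * p + k], &acc1, sizeof acc1);
}
```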


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/86174] Poor vectorization/register allocation with omp simd, FMA
