[Bug tree-optimization/71992] Missed BB SLP vectorization in GCC

rguenth at gcc dot gnu.org Mon, 25 Jul 2016 05:09:29 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71992


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-07-25
            Version|tree-ssa                    |7.0
             Blocks|                            |53947
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  I think doing it as

 [a, b, b, b] * [a, b, 3., 3.] + [3., c, a, a]

would be "optimal" (not factoring in vector construction cost of course).
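
For illustration only, the expression above can be read lane by lane. The
sketch below is not the testcase from the bug report: the function names
lanes_scalar and lanes_vector and the v4df typedef are hypothetical, and it
covers only the four lanes of the expression. It assumes GCC's vector_size
extension is available.

```c
/* Hypothetical illustration; names are not taken from the bug report.  */
typedef double v4df __attribute__ ((vector_size (32)));  /* GCC vector extension */

/* Scalar form: one statement per lane of the expression above.  */
void
lanes_scalar (double a, double b, double c, double out[4])
{
  out[0] = a * a + 3.;
  out[1] = b * b + c;
  out[2] = b * 3. + a;
  out[3] = b * 3. + a;
}

/* Vector form: three vectors built from scalars, one multiply, one add.  */
void
lanes_vector (double a, double b, double c, double out[4])
{
  v4df x = { a, b, b, b };
  v4df y = { a, b, 3., 3. };
  v4df z = { 3., c, a, a };
  v4df r = x * y + z;

  /* Copy the lanes out element by element.  */
  for (int i = 0; i < 4; i++)
    out[i] = r[i];
}
```

Each of the three vectors mixes variables and constants, so each has to be
built from scalars, which is the construction cost set aside above.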

The issue is how SLP construction works and the number of swaps / builds
from scalars it does.

One issue is that we even try with a group size of 5.  Fixing that alone
doesn't fix it, though, as we do not consider building a vector from scalars
until we have tried to swap the parent op (and if that fails we don't go
back to building children from scalars).  Only trying with a group size of 4
would also regress the case where we'd have split after the first element.

That said, the whole SLP discovery needs a different algorithmic approach
to fix cases like this.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
