https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119209

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2025-03-11
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that recognition of the lane-combining patterns is restricted to
loop reductions because lane order isn't preserved (or even well-defined).  The
decision to treat the SLP instance as a BB reduction is only made after that.

The fix is probably to apply the reduction restriction only during SLP
build and vectorizable_* checking.

Nailing down which lanes are combined for V16QI->V4SI for the optab would
also allow using dot_prod in non-reduction cases (when the V4SI intermediate
result isn't reduced to a single lane in the end).  There's a related PR about
this, but IIRC it is for the SAD patterns.
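
To illustrate why the lane mapping matters, here is a minimal scalar C sketch.
It is not GCC code and does not describe what any target actually implements:
the function names and both lane mappings are hypothetical, chosen only to show
that two V16QI->V4SI combinations can differ per lane yet agree once the V4SI
result is reduced to a single value.

```c
#include <stdint.h>

/* Hypothetical mapping A: output lane i accumulates the products of the
   four adjacent input lanes 4*i .. 4*i+3.  */
static void
dot_prod_adjacent (const int8_t a[16], const int8_t b[16], int32_t acc[4])
{
  for (int i = 0; i < 16; i++)
    acc[i / 4] += (int32_t) a[i] * b[i];
}

/* Hypothetical mapping B: output lane i accumulates the products of the
   input lanes i, i+4, i+8 and i+12.  */
static void
dot_prod_strided (const int8_t a[16], const int8_t b[16], int32_t acc[4])
{
  for (int i = 0; i < 16; i++)
    acc[i % 4] += (int32_t) a[i] * b[i];
}

/* Final reduction of the V4SI accumulator to a single lane.  */
static int32_t
reduce_v4si (const int32_t v[4])
{
  return v[0] + v[1] + v[2] + v[3];
}
```

With a = {1, ..., 16} and b all ones, mapping A yields {10, 26, 42, 58} and
mapping B yields {28, 32, 36, 40}; both reduce to 136.  So a reduction can
tolerate an unspecified mapping, while a non-reduction use of the V4SI result
cannot.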
