https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115282

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Priority|P3                          |P1
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Target|powerpc64-linux-gnu         |powerpc64*-linux-gnu
             Status|NEW                         |ASSIGNED

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, this is probably a case where we need to split because CSE causes us to
associate operations differently, so the SLP build for the whole thing fails.

The three-vector permute issue will go away when I manage to finish the load
part of the full SLP enablement.

It also fails on LE.  It's the

node 0x39913f0 (max_nunits=4, refcnt=2) vector(4) unsigned int
op template: _14 = in[_13];
    stmt 0 _14 = in[_13];
    load permutation { 6 }

note.  We split the 8-element group into one 6-element part and two
single-element parts.  This needs an intermediate (interleaving) permute,
and indeed the load part will fix it.
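
As a rough illustration of what the single-lane node with load
permutation { 6 } asks for, here is a scalar model in C.  It is only a
sketch under stated assumptions: the names load_lane, GROUP_SIZE and VF are
hypothetical, the testcase's actual loop is not shown in this comment, and
this is not the vectorizer's implementation.  It assumes each scalar
iteration reads one 8-element group from in[] and that the four lanes of
the vector(4) result come from four consecutive iterations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants taken from the dump above: a load group of 8
   elements and a 4-lane vector (vector(4) unsigned int).  */
enum { GROUP_SIZE = 8, VF = 4 };

/* Scalar model of a single-lane load node with load permutation { lane }:
   output lane I takes element LANE of the group belonging to scalar
   iteration FIRST_ITER + I.  IN must hold at least
   (FIRST_ITER + VF) * GROUP_SIZE elements.  */
static void
load_lane (const unsigned *in, size_t first_iter, unsigned lane,
           unsigned out[VF])
{
  assert (lane < GROUP_SIZE);
  for (size_t i = 0; i < VF; i++)
    out[i] = in[(first_iter + i) * GROUP_SIZE + lane];
}
```

With lane 6 the four results sit 8 elements apart in memory, so one vector
result draws from several contiguous vector loads.  That is why extracting
it needs an intermediate interleaving permute rather than a single
two-input permute.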

I suggest leaving this failing until then.  The loop is still vectorized,
but with non-SLP full interleaving.
