https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98113
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Target Milestone|--- |11.0
Target| |x86_64-*-* s390x-*-*
Keywords| |missed-optimization
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The most straightforward approach would be to treat
r_14 = BIT_INSERT_EXPR <r_15(D), _18, 0 (32 bits)>;
r_33 = BIT_INSERT_EXPR <r_14, _27, 32 (32 bits)>;
r_32 = BIT_INSERT_EXPR <r_33, _36, 64 (32 bits)>;
r_31 = BIT_INSERT_EXPR <r_32, _3, 96 (32 bits)>;
itself as an SLP source, much like we look for CTORs as SLP sources. Note the
transformed load is an extra complication, but at least I added support for
SLP of existing vectors.
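
For illustration only, here is a rough C-level picture of why such a chain
looks like a CTOR. It assumes the GCC/Clang vector extension; the names
v4u32, insert_lanes, build_ctor and lanes_equal are made up for this sketch
and are not taken from the testcase in this PR. Whether a given GCC version
turns the lane stores into exactly the BIT_INSERT_EXPR sequence above depends
on the release and the flags used.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type for this sketch: four 32-bit lanes in one 128-bit
   vector, using the GCC/Clang vector_size extension.  */
typedef uint32_t v4u32 __attribute__ ((vector_size (16)));

/* Lane-by-lane insertion into an existing vector.  Each store replaces
   one 32-bit field, at bit offsets 0, 32, 64 and 96, which mirrors the
   shape of the BIT_INSERT_EXPR chain quoted above.  */
v4u32
insert_lanes (v4u32 r, uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
  r[0] = a;   /* bits  0..31  */
  r[1] = b;   /* bits 32..63  */
  r[2] = c;   /* bits 64..95  */
  r[3] = d;   /* bits 96..127 */
  return r;
}

/* The same value written as a vector constructor (a CTOR).  */
v4u32
build_ctor (uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
  return (v4u32) { a, b, c, d };
}

/* Helper for comparing two vectors bytewise.  */
int
lanes_equal (v4u32 x, v4u32 y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}
```

Since all four lanes are overwritten, the incoming value of r does not
matter and insert_lanes (r, a, b, c, d) yields the same vector as
build_ctor (a, b, c, d), which is the sense in which the insert chain could
serve as an SLP source the way a CTOR does.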
Also regresses on x86_64.
I'll see whether I can cook up something.