https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70355
--- Comment #2 from Richard Henderson <rth at gcc dot gnu.org> ---
Created attachment 38113
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38113&action=edit
proposed patch

Testing the following, which works on the reduced test case.

As a missed-optimization, we really ought to be handling logic operations
on these wide types via normal SSE logic insns.  Perhaps not for V1TI, but
definitely for V2TI and V4TI.  There's no point in breaking them down into
4 and 8 DImode operations, respectively.
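
For illustration only, here is a minimal sketch of the kind of source that produces V2TI and V4TI logic operations. It is not the reduced test case from this PR. The type and function names are hypothetical, and it assumes GNU C vector extensions on a 64-bit target where `__int128` is available. The operands are passed by pointer to sidestep vector argument-passing ABI questions.

```c
/* Hypothetical example, not the test case attached to this PR.
   Assumes GNU C vector extensions and a target with __int128.  */

/* Two 128-bit elements: 256 bits total, i.e. V2TImode.  */
typedef unsigned __int128 v2ti __attribute__ ((vector_size (32)));

/* Four 128-bit elements: 512 bits total, i.e. V4TImode.  */
typedef unsigned __int128 v4ti __attribute__ ((vector_size (64)));

/* Bitwise AND over a V2TI value.  Done piecewise in DImode this is
   4 scalar operations (256 / 64).  */
void
and_v2ti (v2ti *dst, const v2ti *a, const v2ti *b)
{
  *dst = *a & *b;
}

/* Bitwise XOR over a V4TI value.  Done piecewise in DImode this is
   8 scalar operations (512 / 64).  */
void
xor_v4ti (v4ti *dst, const v4ti *a, const v4ti *b)
{
  *dst = *a ^ *b;
}
```

Since a bitwise operation does not depend on element boundaries, the same result can in principle be computed with full-width vector logic instructions instead of the scalar pieces. Comparing the output of `-O2 -S` with and without the patch should show which form the compiler picks.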