https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058
--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #5)

We can split directly to sse2_pshuflw_1, avoiding mmx_pshufw_1. These two actually generate the same instruction (PSHUFLW) when XMM registers are involved.
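
For illustration only, here is a rough C sketch of why the two patterns are interchangeable on XMM registers: a 4 x 16-bit word shuffle (the PSHUFW semantics) gives the same low 64 bits as PSHUFLW with the same immediate. It assumes an x86 target with SSE2 enabled. The helper names and the 0x1B immediate are made up for this example and are not taken from the GCC sources or the patch.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical immediate for this example: selectors 3,2,1,0,
   i.e. reverse the four words. */
#define SHUF_IMM 0x1B

/* Scalar model of a 4 x 16-bit word shuffle (PSHUFW semantics):
   dst[i] = src[(imm >> (2 * i)) & 3]. */
static void shuffle4x16_ref(const uint16_t src[4], uint16_t dst[4],
                            unsigned imm)
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[(imm >> (2 * i)) & 3];
}

/* The same shuffle done on the low 64 bits of an XMM register.
   _mm_shufflelo_epi16 maps to PSHUFLW; the high 64 bits are copied
   through unchanged and are ignored here. */
static void shuffle4x16_sse2(const uint16_t src[4], uint16_t dst[4])
{
    __m128i v = _mm_loadl_epi64((const __m128i *)src);
    v = _mm_shufflelo_epi16(v, SHUF_IMM);
    _mm_storel_epi64((__m128i *)dst, v);
}

/* Returns 1 if both versions agree for the given input. */
static int shuffles_agree(const uint16_t src[4])
{
    uint16_t a[4], b[4];
    shuffle4x16_ref(src, a, SHUF_IMM);
    shuffle4x16_sse2(src, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

This only models the instruction semantics; it says nothing about how the splitter in the machine description is actually written.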