https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89049
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
In the assembly I notice

        vinsertf128     $0x1, 16(%rdi), %ymm4, %ymm2
...
        vextractf128    $0x1, %ymm2, %xmm1

somehow we fail to elide the initial %ymm2 build with the upper half
extraction being the only use... possibly because it has a memory operand?
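
To make the missed simplification concrete, here is a minimal scalar sketch in portable C. It is an illustration only, not GCC code and not the testcase from this report. The types `ymm_model` and `xmm_model` and all function names are hypothetical, and the assumption that the lanes are doubles is mine (the comment does not say). It models only the case described above, where the upper-half extraction is the sole use of %ymm2: inserting 128 bits from memory into the upper half and then extracting that same half should give the same value as a plain 128-bit load from 16(%rdi).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical scalar models of a 256-bit and a 128-bit register,
   assuming double lanes (an assumption, not stated in the report). */
typedef struct { double lane[4]; } ymm_model;
typedef struct { double lane[2]; } xmm_model;

/* Models "vinsertf128 $0x1, mem, src, dst": copy src, then overwrite
   the upper 128 bits with the 128 bits read from memory. */
static ymm_model insert_high(ymm_model src, const double *mem)
{
    ymm_model dst = src;
    memcpy(&dst.lane[2], mem, 2 * sizeof(double));
    return dst;
}

/* Models "vextractf128 $0x1, src, dst": take the upper 128 bits. */
static xmm_model extract_high(ymm_model src)
{
    xmm_model dst;
    memcpy(dst.lane, &src.lane[2], sizeof dst.lane);
    return dst;
}

/* The sequence from the comment: build the full register, then use
   only its upper half. base + 2 doubles corresponds to 16(%rdi). */
static xmm_model build_then_extract(ymm_model low_src, const double *base)
{
    ymm_model full = insert_high(low_src, base + 2);
    return extract_high(full);
}

/* The form one would hope for after elision: a plain 128-bit load. */
static xmm_model direct_load(const double *base)
{
    xmm_model dst;
    memcpy(dst.lane, base + 2, sizeof dst.lane);
    return dst;
}

/* Checks that both forms agree, whatever the other source register holds. */
static void check_equivalence(void)
{
    const double data[4] = {1.0, 2.0, 3.0, 4.0};
    ymm_model other = {{9.0, 8.0, 7.0, 6.0}};   /* stands in for %ymm4 */
    xmm_model a = build_then_extract(other, data);
    xmm_model b = direct_load(data);
    assert(memcmp(&a, &b, sizeof a) == 0);
}
```

Calling `check_equivalence()` exercises both paths. The lower half supplied by the stand-in for %ymm4 never reaches the result, which is why the %ymm2 build looks dead when the extraction is its only use.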