http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572
--- Comment #2 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-13 08:16:58 UTC ---
(In reply to comment #1)
> Have you actually tried that?

Ah, no, sorry, I only have occasional access to such a machine to benchmark the
code. From a -Os perspective it is still shorter (but indeed that matters less
to me than -O3 performance).

> Mixing VEX encoded insns with legacy encoded
> SSE* insns is very costly, for good performance there needs to be a vzeroupper
> in between (but then you lose the upper bits). See e.g. 2.8 in the AVX
> Programming Reference.

Thanks, I'd missed that. The vblendpd solution should still apply (from the
initial 'v' it sounds safe), no?