https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #4 from Peter Bumbulis <peter.bumbulis at ianywhere dot com> ---
(In reply to Peter Bumbulis from comment #2)
> The referenced web page is incorrect. Look in the instruction set reference
> manual
> (https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf,
> search for VPBLENDMW) or the intrinsics guide
> (https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
>
> These instructions blend 16 bit quantities: you can fit 16 of these in a
> 256 bit register. For AVX512 it's a 32-bit constant.

My mistake: it looks like the generated code only uses the low 8 bytes.
Sorry for any wasted bandwidth.
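
For reference, the lane arithmetic from the quoted text can be modelled in plain C. This is only a scalar sketch of a masked 16-bit blend, assuming the usual convention that a set mask bit selects the second source; blend_words and word_lanes are hypothetical names, not GCC or Intel APIs, and this is not the code GCC generates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC or Intel API): scalar model of a masked
   16-bit blend. Lane i of dst is taken from b when mask bit i is set,
   otherwise from a. Assumes lanes <= 32 so every lane has a mask bit. */
static void blend_words(uint16_t *dst, const uint16_t *a, const uint16_t *b,
                        uint32_t mask, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++)
        dst[i] = ((mask >> i) & 1u) ? b[i] : a[i];
}

/* Number of 16-bit lanes, and hence mask bits, for a given register width. */
static size_t word_lanes(size_t register_bits)
{
    return register_bits / 16;
}
```

With this model, a 256-bit register holds 16 word lanes and a 512-bit register holds 32, which is why the mask is 32 bits wide in the 512-bit case.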