https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63678
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Peter Bumbulis from comment #2)
> The referenced web page is incorrect. Look in the instruction set reference
> manual
> (https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf,
> search for VPBLENDMW) or the intrinsics guide
> (https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
>
> These instructions blend 16 bit quantities: you can fit 16 of these in a
> 256 bit register.

For AVX512 it's a 32-bit constant. Your first reference is AVX512
documentation; _mm256_blend_epi16 is not _mm256_mask_blend_epi16.
_mm256_blend_epi16 is for the VPBLENDW instruction, and
https://software.intel.com/sites/landingpage/IntrinsicsGuide/ looks incorrect,
because it doesn't describe what the VPBLENDW instruction does. In particular,
VPBLENDW only has an 8-bit immediate, and both 128-bit lanes are blended the
same way given that mask:

IF (imm8[0] == 1) THEN DEST[15:0] <- SRC2[15:0] ELSE DEST[15:0] <- SRC1[15:0]
IF (imm8[1] == 1) THEN DEST[31:16] <- SRC2[31:16] ELSE DEST[31:16] <- SRC1[31:16]
IF (imm8[2] == 1) THEN DEST[47:32] <- SRC2[47:32] ELSE DEST[47:32] <- SRC1[47:32]
IF (imm8[3] == 1) THEN DEST[63:48] <- SRC2[63:48] ELSE DEST[63:48] <- SRC1[63:48]
IF (imm8[4] == 1) THEN DEST[79:64] <- SRC2[79:64] ELSE DEST[79:64] <- SRC1[79:64]
IF (imm8[5] == 1) THEN DEST[95:80] <- SRC2[95:80] ELSE DEST[95:80] <- SRC1[95:80]
IF (imm8[6] == 1) THEN DEST[111:96] <- SRC2[111:96] ELSE DEST[111:96] <- SRC1[111:96]
IF (imm8[7] == 1) THEN DEST[127:112] <- SRC2[127:112] ELSE DEST[127:112] <- SRC1[127:112]
IF (imm8[0] == 1) THEN DEST[143:128] <- SRC2[143:128] ELSE DEST[143:128] <- SRC1[143:128]
IF (imm8[1] == 1) THEN DEST[159:144] <- SRC2[159:144] ELSE DEST[159:144] <- SRC1[159:144]
IF (imm8[2] == 1) THEN DEST[175:160] <- SRC2[175:160] ELSE DEST[175:160] <- SRC1[175:160]
IF (imm8[3] == 1) THEN DEST[191:176] <- SRC2[191:176] ELSE DEST[191:176] <- SRC1[191:176]
IF (imm8[4] == 1) THEN DEST[207:192] <- SRC2[207:192] ELSE DEST[207:192] <- SRC1[207:192]
IF (imm8[5] == 1) THEN DEST[223:208] <- SRC2[223:208] ELSE DEST[223:208] <- SRC1[223:208]
IF (imm8[6] == 1) THEN DEST[239:224] <- SRC2[239:224] ELSE DEST[239:224] <- SRC1[239:224]
IF (imm8[7] == 1) THEN DEST[255:240] <- SRC2[255:240] ELSE DEST[255:240] <- SRC1[255:240]
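
For illustration only, the pseudocode above can be condensed into a scalar C model. This is a sketch, not GCC or Intel code: the names vpblendw256_model and vpblendw256_self_check are hypothetical, and a and b stand in for SRC1 and SRC2 viewed as arrays of sixteen 16-bit elements. It uses no intrinsics, so it builds without -mavx2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the 256-bit VPBLENDW semantics quoted above.
   a and b play the roles of SRC1 and SRC2; dst plays the role of DEST. */
static void vpblendw256_model(uint16_t dst[16], const uint16_t a[16],
                              const uint16_t b[16], uint8_t imm8)
{
    for (int i = 0; i < 16; i++) {
        /* Element i is selected by bit (i % 8): the same 8-bit immediate
           is reused for the upper 128-bit lane. */
        int take_b = (imm8 >> (i % 8)) & 1;
        dst[i] = take_b ? b[i] : a[i];
    }
}

/* Self-check: with imm8 = 0x01, element 0 AND element 8 come from b,
   because bit 0 governs the first element of each 128-bit lane. */
static void vpblendw256_self_check(void)
{
    uint16_t a[16], b[16], d[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (uint16_t)i;          /* 0..15    */
        b[i] = (uint16_t)(100 + i);  /* 100..115 */
    }
    vpblendw256_model(d, a, b, 0x01);
    assert(d[0] == 100 && d[8] == 108);  /* taken from b in both lanes */
    assert(d[1] == 1 && d[9] == 9);      /* everything else from a     */
}
```

In this model there is no immediate that takes element 0 from b while leaving element 8 from a, which is the difference from a blend controlled by one mask bit per element.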