https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104978
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> --- The master branch has been updated by Hongyu Wang <hong...@gcc.gnu.org>: https://gcc.gnu.org/g:7bce0be03b857eefe5990c3ef0af06ea8f8ae04e commit r12-7747-g7bce0be03b857eefe5990c3ef0af06ea8f8ae04e Author: Hongyu Wang <hongyu.w...@intel.com> Date: Sat Mar 19 01:16:29 2022 +0800 AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978] For complex scalar intrinsic like _mm_mask_fcmadd_sch, the mask should be and by 1 to ensure the mask is bind to lowest byte. Use masked vmovss to perform same operation which omits higher bits of mask. gcc/ChangeLog: PR target/104978 * config/i386/sse.md (avx512fp16_fmaddcsh_v8hf_mask1<round_expand_name): Use avx512f_movsf_mask instead of vmovaps or vblend, and force_reg before lowpart_subreg. (avx512fp16_fcmaddcsh_v8hf_mask1<round_expand_name): Likewise. gcc/testsuite/ChangeLog: PR target/104978 * gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c: Adjust asm scan. * gcc.target/i386/avx512fp16-vfmaddcsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfcmaddcsh-1c.c: Removed. * gcc.target/i386/avx512fp16-vfmaddcsh-1c.c: Ditto. * gcc.target/i386/pr104978.c: New test.