[Bug target/104978] [avx512fp16] wrong code for _mm_mask_fcmadd_round_sch

cvs-commit at gcc dot gnu.org via Gcc-bugs Mon, 21 Mar 2022 20:49:08 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104978


--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hongyu Wang <hong...@gcc.gnu.org>:

https://gcc.gnu.org/g:7bce0be03b857eefe5990c3ef0af06ea8f8ae04e

commit r12-7747-g7bce0be03b857eefe5990c3ef0af06ea8f8ae04e
Author: Hongyu Wang <hongyu.w...@intel.com>
Date:   Sat Mar 19 01:16:29 2022 +0800

    AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

    For complex scalar intrinsics such as _mm_mask_fcmadd_sch, the
    mask should be ANDed with 1 so that only its lowest bit takes
    effect.  Use a masked vmovss to perform the same operation, since
    it ignores the higher bits of the mask.

    gcc/ChangeLog:

            PR target/104978
            * config/i386/sse.md
            (avx512fp16_fmaddcsh_v8hf_mask1<round_expand_name>):
            Use avx512f_movsf_mask instead of vmovaps or vblend, and
            force_reg before lowpart_subreg.
            (avx512fp16_fcmaddcsh_v8hf_mask1<round_expand_name>): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/104978
            * gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c: Adjust asm scan.
            * gcc.target/i386/avx512fp16-vfmaddcsh-1a.c: Ditto.
            * gcc.target/i386/avx512fp16-vfcmaddcsh-1c.c: Removed.
            * gcc.target/i386/avx512fp16-vfmaddcsh-1c.c: Ditto.
            * gcc.target/i386/pr104978.c: New test.
