https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Andrew Pinski from comment #7)
> With just -mavx512f we produce a bunch of instructions (looking like we went
> to scalar mode) while LLVM is able to produce:
> foo(short __vector(16)):                           # @foo(short __vector(16))
>         .cfi_startproc
> # %bb.0:
>         vpmovzxwd       ymm1, xmm0              # ymm1 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
>         vextracti128    xmm0, ymm0, 1
>         vpmovzxwd       ymm0, xmm0              # ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
>         vinserti64x4    zmm0, zmm1, ymm0, 1
>         ret
> 
> 
zero_extend from ymm to zmm is supported under avx512bw. LLVM instead splits it
into two zero-extends from xmm to ymm and then packs the results back into a zmm.
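
To make the equivalence concrete, here is a minimal scalar sketch in plain C (not
GCC or LLVM code) that models both strategies on arrays: one 16-lane zero-extend
versus two 8-lane zero-extends whose results are concatenated. The function names
are hypothetical and chosen only for illustration; the comments map each step to
the instruction in the quoted listing that it is meant to model.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero-extend n 16-bit lanes to 32-bit lanes. */
static void zext_u16_to_u32(const uint16_t *src, uint32_t *dst, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i];            /* upper 16 bits of each lane become zero */
}

/* Models a single zero-extend of 16 lanes (256 bits in, 512 bits out). */
static void zext_whole(const uint16_t in[16], uint32_t out[16])
{
    zext_u16_to_u32(in, out, 16);
}

/* Models the split sequence from the quoted LLVM output. */
static void zext_split(const uint16_t in[16], uint32_t out[16])
{
    uint32_t lo[8], hi[8];

    zext_u16_to_u32(in, lo, 8);      /* vpmovzxwd ymm1, xmm0 (low 8 lanes) */
    zext_u16_to_u32(in + 8, hi, 8);  /* vextracti128 + vpmovzxwd ymm0, xmm0 */
    memcpy(out, lo, sizeof lo);      /* low 256 bits of the result */
    memcpy(out + 8, hi, sizeof hi);  /* vinserti64x4 zmm0, zmm1, ymm0, 1 */
}

/* Checks that both strategies agree on one sample input. */
static void self_check(void)
{
    uint16_t in[16];
    uint32_t a[16], b[16];

    for (int i = 0; i < 16; i++)
        in[i] = (uint16_t)(0x8000u + i);   /* high bit set: catches sign-extension */
    zext_whole(in, a);
    zext_split(in, b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Both functions produce identical output, so the two forms differ only in
instruction count and in which ISA extension the compiler needs.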
