https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107748

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
float
_mm_cvtsbh_ss (__bf16 __A)
{
  union{ float sf; __bf16 bf[2];} __tmp;
  __tmp.sf = 0.0f;
  __tmp.bf[1] = __A;
  return __tmp.sf;
}

It looks like GCC can optimize this to:

_mm_cvtsbh_ss(__bf16):
        movd    %xmm0, %eax
        sall    $16, %eax
        movd    %eax, %xmm0
        ret
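
For reference, the shift in the assembly above can also be written directly in C. This is only a minimal sketch, not code from the bug report: it assumes the bfloat16 value is passed as its raw 16-bit pattern in a uint16_t rather than as a __bf16 (so it builds without bf16 support), and the name bf16_bits_to_float is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of any GCC header): widen a raw bfloat16
   bit pattern to float by moving it into the upper 16 bits of a 32-bit
   word and leaving the lower 16 bits zero. */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;  /* same effect as sall $16 */
  float out;
  memcpy (&out, &wide, sizeof out);       /* bit copy, no value conversion */
  return out;
}
```

For example, bf16_bits_to_float (0x3F80) yields 1.0f, since 0x3F80 << 16 is 0x3F800000. Unlike the union version, the shift does not depend on which half of the float bf[1] overlaps, so it does not assume a little-endian layout.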
