https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lin1.hu at intel dot com

--- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> is a bit odd for the packing.  Possibly the target lacks a truncv4siv4hi
> operation (thus the explicit zero vector).  Possibly x86 lacks a
> pack-lowpart/pack-highpart insn.

We support truncv4siv4hi2 under AVX2; without AVX512, it generates vpshufb.

(define_expand "trunc<mode><pmov_dst_4_lower>2"
  [(set (match_operand:<pmov_dst_4> 0 "register_operand")
        (truncate:<pmov_dst_4>
          (match_operand:PMOV_SRC_MODE_4 1 "register_operand")))]
  "TARGET_AVX2"
{


bar(unsigned int __vector(4)):
        vpshufb xmm0, xmm0, XMMWORD PTR .LC0[rip]
        ret
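
For reference, a minimal sketch of the kind of source that would exercise this
expander. The testcase from the PR is not quoted in this comment, so the
typedefs and the function name below are assumptions, written with the GNU
vector extensions and __builtin_convertvector.

```c
#include <stdint.h>

/* Hypothetical types: four 32-bit lanes in, four 16-bit lanes out. */
typedef uint32_t v4su __attribute__((vector_size(16)));
typedef uint16_t v4hu __attribute__((vector_size(8)));

/* Lane-wise truncation V4SI -> V4HI; each result lane keeps the low
   16 bits of the corresponding 32-bit input lane. */
v4hu trunc_v4si_v4hi(v4su x)
{
    return __builtin_convertvector(x, v4hu);
}
```

Whether this folds to a single vpshufb depends on the compiler version and the
-m flags in use.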

Without AVX2, it's lowered to

  _12 = VEC_PACK_TRUNC_EXPR <_9, { 0, 0, 0, 0 }>;
  _13 = BIT_FIELD_REF <_12, 64, 0>;

VEC_PACK_TRUNC_EXPR is expanded to packusdw with the upper 16 bits of each
element cleared.
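
A scalar sketch of why the clearing matters: packusdw converts signed 32-bit
lanes to unsigned 16-bit lanes with saturation, so it only acts as a plain
truncation once the upper half of each lane is zero. The helper names are
made up for illustration; this models one lane and is not GCC's expander code.

```c
#include <stdint.h>

/* One lane of packusdw: signed 32-bit -> unsigned 16-bit, saturating. */
static uint16_t packus_lane(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 0xFFFF)
        return 0xFFFF;
    return (uint16_t)v;
}

/* Clear the upper 16 bits first, so saturation can never trigger and
   the pack behaves like a truncation. */
static uint16_t trunc_via_packus(uint32_t v)
{
    return packus_lane((int32_t)(v & 0xFFFFu));
}
```

Without the mask, an input such as 0x12345 would saturate to 0xFFFF instead of
truncating to 0x2345.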

The optab can be extended to TARGET_SSSE3, which supports pshufb.
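
To illustrate why pshufb alone is enough, here is a scalar model of the
instruction on one 128-bit register, with a control mask that gathers the low
two bytes of each 32-bit lane. The contents of .LC0 are not shown above, so the
mask here is an assumption (in particular, zeroing the upper eight bytes), and
both names are hypothetical.

```c
#include <stdint.h>

/* Scalar model of pshufb: each control byte either selects a source
   byte by its low 4 bits, or zeroes the result byte when bit 7 is set. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t ctrl[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0F];
    /* Copy out afterwards so dst may alias src. */
    for (int i = 0; i < 16; i++)
        dst[i] = tmp[i];
}

/* Assumed control mask for V4SI -> V4HI: take bytes 0-1 of every
   4-byte lane, zero the upper half of the register. */
static const uint8_t trunc_v4si_v4hi_ctrl[16] = {
    0, 1, 4, 5, 8, 9, 12, 13,
    0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
};
```

The byte indices are in memory order, so on x86 (little-endian) bytes 0-1 of a
lane are its low 16 bits.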
