https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lin1.hu at intel dot com

--- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> is a bit odd for the packing.  Possibly the target lacks a truncv4siv4hi
> operation (thus the explicit zero vector).  Possibly x86 lacks a
> pack-lowpart/pack-highpart insn.

We support truncv4siv4hi2 under AVX2; w/o AVX512, it generates vpshufb.

(define_expand "trunc<mode><pmov_dst_4_lower>2"
  [(set (match_operand:<pmov_dst_4> 0 "register_operand")
        (truncate:<pmov_dst_4>
          (match_operand:PMOV_SRC_MODE_4 1 "register_operand")))]
  "TARGET_AVX2"
{

bar(unsigned int __vector(4)):
        vpshufb xmm0, xmm0, XMMWORD PTR .LC0[rip]
        ret

w/o AVX2, it's lowered to

  _12 = VEC_PACK_TRUNC_EXPR <_9, { 0, 0, 0, 0 }>;
  _13 = BIT_FIELD_REF <_12, 64, 0>;

vec_pack_trunc_expr uses packusdw with the upper 16 bits of each element
cleared.

The optab can be extended to TARGET_SSSE3, which supports pshufb.
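
For readers following along, here is a rough scalar model of the two paths
described above. It is only a sketch and not GCC code: the function names
(trunc_v4si_v4hi, packus_lane, pack_trunc_low64, lowering_matches) are
hypothetical, and packus_lane models a single packusdw lane on the assumption
that the instruction saturates a signed 32-bit input to an unsigned 16-bit
result. It is meant to show why clearing the upper 16 bits first makes the
pack-with-zero sequence equivalent to a plain truncation.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of a v4si -> v4hi truncate: keep the low 16 bits
   of every 32-bit lane. */
void trunc_v4si_v4hi(const uint32_t in[4], uint16_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t)in[i];
}

/* Scalar model of one packusdw lane (assumption: signed 32-bit input,
   unsigned saturation to 16 bits). */
uint16_t packus_lane(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 0xFFFF)
        return 0xFFFF;
    return (uint16_t)v;
}

/* Model of the lowering without AVX2:
     _12 = VEC_PACK_TRUNC_EXPR <_9, { 0, 0, 0, 0 }>;
     _13 = BIT_FIELD_REF <_12, 64, 0>;
   The upper 16 bits of each lane are cleared before packing, so the
   saturation in packus_lane never triggers. */
void pack_trunc_low64(const uint32_t in[4], uint16_t out[4])
{
    const uint32_t zero[4] = { 0, 0, 0, 0 };
    uint16_t packed[8];

    for (int i = 0; i < 4; i++) {
        packed[i]     = packus_lane((int32_t)(in[i] & 0xFFFFu));
        packed[i + 4] = packus_lane((int32_t)(zero[i] & 0xFFFFu));
    }
    /* BIT_FIELD_REF <_12, 64, 0>: take the low 64 bits (4 x 16-bit). */
    for (int i = 0; i < 4; i++)
        out[i] = packed[i];
}

/* Returns 1 if both paths agree on the given input, 0 otherwise. */
int lowering_matches(const uint32_t in[4])
{
    uint16_t a[4], b[4];

    trunc_v4si_v4hi(in, a);
    pack_trunc_low64(in, b);
    for (int i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Small self-check on one sample input, including lanes that would
   saturate if the upper bits were not cleared first. */
void self_check(void)
{
    const uint32_t in[4] = { 0x00010002u, 0xFFFF1234u,
                             0x0000FFFFu, 0x80000000u };
    assert(lowering_matches(in));
}
```

Without the mask, a lane such as 0xFFFF1234 would be seen as negative by the
modelled pack and clamp to 0, which is why the cleared upper half matters. A
pshufb-based expansion avoids both the mask and the zero operand by selecting
the low two bytes of each lane directly.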