https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201
--- Comment #21 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #20)
> (In reply to Joel Yliluoma from comment #19)
> > If the function return type is changed to "unsigned short", the AVX code
> > with "vpextrb" will do a spurious "movzx eax, al" at the end — but if the
> > return type is "unsigned int", it will not. The code with "(v)movd" should
> > of course do it, if the vector element size is shorter than the return type.
>
> With movd there is a non-redundant movzbl %al, %eax after the movd in both
> unsigned short and unsigned int cases. For {,v}pextrb there is a pattern
> that makes the zero extension explicit in the IL:
> (insn 28 27 29 2 (set (subreg:SI (reg:QI 87 [ stmp_r_10.10 ]) 0)
> (zero_extend:SI (vec_select:QI (subreg:V16QI (reg:V2DI 121) 0)
> (parallel [
> (const_int 0 [0])
> ])))) 4165 {*vec_extractv16qi_zext}
> (expr_list:REG_DEAD (reg:V2DI 121)
> (nil)))
> and for unsigned int return type the combiner is able to combine that with
> the following
> (insn 29 28 34 2 (set (reg:SI 122 [ stmp_r_10.10 ])
> (zero_extend:SI (reg:QI 87 [ stmp_r_10.10 ]))) "pr91201-4.c":9:11
> 119 {*zero_extendqisi2}
> (expr_list:REG_DEAD (reg:QI 87 [ stmp_r_10.10 ])
> (nil)))
> but it isn't able to merge that for a different extension in the unsigned
> short return type.
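For reference, a minimal C sketch of the kind of function being discussed. This is an assumption for illustration only, not the actual pr91201-4.c; the names bytes, bytesum_hi and bytesum_si are hypothetical. The accumulator is byte-wide, so the final vector element extraction yields a QImode value that has to be zero-extended to the return type: HImode in the first function, SImode in the second.

```c
/* Hypothetical stand-in for the testcase; not the actual pr91201-4.c.  */
unsigned char bytes[16];

/* Byte-wide sum returned as unsigned short: the QImode result needs a
   zero extension to HImode, the case where the extra movzx shows up.  */
unsigned short
bytesum_hi (void)
{
  unsigned char r = 0;
  for (int i = 0; i < 16; i++)
    r += bytes[i];   /* wraps modulo 256 */
  return r;
}

/* Same sum returned as unsigned int: here the zero extension to SImode
   can be merged with the extraction.  */
unsigned int
bytesum_si (void)
{
  unsigned char r = 0;
  for (int i = 0; i < 16; i++)
    r += bytes[i];   /* wraps modulo 256 */
  return r;
}
```

Comparing the assembly of the two functions with the vectorizer enabled and SSE4.1 or AVX available should show whether the trailing zero extension is still emitted for the unsigned short variant.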
I think what is missing is an insn similar to
(define_insn "*vec_extract<PEXTR_MODE12:mode>_zext"
but zero-extending to HImode. Something like:
(define_insn "*vec_extractv16qi_zext_hi"
  [(set (match_operand:HI 0 "register_operand" "=r,r")
        (zero_extend:HI
          (vec_select:QI
            (match_operand:V16QI 1 "register_operand" "x,v")
            (parallel
              [(match_operand:SI 2 "const_0_to_15_operand")]))))]
  "TARGET_SSE4_1"
  "@
   %vpextrb\t{%2, %1, %k0|%k0, %1, %2}
   vpextrb\t{%2, %1, %k0|%k0, %1, %2}"