https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #21 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #20)
> (In reply to Joel Yliluoma from comment #19)
> > If the function return type is changed to "unsigned short", the AVX code
> > with "vpextrb" will do a spurious "movzx eax, al" at the end — but if the
> > return type is "unsigned int", it will not. The code with "(v)movd" should
> > of course do it, if the vector element size is shorter than the return type.
> 
> With movd there is a non-redundant movzbl %al, %eax after the movd in both
> unsigned short and unsigned int cases.  For {,v}pextrb there is a pattern
> that makes the zero extension explicit in the IL:
> (insn 28 27 29 2 (set (subreg:SI (reg:QI 87 [ stmp_r_10.10 ]) 0)
>         (zero_extend:SI (vec_select:QI (subreg:V16QI (reg:V2DI 121) 0)
>                 (parallel [
>                         (const_int 0 [0])
>                     ])))) 4165 {*vec_extractv16qi_zext}
>      (expr_list:REG_DEAD (reg:V2DI 121)
>         (nil)))
> and for unsigned int return type the combiner is able to combine that with
> the following
> (insn 29 28 34 2 (set (reg:SI 122 [ stmp_r_10.10 ])
>         (zero_extend:SI (reg:QI 87 [ stmp_r_10.10 ]))) "pr91201-4.c":9:11 119 {*zero_extendqisi2}
>      (expr_list:REG_DEAD (reg:QI 87 [ stmp_r_10.10 ])
>         (nil)))
> but it isn't able to merge that for a different extension in the unsigned
> short return type.

I think an insn similar to

(define_insn "*vec_extract<PEXTR_MODE12:mode>_zext"

is missing for an HImode destination. Something like:

(define_insn "*vec_extractv16qi_zext_hi"
  [(set (match_operand:HI 0 "register_operand" "=r,r")
        (zero_extend:HI
          (vec_select:QI
            (match_operand:V16QI 1 "register_operand" "x,v")
            (parallel
              [(match_operand:SI 2 "const_0_to_15_operand")]))))]
  "TARGET_SSE4_1"
  "@
   %vpextrb\t{%2, %1, %k0|%k0, %1, %2}
   vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
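
For anyone who wants to look at the generated code, here is a minimal sketch of the two return-type variants discussed above. It is not the pr91201-4.c testcase; the type name v16qu and the function names are hypothetical, and it relies on the GCC/Clang vector extension. Whether these exact functions show the extra zero extension is an assumption to be checked by compiling with something like -O2 -msse4.1 -S and comparing the two bodies.

```c
#include <assert.h>

/* 16 unsigned bytes in one 128-bit vector (GCC/Clang vector extension).
   The type name is hypothetical, chosen for this sketch only. */
typedef unsigned char v16qu __attribute__((vector_size(16)));

/* Byte 0 widened to 16 bits: the return type for which the thread
   reports a leftover zero extension after pextrb. */
unsigned short extract_byte0_hi(v16qu v) { return v[0]; }

/* Byte 0 widened to 32 bits: the return type for which the combiner
   is said to merge the extension into the extract. */
unsigned int extract_byte0_si(v16qu v) { return v[0]; }

/* Sanity check: both widths must yield the same zero-extended value. */
void check_extracts(void)
{
    v16qu v = { 0xC8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
    assert(extract_byte0_hi(v) == 0xC8);
    assert(extract_byte0_si(v) == 0xC8u);
}
```

Since pextrb already clears the upper bits of the 32-bit destination, both functions should ideally compile to the same single extract instruction.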
