https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Joel Yliluoma from comment #19)
> If the function return type is changed to "unsigned short", the AVX code
> with "vpextrb" will do a spurious "movzx eax, al" at the end — but if the
> return type is "unsigned int", it will not. The code with "(v)movd" should
> of course do it, if the vector element size is shorter than the return type.

With movd there is a non-redundant movzbl %al, %eax after the movd in both
unsigned short and unsigned int cases.  For {,v}pextrb there is a pattern that
makes the zero extension explicit in the IL:
(insn 28 27 29 2 (set (subreg:SI (reg:QI 87 [ stmp_r_10.10 ]) 0)
        (zero_extend:SI (vec_select:QI (subreg:V16QI (reg:V2DI 121) 0)
                (parallel [
                        (const_int 0 [0])
                    ])))) 4165 {*vec_extractv16qi_zext}
     (expr_list:REG_DEAD (reg:V2DI 121)
        (nil)))
and for unsigned int return type the combiner is able to combine that with the
following
(insn 29 28 34 2 (set (reg:SI 122 [ stmp_r_10.10 ])
        (zero_extend:SI (reg:QI 87 [ stmp_r_10.10 ]))) "pr91201-4.c":9:11 119 {*zero_extendqisi2}
     (expr_list:REG_DEAD (reg:QI 87 [ stmp_r_10.10 ])
        (nil)))
but it isn't able to do the same when the following extension is a different
one, as it is for the unsigned short return type.
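
For illustration only, the two cases being compared presumably have roughly
the following shape.  This is a sketch and not the actual pr91201-4.c testcase:
the names sum_bytes, sum_as_uint and sum_as_ushort and the array length N are
made up, and whether the reduction really gets vectorized into a {,v}pextrb
depends on the target options and optimization level used.  The point is only
that the sum is computed in unsigned char, so the extracted QImode value has to
be zero-extended to whatever the return type is.

```c
#include <stddef.h>

#define N 16  /* hypothetical length, one 128-bit vector of bytes */

/* Byte-wide reduction: the accumulator is unsigned char, so the sum
   wraps modulo 256 and the result is a QImode value.  */
static unsigned char
sum_bytes (const unsigned char *p)
{
  unsigned char r = 0;
  for (size_t i = 0; i < N; i++)
    r += p[i];
  return r;
}

/* Return type unsigned int: the byte result is zero-extended to 32 bits.  */
unsigned int
sum_as_uint (const unsigned char *p)
{
  return sum_bytes (p);
}

/* Return type unsigned short: the byte result is zero-extended to 16 bits,
   which is a different extension from the one above.  */
unsigned short
sum_as_ushort (const unsigned char *p)
{
  return sum_bytes (p);
}
```

Both wrappers return the same value for the same input; they differ only in
which extension follows the byte extraction.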
