On Sat, 29 Sep 2018, H.J. Lu wrote:

Add pmovzx/pmovsx patterns with SI and DI operands for pmovzx/pmovsx
instructions which only read the low 4 or 8 bytes from the source.

Hello,

I am wondering about a few things (these are questions; I am not asking for changes):

Should we change the builtin to take a narrower argument, so that it is visible to the gimple optimizers that the high part is unused? If so, would that narrower type be v8qi (a type we don't really have) or di (which risks the compiler trying to use general regs)?
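To make the question concrete, here is a minimal scalar sketch in C of the two interfaces. It is only a model of the zero-extending byte-to-word case, not GCC code; the function names zext_from_v16qi and zext_from_di are hypothetical, and the di variant assumes x86 little-endian lane numbering (byte i sits in bits 8*i..8*i+7).

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a byte-to-word zero extension on a full 16-byte source:
   only bytes 0..7 are read, bytes 8..15 are ignored. */
void zext_from_v16qi(const uint8_t src[16], uint16_t out[8])
{
    assert(src && out);
    for (int i = 0; i < 8; i++)
        out[i] = src[i];
}

/* Hypothetical narrower interface: the source is a 64-bit integer (the
   "di" option).  Byte i is taken from bits 8*i..8*i+7, which is an
   assumption matching little-endian lane order. */
void zext_from_di(uint64_t src, uint16_t out[8])
{
    assert(out);
    for (int i = 0; i < 8; i++)
        out[i] = (uint16_t)((src >> (8 * i)) & 0xff);
}
```

With the 16-byte interface, nothing in the signature says the upper half is dead; with the 8-byte one, that fact is part of the type.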

+(define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>"
+  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
+       (any_extend:V8HI
+         (vec_select:V8QI
+           (subreg:V16QI
+             (vec_concat:V2DI
+               (match_operand:DI 1 "nonimmediate_operand" "Yrm,*xm,vm")
+               (const_int 0)) 0)
+           (parallel [(const_int 0) (const_int 1)
+                      (const_int 2) (const_int 3)
+                      (const_int 4) (const_int 5)
+                      (const_int 6) (const_int 7)]))))]

There is code in simplify-rtx.c that handles (vec_select (vec_concat x
y) z) when vec_select only picks from x. We could extend it to handle an
intermediate subreg/cast, which would yield something like:
(any_extend:V8HI (subreg:V8QI (match_operand:DI)))
or maybe even
(any_extend:V8HI (match_operand:V8QI))
Would this be likely to work? Is it desirable?
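
As a rough illustration of the check such an extension would need, here is a small standalone C sketch. It is not taken from simplify-rtx.c; the helper name select_reads_only_first and its parameters are hypothetical, and it assumes a subreg at byte offset 0 with little-endian lane order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code.  Returns 1 when every lane picked by
   a vec_select lies inside the first operand of the vec_concat that sits
   beneath a subreg at byte offset 0 (little-endian lane order assumed).
   first_bytes: size in bytes of the first vec_concat operand (x).
   lane_bytes:  element size in bytes of the mode the subreg produces. */
int select_reads_only_first(const int *sel, size_t nsel,
                            size_t first_bytes, size_t lane_bytes)
{
    assert(lane_bytes > 0);
    size_t first_lanes = first_bytes / lane_bytes; /* lanes covered by x */
    for (size_t i = 0; i < nsel; i++)
        if (sel[i] < 0 || (size_t)sel[i] >= first_lanes)
            return 0;
    return 1;
}
```

For the pattern above, x is a DI (8 bytes) viewed as QI lanes (1 byte each), so selecting lanes 0..7 passes the check. Rewriting to a plain subreg of x would additionally require the selection to be the identity 0..n-1 in order, which this sketch does not test.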

--
Marc Glisse
