On Fri, 30 Nov 2012, Uros Bizjak wrote:

For reference, we are talking about:

(define_insn "<sse>_vm<plusminus_insn><mode>3"
 [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]
 "TARGET_SSE"
 "@
  <plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %2}
  v<plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
 [(set_attr "isa" "noavx,avx")
  (set_attr "type" "sseadd")
  (set_attr "prefix" "orig,vex")
  (set_attr "mode" "<ssescalarmode>")])

No, looking at your description, the operand 2 should be scalar
operand (we use _s{s,d} scalar instruction here), and for doubles this
should refer to 64bit memory location. I don't remember all the
details about vec_merge scalar instructions, but it looks to me that
canonical representation should be more like your proposal:

+(define_insn "*sse2_vm<plusminus_insn>v2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
+    (vec_concat:V2DF
+      (plusminus:DF
+        (vec_select:DF
+          (match_operand:V2DF 1 "register_operand" "0,x")
+          (parallel [(const_int 0)]))
+        (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
+      (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
+  "TARGET_SSE2"

Thank you.

Of the possible patterns listed below, my choice (if nobody objects) is to use 4) for V2DF and 3) (rewritten without iterators) for V4SF. The question is then what to do about the builtins and intrinsics. _mm_add_sd takes two __m128d arguments. If I change the signature of __builtin_ia32_addsd, I can make _mm_add_sd pass __B[0] as the second argument, but I don't know whether I am allowed to change that signature. Otherwise I guess I'll need to keep a separate expander for it (I'd rather not). And then there are several operations other than +/- to handle.
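
To make the signature question concrete, here is a rough sketch in portable C. It is only a model of the lane semantics of addsd: the type v2df and every function name below are hypothetical stand-ins, not GCC's real builtins or the real emmintrin.h definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for __m128d: two double lanes, lane 0 is the low one. */
typedef struct { double lane[2]; } v2df;

/* Models the existing builtin shape: both arguments are full vectors,
   but only lane 0 of b is ever read. */
static v2df model_addsd_vec(v2df a, v2df b)
{
    v2df r = a;                          /* lane 1 is copied from a */
    r.lane[0] = a.lane[0] + b.lane[0];
    return r;
}

/* Models the changed shape (an assumption): the second argument is a scalar. */
static v2df model_addsd_scalar(v2df a, double b0)
{
    v2df r = a;                          /* lane 1 is copied from a */
    r.lane[0] = a.lane[0] + b0;
    return r;
}

/* Intrinsic-style wrapper: keeps the two-vector signature seen by users
   and passes only lane 0 of b down, like passing __B[0]. */
static v2df model_mm_add_sd(v2df a, v2df b)
{
    return model_addsd_scalar(a, b.lane[0]);
}

/* Sanity check on one sample input: both shapes give the same result. */
static void check_shapes_agree(void)
{
    v2df a = {{1.5, 2.5}};
    v2df b = {{10.0, 20.0}};
    v2df r1 = model_addsd_vec(a, b);
    v2df r2 = model_mm_add_sd(a, b);
    assert(r1.lane[0] == r2.lane[0] && r1.lane[1] == r2.lane[1]);
    assert(r1.lane[0] == 11.5 && r1.lane[1] == 2.5);
}
```

Calling check_shapes_agree() from a main function exercises the model. The user-visible wrapper is unchanged in this sketch; only the hypothetical inner function takes a scalar, which is the part whose signature I am unsure I may change.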


1) Current pattern:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]

2) Minimal fix:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (vec_duplicate:VF_128
              (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,xm")))
          (match_dup 1)
          (const_int 1)))]

3) With the operation in scalar mode:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (vec_duplicate:VF_128
            (plusminus:<ssescalarmode>
              (vec_select:<ssescalarmode>
                (match_operand:VF_128 1 "register_operand" "0,x")
                (parallel [(const_int 0)]))
              (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,xm")))
          (match_dup 1)
          (const_int 1)))]

4) Special version which only makes sense for vectors of 2 elements:

  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
        (vec_concat:V2DF
          (plusminus:DF
            (vec_select:DF
              (match_operand:V2DF 1 "register_operand" "0,x")
              (parallel [(const_int 0)]))
            (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
          (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
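
As a cross-check that patterns 1) to 3) describe the same result value for V4SF, here is a small model in portable C. The helper names (model_vec_merge, model_vec_duplicate, model_vec_plus, pattern1 to pattern3) are hypothetical and only mimic the RTL operations; the sketch covers plus only and says nothing about how the compiler passes treat each form.

```c
#include <assert.h>

#define NLANES 4   /* V4SF: four float lanes, lane 0 is the low one */

typedef struct { float lane[NLANES]; } v4sf;

/* vec_merge: if bit i of mask is set, lane i comes from x, otherwise from y. */
static v4sf model_vec_merge(v4sf x, v4sf y, unsigned mask)
{
    v4sf r;
    for (int i = 0; i < NLANES; i++)
        r.lane[i] = ((mask >> i) & 1u) ? x.lane[i] : y.lane[i];
    return r;
}

/* vec_duplicate: broadcast one scalar to every lane. */
static v4sf model_vec_duplicate(float s)
{
    v4sf r;
    for (int i = 0; i < NLANES; i++)
        r.lane[i] = s;
    return r;
}

/* Lane-wise addition in vector mode. */
static v4sf model_vec_plus(v4sf x, v4sf y)
{
    v4sf r;
    for (int i = 0; i < NLANES; i++)
        r.lane[i] = x.lane[i] + y.lane[i];
    return r;
}

/* Pattern 1: operand 2 is a whole vector, although only lane 0 survives. */
static v4sf pattern1(v4sf op1, v4sf op2)
{
    return model_vec_merge(model_vec_plus(op1, op2), op1, 1u);
}

/* Pattern 2: operand 2 is a scalar, duplicated before a vector-mode add. */
static v4sf pattern2(v4sf op1, float op2)
{
    return model_vec_merge(model_vec_plus(op1, model_vec_duplicate(op2)), op1, 1u);
}

/* Pattern 3: the add is done in scalar mode, then duplicated and merged. */
static v4sf pattern3(v4sf op1, float op2)
{
    return model_vec_merge(model_vec_duplicate(op1.lane[0] + op2), op1, 1u);
}

/* Sanity check on one sample input: all three forms agree lane by lane. */
static void check_patterns_agree(void)
{
    v4sf a = {{1.0f, 2.0f, 3.0f, 4.0f}};
    v4sf b = {{10.0f, 20.0f, 30.0f, 40.0f}};
    v4sf r1 = pattern1(a, b);
    v4sf r2 = pattern2(a, b.lane[0]);
    v4sf r3 = pattern3(a, b.lane[0]);
    for (int i = 0; i < NLANES; i++) {
        assert(r1.lane[i] == r2.lane[i]);
        assert(r1.lane[i] == r3.lane[i]);
    }
}
```

In this model the three forms differ only in what operand 2 is declared to be: pattern1 takes a full vector even though lanes 1 to 3 of it never reach the result, while pattern2 and pattern3 take a single scalar, which matches the 32-bit (or, for doubles, 64-bit) memory access.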

--
Marc Glisse
