http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400

--- Comment #1 from Marc Glisse <glisse at gcc dot gnu.org> 2012-09-01 09:40:14 UTC ---
The pattern below seems to optimize v[0]-v[1] and v[1]+v[0]. It does not
recognize v[0]+v[1], but I guess that would not be too hard to add. Compared to
the true hadd insn, I removed the (set_attr "type" "sseadd") because it crashed
the compiler (maybe in the cost computation). Apart from the things left in
here that may not make sense, I don't know whether a peephole would be more
appropriate. Maybe the insn helps more if I want to recognize dot products
(dppd) later on? At least, thanks to it, {v[0]-v[1],w[0]-w[1]} is now
recognized as a hsub (although that doesn't work if v==w, because vec_duplicate
doesn't match vec_concat).

(define_insn "*sse3_h<plusminus_insn>v2df3_low_MARC"
  [(set (match_operand:DF 0 "register_operand" "=x,x")
        (plusminus:DF
          (vec_select:DF
            (match_operand:V2DF 1 "register_operand" "0,x")
            (parallel [(const_int 0)]))
          (vec_select:DF
            (match_dup 1)
            (parallel [(const_int 1)]))))]
  "TARGET_SSE3"
  "@
   h<plusminus_mnemonic>pd\t{%0, %0|%0, %0}
   vh<plusminus_mnemonic>pd\t{%1, %1, %0|%0, %1, %1}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "V2DF")])
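
For reference, the source-level expressions discussed above could be written
roughly as follows with GCC's vector extension. This is only a sketch: the
typedef and function names are made up for illustration, and whether
hsubpd/haddpd is actually emitted depends on the compiler version, the patch
being applied, and flags (something like -O2 -msse3 is assumed).

```c
/* Hypothetical test case; names are illustrative, not from the bug report. */
typedef double v2df __attribute__((vector_size(16)));

/* v[0]-v[1]: the low-element horizontal subtract the pattern targets. */
double hsub_low(v2df v) { return v[0] - v[1]; }

/* v[1]+v[0]: the operand order reported as recognized. */
double hadd_low(v2df v) { return v[1] + v[0]; }

/* {v[0]-v[1], w[0]-w[1]}: the two-vector form reported as matching hsub. */
v2df hsub_pair(v2df v, v2df w)
{
    v2df r = { v[0] - v[1], w[0] - w[1] };
    return r;
}
```

Inspecting the assembly (for example with -S) is the way to check which
instructions are chosen; the functions compute the same values either way.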
