http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56766



             Bug #: 56766
           Summary: Fails to combine (vec_select (vec_concat ...)) to (vec_merge ...)
    Classification: Unclassified
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: rgue...@gcc.gnu.org
                CC: r...@gcc.gnu.org
            Target: x86_64-*-*





With a patch to vectorize the pattern that should lead to the use of



(define_insn "sse3_addsubv2df3"
  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
        (vec_merge:V2DF
          (plus:V2DF
            (match_operand:V2DF 1 "register_operand" "0,x")
            (match_operand:V2DF 2 "nonimmediate_operand" "xm,xm"))
          (minus:V2DF (match_dup 1) (match_dup 2))
          (const_int 2)))]
  "TARGET_SSE3"



this instruction fails to be generated because the GIMPLE



  vect_var_.9_15 = vect_var_.5_22 + vect_var_.8_18;
  vect_var_.10_14 = vect_var_.5_22 - vect_var_.8_18;
  _2 = VEC_PERM_EXPR <vect_var_.9_15, vect_var_.10_14, { 0, 3 }>;



is expanded to



(insn 24 23 25 (set (reg:V2DF 80 [ vect_var_.9 ])
        (plus:V2DF (reg:V2DF 76 [ vect_var_.5 ])
            (reg:V2DF 75 [ vect_var_.8 ]))) t.c:7 -1
     (nil))

(insn 25 24 27 (set (reg:V2DF 81 [ vect_var_.10 ])
        (minus:V2DF (reg:V2DF 76 [ vect_var_.5 ])
            (reg:V2DF 75 [ vect_var_.8 ]))) t.c:7 -1
     (nil))

(insn 27 25 28 (set (reg:V2DF 82 [ D.1768 ])
        (vec_select:V2DF (vec_concat:V4DF (reg:V2DF 80 [ vect_var_.9 ])
                (reg:V2DF 81 [ vect_var_.10 ]))
            (parallel [
                    (const_int 0 [0])
                    (const_int 3 [0x3])
                ]))) t.c:7 -1
     (nil))



which does not match the pattern in the i386 backend.



The question is which form should be canonical.  vec_merge is strictly
redundant: it can always be replaced with (vec_select (vec_concat ...)).
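
To make the redundancy concrete, here is a minimal sketch in plain C (not
GCC code) that models both RTL forms on arrays.  It assumes the documented
vec_merge convention that a set bit N in the mask takes element N from the
first operand and a clear bit takes it from the second.  The names
model_vec_merge, model_vec_select_concat, mask_to_selector and forms_agree
are hypothetical helpers invented for illustration.

```c
#define NELTS 2  /* V2DF: two doubles per vector */

/* Model of (vec_merge a b mask): a set bit N takes element N from a,
   a clear bit takes it from b (assumed convention, see above).  */
static void model_vec_merge(const double *a, const double *b,
                            unsigned mask, double *out)
{
    for (int i = 0; i < NELTS; i++)
        out[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Model of (vec_select (vec_concat a b) (parallel sel)): index into
   the 2*NELTS-element concatenation of a and b.  */
static void model_vec_select_concat(const double *a, const double *b,
                                    const int *sel, double *out)
{
    for (int i = 0; i < NELTS; i++)
        out[i] = sel[i] < NELTS ? a[sel[i]] : b[sel[i] - NELTS];
}

/* Rewrite a vec_merge mask as the equivalent vec_select selector.  */
static void mask_to_selector(unsigned mask, int *sel)
{
    for (int i = 0; i < NELTS; i++)
        sel[i] = ((mask >> i) & 1) ? i : i + NELTS;
}

/* Return 1 if the two models agree for every possible mask, else 0.  */
static int forms_agree(void)
{
    const double a[NELTS] = { 10.0, 11.0 };
    const double b[NELTS] = { 20.0, 21.0 };

    for (unsigned mask = 0; mask < (1u << NELTS); mask++) {
        int sel[NELTS];
        double m[NELTS], s[NELTS];

        mask_to_selector(mask, sel);
        model_vec_merge(a, b, mask, m);
        model_vec_select_concat(a, b, sel, s);
        for (int i = 0; i < NELTS; i++)
            if (m[i] != s[i])
                return 0;
    }
    return 1;
}
```

Under that convention mask_to_selector maps mask 2 to the selector { 2, 1 }
and mask 1 to { 0, 3 }.  Recognizing the vec_select form as a merge should
then come down to an index check of this kind: element i of the selector
must be either i or i + NELTS.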



Testcase without my vectorizer hack (compile with -O -msse3):



typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));

v2df foo (v2df x, v2df y)
{
  v2df tem1 = x + y;
  v2df tem2 = x - y;
  return __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });
}



VEC_MERGE is not used very often ...
