On Sat, 16 Mar 2013, Eric Botcazou wrote:

My current understanding of simplify-rtx is that we should only do "safe"
optimizations in it (make sure we only create expressions that every
target will recognize), and if I want more advanced optimizations, I
should do them elsewhere (not sure where). So I should probably at least
restrict this one to the case where the result and XEXP (trueop0, 0) have
the same mode.

Yes, simplify-rtx should only canonicalize or simplify expressions. Merging 2 VEC_SELECTs into a single one seems to fall into the second category, but if no target can take advantage of it, that's probably a bit questionable indeed.

Thank you for the comment.

I successfully regtested the attached trivial patch on x86_64-linux-gnu:

2013-03-16  Marc Glisse  <marc.gli...@inria.fr>

        * simplify-rtx.c (simplify_binary_operation_1) <VEC_CONCAT>:
        Restrict the transformation to equal modes.


The original use case (and the testcase) are still handled. I don't think any target supports size(output) > size(input) for vec_select, so we don't lose anything. For size(output) < size(input), I can see one case in the x86 backend that could have used it:

#include <x86intrin.h>
__v2sf f(__v4sf x){ return (__v2sf){x[0],x[1]}; }

which ideally would exactly match sse_storelps (extractps is a vec_select:SF V4SF), but sadly it writes to memory instead of using a vec_concat so it doesn't benefit from the simplify-rtx code. sse_storehps is similar. (tree optimizations should have merged the 2 BIT_FIELD_REFs earlier, but that's a different issue)


Even with the restriction we can still generate arbitrary permutations, which not all targets will recognize for every vector mode, but I guess there will be time to think about it if someone comes up with a testcase where this causes a problem.

Is the patch ok for trunk (once 4.8.0 is out and the new server up)?

--
Marc Glisse
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c  (revision 196701)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3619,21 +3619,22 @@ simplify_binary_operation_1 (enum rtx_co
                                                           i - in_n_elts);
                  }
              }
 
            return gen_rtx_CONST_VECTOR (mode, v);
          }
 
        /* Try to merge VEC_SELECTs from the same vector into a single one.  */
        if (GET_CODE (trueop0) == VEC_SELECT
            && GET_CODE (trueop1) == VEC_SELECT
-           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)))
+           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
+           && GET_MODE (XEXP (trueop0, 0)) == mode)
          {
            rtx par0 = XEXP (trueop0, 1);
            rtx par1 = XEXP (trueop1, 1);
            int len0 = XVECLEN (par0, 0);
            int len1 = XVECLEN (par1, 0);
            rtvec vec = rtvec_alloc (len0 + len1);
            for (int i = 0; i < len0; i++)
              RTVEC_ELT (vec, i) = XVECEXP (par0, 0, i);
            for (int i = 0; i < len1; i++)
              RTVEC_ELT (vec, len0 + i) = XVECEXP (par1, 0, i);

Reply via email to