On Sun, 30 Sep 2012, Marc Glisse wrote:

On Sat, 29 Sep 2012, Eric Botcazou wrote:

this patch lets the compiler try to rewrite:

(vec_concat (vec_select x [a]) (vec_select x [b]))

as:

vec_select x [a b]

or even just "x" if appropriate.
[...]
Why not generalizing to all kinds of VEC_SELECTs instead of just scalar ones?

Ok, I changed the patch a bit to handle arbitrary VEC_SELECTs, and moved the identity recognition to VEC_SELECT handling (where it belonged). Testing with non-scalar VEC_SELECTs was limited though, because they are not that easy to generate. Also, the identity case is the only one where it actually optimized. To handle more cases, I'd have to look through several layers of VEC_SELECTs, which gets a bit complicated (for instance, the permutation 0,1,3,2 will appear as a vec_concat of a vec_select(v,[0,1]) and a vec_select(vec_select(v,[2,3]),[1,0]), or worse with a vec_concat in the middle). It also didn't optimize 3,2,3,2, possibly because that meant substituting the same rtx twice (I didn't go that far in gdb). Then there is also the vec_duplicate case (I should try to replace vec_duplicate with vec_concat in simplify-rtx to see what happens...). Still, the identity case is nice to have.

I think I may have been a bit too hasty removing the restriction to the scalar case (and even that version was possibly wrong for some targets). For instance, if I have a vec_concat of 2 permutations of a vector, this will generate a vec_select with a result twice the size of its input, which will likely fail to be recognized. For now I have only managed to reach this situation in combine, where the unrecognizable expression is simply ignored. But I think simplify-rtx is used in less forgiving places (and even in combine it could cause optimizations to be missed).

My current understanding of simplify-rtx is that we should only do "safe" optimizations in it (make sure we only create expressions that every target will recognize), and if I want more advanced optimizations, I should do them elsewhere (not sure where). So I should probably at least restrict this one to the case where the result and XEXP (trueop0, 0) have the same mode.

Does that make sense? Or is the current code ok and I am worrying for nothing?

For reference, the conversation started here:
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00540.html
and I include a copy of the relevant part of the patch that was committed:

2012-09-09  Marc Glisse  <marc.gli...@inria.fr>

gcc/
        * simplify-rtx.c (simplify_binary_operation_1)
        <VEC_CONCAT>: Handle VEC_SELECTs from the same vector.

+       /* Try to merge VEC_SELECTs from the same vector into a single one.  */
+       if (GET_CODE (trueop0) == VEC_SELECT
+           && GET_CODE (trueop1) == VEC_SELECT
+           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)))
+         {
+           rtx par0 = XEXP (trueop0, 1);
+           rtx par1 = XEXP (trueop1, 1);
+           int len0 = XVECLEN (par0, 0);
+           int len1 = XVECLEN (par1, 0);
+           rtvec vec = rtvec_alloc (len0 + len1);
+           for (int i = 0; i < len0; i++)
+             RTVEC_ELT (vec, i) = XVECEXP (par0, 0, i);
+           for (int i = 0; i < len1; i++)
+             RTVEC_ELT (vec, len0 + i) = XVECEXP (par1, 0, i);
+           return simplify_gen_binary (VEC_SELECT, mode, XEXP (trueop0, 0),
+                                       gen_rtx_PARALLEL (VOIDmode, vec));
+         }


--
Marc Glisse

Reply via email to