On Sat, 16 Mar 2013, Eric Botcazou wrote:
My current understanding of simplify-rtx is that we should only do "safe"
optimizations in it (make sure we only create expressions that every
target will recognize), and if I want more advanced optimizations, I
should do them elsewhere (not sure where). So I should probably at least
restrict this one to the case where the result and XEXP (trueop0, 0) have
the same mode.
Yes, simplify-rtx should only canonicalize or simplify expressions.
Merging 2 VEC_SELECTs into a single one seems to fall into the second
category, but if no target can take advantage of it, that's probably a
bit questionable indeed.
Thank you for the comment.
I successfully regtested the attached trivial patch on x86_64-linux-gnu:
2013-03-16 Marc Glisse <marc.gli...@inria.fr>
* simplify-rtx.c (simplify_binary_operation_1) <VEC_CONCAT>:
Restrict the transformation to equal modes.
The original use case (and the testcase) are still handled. I don't think
any target supports size(output) > size(input) for vec_select, so we don't
lose anything. For size(output) < size(input), I can see one case in the
x86 backend that could have used it:
#include <x86intrin.h>
__v2sf f(__v4sf x){ return (__v2sf){x[0],x[1]}; }
which ideally would exactly match sse_storelps (extractps is a
vec_select:SF V4SF), but sadly it writes to memory instead of using a
vec_concat so it doesn't benefit from the simplify-rtx code. sse_storehps
is similar.
(tree optimizations should have merged the 2 BIT_FIELD_REFs earlier, but
that's a different issue)
Even with the restriction we can still generate arbitrary permutations,
which not all targets will recognize for every vector mode, but I guess
there will be time to think about it if someone comes up with a testcase
where this causes a problem.
Is the patch ok for trunk (once 4.8.0 is out and the new server up)?
--
Marc Glisse
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c (revision 196701)
+++ gcc/simplify-rtx.c (working copy)
@@ -3619,21 +3619,22 @@ simplify_binary_operation_1 (enum rtx_co
i - in_n_elts);
}
}
return gen_rtx_CONST_VECTOR (mode, v);
}
/* Try to merge VEC_SELECTs from the same vector into a single one. */
if (GET_CODE (trueop0) == VEC_SELECT
&& GET_CODE (trueop1) == VEC_SELECT
- && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)))
+ && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
+ && GET_MODE (XEXP (trueop0, 0)) == mode)
{
rtx par0 = XEXP (trueop0, 1);
rtx par1 = XEXP (trueop1, 1);
int len0 = XVECLEN (par0, 0);
int len1 = XVECLEN (par1, 0);
rtvec vec = rtvec_alloc (len0 + len1);
for (int i = 0; i < len0; i++)
RTVEC_ELT (vec, i) = XVECEXP (par0, 0, i);
for (int i = 0; i < len1; i++)
RTVEC_ELT (vec, len0 + i) = XVECEXP (par1, 0, i);