On Tue, Apr 14, 2015 at 11:08:55PM +0100, Kugan wrote: > Now that Stage1 is open, is this OK for trunk.
Hi Kugan, > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index cba3c1a..d6ad0af 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer > ATTRIBUTE_UNUSED, > > /* Fall through. */ > case REG: > + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) > + { > + /* The cost is 1 per vector-register copied. */ > + int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) > + / GET_MODE_SIZE (V4SImode); > + *cost = COSTS_N_INSNS (n_minus_1 + 1); > + } > /* const0_rtx is in general free, but we will use an > instruction to set a register to 0. */ > - if (REG_P (op1) || op1 == const0_rtx) > - { > + else if (REG_P (op1) || op1 == const0_rtx) > + { > /* The cost is 1 per register copied. */ > int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) > / UNITS_PER_WORD; I would not have expected control flow to reach this point, as we have: > /* TODO: The cost infrastructure currently does not handle > vector operations. Assume that all vector operations > are equally expensive. */ > if (VECTOR_MODE_P (mode)) > { > if (speed) > *cost += extra_cost->vect.alu; > return true; > } But, I see that this check is broken for a set RTX (which has no mode). So, your patch works, but only due to a bug in my original implementation. This leaves the code with quite a messy design. There are two ways I see that we could clean things up, both of which require some reworking of your patch. Either we remove my check above and teach the RTX costs how to properly cost vector operations, or we fix my check to catch all vector RTX and add the special cases for the small subset of things we understand up there. The correct approach in the long term is to fix the RTX costs to correctly understand vector operations, so I'd much prefer to see a patch along these lines, though I appreciate that is a substantially more invasive piece of work. Thanks, James