On 15/04/15 19:25, James Greenhalgh wrote: > On Tue, Apr 14, 2015 at 11:08:55PM +0100, Kugan wrote: >> Now that Stage1 is open, is this OK for trunk. > > Hi Kugan, > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index cba3c1a..d6ad0af 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer >> ATTRIBUTE_UNUSED, >> >> /* Fall through. */ >> case REG: >> + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) >> + { >> + /* The cost is 1 per vector-register copied. */ >> + int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) >> + / GET_MODE_SIZE (V4SImode); >> + *cost = COSTS_N_INSNS (n_minus_1 + 1); >> + } >> /* const0_rtx is in general free, but we will use an >> instruction to set a register to 0. */ >> - if (REG_P (op1) || op1 == const0_rtx) >> - { >> + else if (REG_P (op1) || op1 == const0_rtx) >> + { >> /* The cost is 1 per register copied. */ >> int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) >> / UNITS_PER_WORD; > > I would not have expected control flow to reach this point, as we have:
It does for mode == VODmode. RTL X is for example: (set (reg:V8DI 67 virtual-incoming-args) (reg:V8DI 68 virtual-stack-vars)) > >> /* TODO: The cost infrastructure currently does not handle >> vector operations. Assume that all vector operations >> are equally expensive. */ >> if (VECTOR_MODE_P (mode)) >> { >> if (speed) >> *cost += extra_cost->vect.alu; >> return true; >> } > > But, I see that this check is broken for a set RTX (which has no mode). > So, your patch works, but only due to a bug in my original implementation. > This leaves the code with quite a messy design. > > There are two ways I see that we could clean things up, both of which > require some reworking of your patch. > > Either we remove my check above and teach the RTX costs how to properly > cost vector operations, or we fix my check to catch all vector RTX > and add the special cases for the small subset of things we understand > up there. > > The correct approach in the long term is to fix the RTX costs to correctly > understand vector operations, so I'd much prefer to see a patch along > these lines, though I appreciate that is a substantially more invasive > piece of work. > I agree that rtx cost for vector is not handled right now. We might not be able to completely separate as Kyrill suggested. We still need the vector SET with VOIDmode to be handled inline. This patch is that part. We can work on the others as a separate function, if you prefer that. I am happy to look this as a separate patch. Thanks, Kugan