Re: fwprop and CSE const anchor opt

Adam Nemet Wed, 08 Apr 2009 00:05:53 -0700

Thank you very much.  This was very informative.

Richard Sandiford writes:
> If we have an instruction:
> 
>     A: (set (reg Z) (plus (reg X) (const_int 0xdeadbeef)))
> 
> we will need to use something like:
> 
>        (set (reg Y) (const_int 0xdead0000))
>        (set (reg Y) (ior (reg Y) (const_int 0xbeef)))
>     B: (set (reg Z) (plus (reg X) (reg Y)))
> 
> But if A is in a loop, the Y loads can be hoisted, and the cost
> of A is effectively the same as the cost of B.  In other words,
> the (il)legitimacy of the constant operand doesn't really matter.
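
As a side illustration of the quoted sequence, the two Y loads can be modelled as a high-part load followed by an inclusive OR of the low 16 bits.  This is only a minimal sketch: the helper names are hypothetical, and the 16/16 split is an assumption that matches the 0xdead0000 / 0xbeef example rather than any particular target.

```python
def split_constant(value):
    """Split a 32-bit constant into its high part and its low 16 bits.

    Assumption: a 16/16 split, as in the 0xdead0000 / 0xbeef example.
    """
    high = value & 0xFFFF0000
    low = value & 0x0000FFFF
    return high, low


def materialize(value):
    """Rebuild the constant the way the two Y loads in the example do."""
    high, low = split_constant(value)
    y = high       # (set (reg Y) (const_int high))
    y = y | low    # (set (reg Y) (ior (reg Y) (const_int low)))
    return y
```

Here `split_constant(0xdeadbeef)` gives the two immediates from the example, and `materialize` recombines them into the original constant.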


My guess is that, since A is not a recognizable insn, this is relevant at RTL
expansion.  Is this correct?

> In summary, the current costs generally work because:
> 
>   (a) We _usually_ only apply costs to arbitrary instructions
>       (rather than candidate instruction patterns) before
>       loop optimisation.

I don't think I understand this point.  I see that the cost is typically
queried before loop optimization, but I don't understand the distinction
between "arbitrary instructions" and "candidate instruction patterns".  Can
you please explain the difference?

>   (b) It doesn't matter what we return for invalid candidate
>       instruction patterns, because recog will reject them anyway.
> 
> So I suppose my next question is: are you seeing this problem with cse1
> or cse2?  The reasoning behind the zero cost might still be valid for
> REG_EQUAL notes in cse1.  However, it's probably not right for cse2,
> which runs after loop hoisting.

I am seeing it with both, so we could at least apply this at cse2.

> Perhaps we could add some kind of context parameter to rtx_costs
> to choose between the hoisting and non-hoisting cost.  As well as
> helping with your case, it could let us use the non-hoisting cost
> before loop optimisation in cases where the insn isn't going to
> go in a loop.  The drawback is that we then have to replicate
> even more of the .md file in rtx_costs.
> 
> Alternatively, perhaps we could just assume that rtx_costs always
> returns the hoisted cost when optimising for speed, in which case
> I think your alternative solution would be theoretically correct
> (i.e. not a hack ;)).

OK, I think I am going to propose this in the patch then.  It might still be
interesting to experiment with providing more context to rtx_costs.
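
To make the hoisted versus non-hoisted distinction concrete, here is a toy cost model.  It is not the rtx_costs interface: the function names, the `hoisted` flag, the insn counts and the unsigned 16-bit immediate range are all assumptions chosen for illustration.

```python
def constant_load_cost(value):
    """Insns needed to load a constant into a register (toy model).

    Assumption: constants in 0..0xFFFF fit as an immediate operand and
    need no separate load; anything else needs two insns (high part,
    then OR in the low part).
    """
    return 0 if 0 <= value <= 0xFFFF else 2


def plus_cost(constant, hoisted, base_cost=1):
    """Cost of (plus (reg X) (const_int constant)) in the toy model.

    With hoisted=True the constant load is assumed to have been moved
    out of the loop, so only the addition itself is charged.
    """
    if hoisted:
        return base_cost
    return base_cost + constant_load_cost(constant)
```

In this model `plus_cost(0xdeadbeef, hoisted=True)` charges only the addition, while `hoisted=False` also charges the two loads, which is the difference a context parameter would have to express.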

> E.g. suppose we're deciding how to implement an in-loop multiplication.
> We calculate the cost of a multiplication instruction vs. the cost of a
> shift/add sequence, but we don't consider whether any of the backend-specific
> shift/add set-up instructions could be hoisted.  This would lead to us
> using multiplication insns in cases where we don't want to.
> 
> (This was one of the most common situations in which the zero cost helped.)

I am not sure I understand this.  Why would we decide to hoist suboperations
of a multiplication?  If it is loop-variant then even the suboperations are
loop-variant whereas if it is loop-invariant then we can hoist the whole
operation.  What am I missing?
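
For reference, the shift/add alternative to a multiplication by a constant can be sketched as below.  This only shows the decomposition itself; it says nothing about which set-up instructions a given backend might need.  The function names are hypothetical and the multiplier is assumed to be a non-negative integer.

```python
def shift_add_terms(multiplier):
    """Return the shift amounts whose shifted copies of x sum to x * multiplier.

    Assumption: multiplier is a non-negative integer; one term per set bit.
    """
    return [bit for bit in range(multiplier.bit_length())
            if (multiplier >> bit) & 1]


def multiply_by_shift_add(x, multiplier):
    """Compute x * multiplier using only shifts and additions."""
    return sum(x << shift for shift in shift_add_terms(multiplier))
```

For example, a multiplier of 10 decomposes into shifts by 1 and 3, i.e. (x << 1) + (x << 3).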

Adam
