On Oct 10, 2014, at 8:32 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> 
>>> Though I guess if we run fusion + peep2 between sched1 and sched2, that 
>>> problem would just resolve itself as we'd have fused AB together into a new 
>>> insn and we'd schedule normally with the fused insns and X, Y.
>> 
>> Yes, in my version, I ran it really early, before sched.  I needed to run 
>> before ira and run before other people
> 
> Two reasons why I run it late in compilation process.
> 1) IRA is the pass I tend not to disturb since code is changed
> dramatically.  With it after IRA, I can get certain improvement from
> fusion, there is less noise here.

Since I have a front-end background, I think nothing of creating pseudos when I 
want to; I just know that if I do, I have to do it before allocation.  :-)  Since 
my peepholes create registers, they must run before allocation.

> 2) The spilling generates many load/store pair opportunities on ARM,
> which I don't want to miss.

I happen to have enough registers that spilling wasn’t my primary concern.

> add rx, ry, rz
> ldr   r1, [rx]
> ldr   r2, [rx+4]
> ldr   r3, [rx+8]
> 
> It will be transformed into:
> 
> add rx, ry, rz
> ldr   r1, [ry+rz]
> ldr   r2, [rx+4]
> ldr   r3, [rx+8]

Yeah, that seems to tickle a neuron.

>> On the other hand, if you have
>> 
>> left
>> left
>> right
>> right
>> 
>> There is no way to sort them to get:
>> 
>> left
>> right
>> left
>> right
>> 
>> and then fuse:
>> 
>> left_right
>> left_right
>> 
>> This would be impossible.
> 
> I can't understand this very well, this exactly is one case we want to
> fuse on ARM for movw/movt.  Given
> movw  r1, const_1

This differs from the above by having r1 and const_1.  In my example, there is 
no r1 and no const_1, and this matters.  I wanted to list a case where it is 
impossible to sort.  This happens when there isn’t enough data to sort on, for 
example, no offset and no register number.
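
To make the distinction concrete, here is a rough sketch in Python.  It is a toy 
model, not GCC code: the Insn type, sort_key and order_for_fusion are names I 
made up for illustration, and the register names and constants are hypothetical.  
It assumes the register is the only thing that ties a left half to its right 
half.  When every insn carries a register, sorting on it brings the halves 
together; when there is no register and no constant, there is no key, so the 
order is left alone.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    """Hypothetical model of an instruction; not a GCC data structure."""
    kind: str                  # "left" or "right" half of a fusible pair
    reg: Optional[str] = None  # register operand, if the insn has one
    imm: Optional[int] = None  # constant operand, if the insn has one


def sort_key(insn: Insn) -> Optional[Tuple[str, int]]:
    """Return a key that groups matching halves, or None if there is no data."""
    if insn.reg is None:
        return None
    # Same register sorts together; the left half goes before the right half.
    return (insn.reg, 0 if insn.kind == "left" else 1)


def order_for_fusion(insns: List[Insn]) -> List[Insn]:
    """Reorder insns so fusible halves are adjacent, when a key exists."""
    keys = [sort_key(i) for i in insns]
    if any(k is None for k in keys):
        # Nothing to sort on: leave the original order untouched.
        return list(insns)
    paired = sorted(zip(keys, insns), key=lambda p: p[0])
    return [insn for _, insn in paired]


# Assumed example with registers: the halves can be interleaved.
with_data = [
    Insn("left", "r1", 1),
    Insn("left", "r2", 2),
    Insn("right", "r1", 1),
    Insn("right", "r2", 2),
]

# My case: no register and no constant, so no key to sort on.
without_data = [Insn("left"), Insn("left"), Insn("right"), Insn("right")]

sorted_kinds = [i.kind for i in order_for_fusion(with_data)]
unsorted_kinds = [i.kind for i in order_for_fusion(without_data)]
```

The first list comes out interleaved as left, right, left, right, ready to fuse; 
the second list stays as left, left, right, right.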
