On Oct 10, 2014, at 8:32 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>
>>> Though I guess if we run fusion + peep2 between sched1 and sched2, that
>>> problem would just resolve itself as we'd have fused AB together into a new
>>> insn and we'd schedule normally with the fused insns and X, Y.
>>
>> Yes, in my version, I ran it really early, before sched. I needed to run
>> before ira and run before other people
>
> Two reasons why I run it late in compilation process.
> 1) IRA is the pass I tend not to disturb since code is changed
> dramatically. With it after IRA, I can get certain improvement from
> fusion, there is less noise here.

Since I have a front-end background, I think nothing of creating pseudos when I want to; I just know that if I do, I have to do it before allocation. :-) My peepholes create registers, so they must run before allocation.

> 2) The spilling generates many load/store pair opportunities on ARM,
> which I don't want to miss.

I happen to have enough registers that spilling wasn’t my primary concern.

> add rx, ry, rz
> ldr r1, [rx]
> ldr r2, [rx+4]
> ldr r3, [rx+8]
>
> It will be transformed into:
>
> add rx, ry, rz
> ldr r1, [ry+rz]
> ldr r2, [rx+4]
> ldr r3, [rx+8]

Yeah, that seems to tickle a neuron.

>> On the other hand, if you have
>>
>> left
>> left
>> right
>> right
>>
>> There is no way to sort them to get:
>>
>> left
>> right
>> left
>> right
>>
>> and then fuse:
>>
>> left_right
>> left_right
>>
>> This would be impossible.
>
> I can't understand this very well, this exactly is one case we want to
> fuse on ARM for movw/movt. Given
> movw r1, const_1

This differs from the above by having r1 and const_1; in my example, there is no r1 and no const_1, and that matters. I wanted to list a case where it is impossible to sort. This happens when there isn’t enough data to sort on: for example, no offset and no register number.
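
To make the sorting point concrete, here is a minimal Python sketch. It is not GCC code: `Insn`, `sort_for_fusion`, and `fuse_adjacent` are hypothetical names, and it assumes the reordering behaves like a stable sort on whatever per-instruction data is available (register, offset, constant). With a key, the halves interleave and both pairs fuse; with no key, every instruction compares equal, the order cannot change, and only the one pair that already happens to be adjacent fuses.

```python
from typing import NamedTuple, Tuple


class Insn(NamedTuple):
    op: str          # "left" or "right" half of a fusible pair
    key: Tuple = ()  # data to sort on (register, offset, ...); () means none


def sort_for_fusion(insns):
    # sorted() is stable: insns with equal keys keep their original order.
    return sorted(insns, key=lambda insn: insn.key)


def fuse_adjacent(insns):
    # Fuse a "left" immediately followed by a "right" that has the same key.
    fused, i = [], 0
    while i < len(insns):
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (nxt is not None and insns[i].op == "left"
                and nxt.op == "right" and insns[i].key == nxt.key):
            fused.append("left_right")
            i += 2
        else:
            fused.append(insns[i].op)
            i += 1
    return fused


# Like movw/movt with r1/const_1: there is data to sort on.
with_data = [Insn("left", ("r1",)), Insn("left", ("r2",)),
             Insn("right", ("r1",)), Insn("right", ("r2",))]
# Like my example: no register, no offset, nothing to sort on.
without_data = [Insn("left"), Insn("left"), Insn("right"), Insn("right")]

print(fuse_adjacent(sort_for_fusion(with_data)))     # ['left_right', 'left_right']
print(fuse_adjacent(sort_for_fusion(without_data)))  # ['left', 'left_right', 'right']
```

The second result shows why the keyless case never reaches two fused pairs: the sort has no basis for moving a right between the two lefts.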