On Thu, Dec 2, 2010 at 2:17 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
> On 12/01/2010 02:14 PM, Paul Koning wrote:
>>
>> On Nov 29, 2010, at 9:51 PM, Vladimir Makarov wrote:
>>
>>> On 11/29/2010 08:52 PM, Paul Koning wrote:
>>>>
>>>> I'm doing some experiments to get to know GCC better, and something
>>>> is puzzling me.
>>>>
>>>> I have defined an md file with a DFA and costs describing the fact
>>>> that loads take a while (as do stores). Also, there is no
>>>> memory-to-memory move, only memory to/from register.
>>>>
>>>> The test program is basically a=b; c=d; e=f; g=h;
>>>>
>>>> Sched1, as expected, turns this into four loads followed by four
>>>> stores, exploiting the pipeline.
>>>>
>>>> Then IRA kicks in. It shuffles the insns back into load/store,
>>>> load/store pairs, essentially the source-code order. It looks like
>>>> it is doing that to reduce the number of registers used. Fair
>>>> enough, but this makes the code less efficient. I don't see a way
>>>> to tell IRA not to do this.
>>>>
>>> Most probably that happens because of ira.c::update_equiv_regs. This
>>> function was inherited from the old register allocator. Its major
>>> goal is to find equivalent memory/constants/invariants for pseudos,
>>> which can then be used by the reload pass. Pseudo equivalence also
>>> affects live-range splitting decisions in IRA.
>>>
>>> update_equiv_regs can also move insns that initialize pseudo
>>> equivalences close to the pseudo's use. You could try to prevent
>>> this and see what happens. IMO, preventing such insn movement would
>>> do more harm than good to performance on SPEC benchmarks for
>>> x86/x86-64 processors.
>>>>
>>>> As it happens, there's a secondary reload involved: the loads are
>>>> into one set of registers but the stores are from another, so a
>>>> register-to-register move is added by reload. Does that explain the
>>>> behavior? I tried changing the cover_classes, but that doesn't make
>>>> a difference.
>>>>
>>> It is hard to say without the dump file. If everything is correctly
>>> defined, it should not happen.
>>>
>> I extended the test code a little and fed it to a mips64el-elf
>> targeted gcc. It showed the same pattern in one of the two functions
>> but not the other. The test code is test8.c (attached).
>>
>> What I see in the assembly output (test8.s, also attached) is that
>> foo() has a load-then-store, load-then-store pattern, which
>> contradicts what sched1 constructed and doesn't take advantage of the
>> pipeline. However, bar() does use the pipeline. I don't know what's
>> different between these two.
>>
>> Do you want some dump files (which ones)? Or you could just reproduce
>> this with the current gcc; it's a standard target build. The compile
>> was -O2 -mtune=mips64r2 -mabi=n32.
>>
> As I guessed, the problem is the update_equiv_regs transformation
> trying to move an initialization insn close to its single use to
> decrease register pressure. A lot of people have already complained
> about this function undoing scheduling.
>
> The following patch solves the problem when you use -fsched-pressure.
> I would not like to do that for regular (not register
> pressure-sensitive) insn scheduling, for obvious reasons.
>
> I think most RISC targets (including MIPS ones) should enable
> -fsched-pressure by default.
>
>
> 2010-12-02  Vladimir Makarov  <vmaka...@redhat.com>
>
>         * ira.c (update_equiv_regs): Prohibit move insns if
>         pressure-sensitive scheduling was done.
>
> Jeff, sorry for bothering you. Is it ok to commit the patch to the
> trunk?
>
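For reference, here is a minimal stand-in for the kind of test
described above. It is not the attached test8.c; the variable names
and the use of file-scope globals are assumptions made purely for
illustration.

/* Hypothetical reproducer (not the attached test8.c).  File-scope
   variables keep each assignment as a real memory load followed by a
   store, so sched1 can group the four loads ahead of the four stores
   on a target where loads have multi-cycle latency.  */
long a, b, c, d, e, f, g, h;

void
foo (void)
{
  a = b;
  c = d;
  e = f;
  g = h;
}

Compiling such a file with the flags from the thread plus the option
under discussion, e.g. "mips64el-elf-gcc -O2 -mtune=mips64r2 -mabi=n32
-fsched-pressure -S", should show whether the four loads stay grouped
ahead of the stores once IRA has run.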
This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46880

--
H.J.