Re: RFA: patch to prohibit IRA undoing sched1 [was IRA undoing sched1]

Jeff Law Fri, 03 Dec 2010 07:01:22 -0800

On 12/02/10 15:17, Vladimir Makarov wrote:

On 12/01/2010 02:14 PM, Paul Koning wrote:
On Nov 29, 2010, at 9:51 PM, Vladimir Makarov wrote:
On 11/29/2010 08:52 PM, Paul Koning wrote:
I'm doing some experiments to get to know GCC better, and something is puzzling me.
I have defined an md file with DFA and costs describing the fact that loads take a while (as do stores). Also, there is no memory to memory move, only memory to/from register.
Test program is basically a=b; c=d; e=f; g=h;
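For concreteness, a test of that shape might look like the sketch below. The globals and the function name copy_four are hypothetical; this is not the test8.c attachment mentioned later in the thread.

```c
/* Hypothetical stand-in for the test described above: four independent
   memory-to-memory copies.  On a target with no memory-to-memory move,
   each copy needs a load into a register followed by a store. */
int a, b, c, d, e, f, g, h;

void copy_four(void)
{
    a = b;
    c = d;
    e = f;
    g = h;
}
```

Because the four copies do not depend on each other, a scheduler is free to issue all four loads before the first store.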
Sched1, as expected, turns this into four loads followed by four stores, exploiting the pipeline.
Then IRA kicks in. It shuffles the insns back into load/store, load/store pairs, essentially the source code order. It looks like it's doing that to reduce the number of registers used. Fair enough, but this makes the code less efficient. I don't see a way to tell IRA not to do this.
Most probably that happens because of ira.c::update_equiv_regs. This function was inherited from the old register allocator. The major goal of the function is to find equivalent memory/constants/invariants for pseudos which can be used by the reload pass. Pseudo equivalence also affects live range splitting decisions in IRA.
Update_equiv_regs can also move insns initiating pseudo equivalences close to the pseudo usage. You could try to prevent this and see what happens. IMO preventing such insn moving will do more harm on performance on SPEC benchmarks for x86/x86-64 processors.
As it happens, there's a secondary reload involved: the loads are into one set of registers but the stores from another, so a register to register move is added in by reload. Does that explain the behavior? I tried changing the cover_classes, but that doesn't make a difference.
It is hard to say without the dump file. If everything is correctly defined, it should not happen.
I extended the test code a little, and fed it to a mips64el-elf targeted gcc. It showed the same pattern in one of the two functions but not the other. The test code is test8.c (attached).
What I see in the assembly output (test8.s, also attached) is that foo() has a load then store then load then store pattern, which contradicts what sched1 constructed and doesn't take advantage of the pipeline. However, bar() does use the pipeline. I don't know what's different between these two.
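To make the cost of the load/store, load/store ordering concrete, here is a toy model of a single-issue, in-order pipeline. Everything in it is an assumption made for illustration: the one-insn-per-cycle issue rule, the idea that a store only waits for its own load, and the names issue_cycles, interleaved and grouped. It does not describe the MIPS DFA or any real target.

```c
/* Toy model (an illustration only, not a description of any real target):
   one insn issues per cycle, a load's result becomes available
   load_latency cycles after it issues, and a store must wait for the
   result of the load in the same pair. */
enum { PAIRS = 4 };

struct insn { int is_store; int pair; };

/* Schedule as in foo(): load, store, load, store, ... */
const struct insn interleaved[2 * PAIRS] = {
    {0, 0}, {1, 0}, {0, 1}, {1, 1}, {0, 2}, {1, 2}, {0, 3}, {1, 3}
};

/* Schedule as built by sched1: four loads, then four stores. */
const struct insn grouped[2 * PAIRS] = {
    {0, 0}, {0, 1}, {0, 2}, {0, 3}, {1, 0}, {1, 1}, {1, 2}, {1, 3}
};

/* Returns the number of cycles needed to issue all n insns. */
int issue_cycles(const struct insn *sched, int n, int load_latency)
{
    int ready[PAIRS] = {0};  /* cycle at which each loaded value is ready */
    int cycle = 0;           /* next free issue slot */

    for (int i = 0; i < n; i++) {
        int t = cycle;
        if (sched[i].is_store) {
            /* Stall until the value being stored has arrived. */
            if (ready[sched[i].pair] > t)
                t = ready[sched[i].pair];
        } else {
            ready[sched[i].pair] = t + load_latency;
        }
        cycle = t + 1;
    }
    return cycle;
}
```

With an assumed load latency of 3 cycles, issue_cycles(interleaved, 8, 3) gives 16 and issue_cycles(grouped, 8, 3) gives 8 in this model: the grouped order hides the load latency behind the other loads, while the interleaved order stalls on every store.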
Do you want some dump file (which ones)? Or you could just reproduce this with the current gcc, it's a standard target build. The compile was -O2 -mtune=mips64r2 -mabi=n32.
  As I guessed, the problem is in the update_equiv_regs transformation,
which tries to move an initialization insn close to its single use to
decrease the register pressure.  A lot of people have already
complained about this function undoing scheduling.

  The following patch solves the problem when you use
-fsched-pressure.  I would not like to do that for regular (not
register pressure-sensitive) insn scheduling for obvious reasons.

I think most RISC targets (including MIPS ones) should enable
-fsched-pressure by default.


2010-12-02  Vladimir Makarov <[email protected]>

    * ira.c (update_equiv_regs): Prohibit move insns if
    pressure-sensitive scheduling was done.

Jeff, sorry for bothering you.  Is it ok to commit the patch to the
trunk?

It seems fairly reasonable to me, at least in the short term.

ISTM that longer term we'd want to do these transformations when we're unable to allocate the affected pseudos to hard regs, i.e., leave the schedule alone unless it results in an inability to get a reasonable allocation.


jeff
