On 12/01/2010 02:14 PM, Paul Koning wrote:
On Nov 29, 2010, at 9:51 PM, Vladimir Makarov wrote:

On 11/29/2010 08:52 PM, Paul Koning wrote:
I'm doing some experiments to get to know GCC better, and something is puzzling 
me.

I have defined an md file with DFA and costs describing the fact that loads 
take a while (as do stores). Also, there is no memory to memory move, only 
memory to/from register.

Test program is basically a=b; c=d; e=f; g=h;
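
For reference, a minimal sketch of what such a test could look like. The actual test source is not shown in this message, so the function name copy4 and the use of int globals are assumptions made only for illustration.

```c
/* Hypothetical reconstruction of the test case described above.
   Globals are used so that each assignment needs a real load and a
   real store instead of being optimized away.  */
int a, b, c, d, e, f, g, h;

void copy4(void)
{
    a = b;   /* load b, store a */
    c = d;   /* load d, store c */
    e = f;   /* load f, store e */
    g = h;   /* load h, store g */
}
```

On a target without memory-to-memory moves, each statement becomes one load and one store, which is what gives the scheduler four independent load/store pairs to interleave.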

Sched1, as expected, turns this into four loads followed by four stores, 
exploiting the pipeline.

Then IRA kicks in.  It shuffles the insns back into load/store, load/store 
pairs, essentially the source code order.  It looks like it's doing that to 
reduce the number of registers used.  Fair enough, but this makes the code less 
efficient.  I don't see a way to tell IRA not to do this.

Most probably that happens because of ira.c::update_equiv_regs.  This function 
was inherited from the old register allocator.  The major goal of the function 
is to find equivalent memory/constants/invariants for pseudos which can be used 
by the reload pass.  Pseudo equivalences also affect live range splitting 
decisions in IRA.

update_equiv_regs can also move insns that initialize pseudo equivalences close 
to the pseudo's use.  You could try preventing this and see what happens.  IMO 
preventing such insn movement will do more harm than good to performance on 
SPEC benchmarks for x86/x86-64 processors.
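
To illustrate the effect described above, here is a toy model, not GCC code. The struct insn type and the function sink_loads_to_uses are hypothetical, and the model assumes every load defines a register that is used by exactly one later store. It only shows how moving each defining insn next to its single use turns a scheduled "four loads, four stores" sequence back into load/store pairs.

```c
#include <stddef.h>

/* Toy insn: kind is 'L' (load defining reg) or 'S' (store using reg).  */
struct insn { char kind; int reg; };

/* Copy IN to OUT, placing each load immediately before the store that
   uses its register.  Assumes each load has exactly one later use;
   loads with no use are dropped in this simplified model.
   Returns the number of insns written to OUT.  */
size_t sink_loads_to_uses(const struct insn *in, size_t n, struct insn *out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i].kind != 'S')
            continue;   /* loads are emitted next to their use below */

        /* Find the earlier load that defines the stored register.  */
        for (size_t j = 0; j < i; j++) {
            if (in[j].kind == 'L' && in[j].reg == in[i].reg) {
                out[k++] = in[j];
                break;
            }
        }
        out[k++] = in[i];
    }
    return k;
}
```

Feeding it the sequence L0 L1 L2 L3 S0 S1 S2 S3 yields L0 S0 L1 S1 L2 S2 L3 S3: fewer registers live at once, but the loads are no longer overlapped.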
As it happens, there's a secondary reload involved: the loads are into one set 
of registers but the stores from another, so a register to register move is 
added in by reload.  Does that explain the behavior?  I tried changing the 
cover_classes, but that doesn't make a difference.

It is hard to say without the dump file.  If everything is correctly defined, 
it should not happen.

I extended the test code a little, and fed it to a mips64el-elf targeted gcc.  
It showed the same pattern in one of the two functions but not the other.  The 
test code is test8.c (attached).

What I see in the assembly output (test8.s, also attached) is that foo() has a 
load then store then load then store pattern, which contradicts what sched1 
constructed and doesn't take advantage of the pipeline.  However, bar() does 
use the pipeline.  I don't know what's different between these two.

Do you want some dump files (which ones)?  Or you could just reproduce this with 
the current gcc; it's a standard target build.  The compile options were -O2 
-mtune=mips64r2 -mabi=n32.

  As I guessed, the problem is in the update_equiv_regs transformation,
which tries to move an initialization insn close to its single use to
decrease the register pressure.  A lot of people have already complained
about this function undoing scheduling.

  The following patch solves the problem when you use
-fsched-pressure.  I would not like to do that for regular (not
register pressure-sensitive) insn scheduling for obvious reasons.

I think most RISC targets (including MIPS ones) should enable
-fsched-pressure by default.


2010-12-02  Vladimir Makarov <vmaka...@redhat.com>

    * ira.c (update_equiv_regs): Prohibit move insns if
    pressure-sensitive scheduling was done.

Jeff, sorry for bothering you.  Is it ok to commit the patch to the
trunk?

Index: ira.c
===================================================================
--- ira.c       (revision 167373)
+++ ira.c       (working copy)
@@ -2585,7 +2585,13 @@ update_equiv_regs (void)
                  rtx equiv_insn;
 
                  if (! reg_equiv[regno].replace
-                     || reg_equiv[regno].loop_depth < loop_depth)
+                     || reg_equiv[regno].loop_depth < loop_depth
+                     /* There is no sense in moving insns if
+                        register pressure-sensitive scheduling was
+                        done, because it will not improve allocation
+                        but will worsen the insn schedule with a big
+                        probability.  */
+                     || (flag_sched_pressure && flag_schedule_insns))
                    continue;
 
                  /* reg_equiv[REGNO].replace gets set only when
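
The guard added by the patch can be read as a standalone predicate. The following is only a sketch: struct equiv_info and skip_equiv_move are hypothetical names, and the two bool parameters stand in for GCC's flag_sched_pressure and flag_schedule_insns.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of reg_equiv[regno] that the
   patched condition reads.  */
struct equiv_info {
    bool replace;
    int loop_depth;
};

/* Return true when the equivalence-initializing insn should be left
   where it is (the "continue" branch in the patched loop).  */
bool skip_equiv_move(const struct equiv_info *e, int loop_depth,
                     bool sched_pressure, bool schedule_insns)
{
    return !e->replace
           || e->loop_depth < loop_depth
           /* New in the patch: keep the schedule produced by
              pressure-sensitive scheduling.  */
           || (sched_pressure && schedule_insns);
}
```

Note that both flags must be set for the new clause to apply: with -fsched-pressure but without the first scheduling pass, the insn movement is still allowed.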
