On 11/29/2010 08:52 PM, Paul Koning wrote:
> I'm doing some experiments to get to know GCC better, and something is puzzling
> me.
> I have defined an md file with DFA and costs describing the fact that loads
> take a while (as do stores). Also, there is no memory to memory move, only
> memory to/from register.
> Test program is basically a=b; c=d; e=f; g=h;
> Sched1, as expected, turns this into four loads followed by four stores,
> exploiting the pipeline.
> Then IRA kicks in. It shuffles the insns back into load/store, load/store
> pairs, essentially the source code order. It looks like it's doing that to
> reduce the number of registers used. Fair enough, but this makes the code less
> efficient. I don't see a way to tell IRA not to do this.
Most probably that happens because of ira.c::update_equiv_regs. This
function was inherited from the old register allocator. The major goal
of the function is to find equivalent memory/constants/invariants for
pseudos which can be used by the reload pass. Pseudo equivalence also
affects live range splitting decisions in IRA.
update_equiv_regs can also move insns initializing pseudo equivalences
close to the pseudo's use. You could try to prevent this and see
what happens. IMO, preventing such insn movement will do more harm than
good for performance on SPEC benchmarks for x86/x86-64 processors.
> As it happens, there's a secondary reload involved: the loads are into one set
> of registers but the stores from another, so a register to register move is
> added in by reload. Does that explain the behavior? I tried changing the
> cover_classes, but that doesn't make a difference.
It is hard to say without the dump file. If everything is correctly
defined, it should not happen.
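
For reference, here is a minimal sketch of a test case that could be used to
produce such dumps. It assumes the four copies are between global ints, as the
"a=b; c=d; e=f; g=h;" description suggests. The function name copy_four and
the file name test.c are made up for illustration, and the insn orders shown
in the comments are only the ones described in this thread, not verified
output.

```c
/* Hypothetical test case modeled on "a=b; c=d; e=f; g=h;".

   One way to get the dumps (substitute your cross compiler driver
   for "gcc"; -O2 is assumed so that sched1 runs on the target):

       gcc -O2 -S -fdump-rtl-sched1 -fdump-rtl-ira test.c

   Order reported after sched1:     Order reported after IRA:
       load  r1 <- b                    load  r1 <- b
       load  r2 <- d                    store a  <- r1
       load  r3 <- f                    load  r1 <- d
       load  r4 <- h                    store c  <- r1
       store a  <- r1                   ...
       store c  <- r2
       store e  <- r3
       store g  <- r4
*/
int a, b, c, d, e, f, g, h;

void copy_four(void)
{
    /* Four independent memory-to-memory copies; with no mem-to-mem
       move pattern, each one needs a load and a store through a
       register.  */
    a = b;
    c = d;
    e = f;
    g = h;
}
```

Comparing the sched1 dump with the ira dump for this function should show
whether the load insns were moved next to their uses before allocation.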