I'm doing some experiments to get to know GCC better, and something is puzzling
me.
I have written an .md file with a DFA pipeline description and costs that
model loads as taking a while (as do stores). Also, there is no
memory-to-memory move, only memory to/from register.
The test program is basically a=b; c=d; e=f; g=h;
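
For concreteness, a minimal sketch of that test case. The variable names come
from the statement above; the int type, the use of globals, and the function
name copy4 are assumptions made only for illustration.

```c
/* Hypothetical reduced test case: four independent memory-to-memory copies.
   Globals are used so that each statement needs one load and one store
   (assumption: the real test may use a different type or storage class). */
int a, b, c, d, e, f, g, h;

void copy4(void)
{
    a = b;   /* load b, store a */
    c = d;   /* load d, store c */
    e = f;   /* load f, store e */
    g = h;   /* load h, store g */
}
```

Compiling something like this with -O2 -S plus -fdump-rtl-sched1 and
-fdump-rtl-ira should make it possible to compare the insn order before and
after register allocation.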
Sched1, as expected, turns this into four loads followed by four stores,
exploiting the pipeline.
Then IRA kicks in. It shuffles the insns back into load/store, load/store
pairs, essentially the source code order. It looks like it's doing that to
reduce the number of registers used. Fair enough, but this makes the code less
efficient. I don't see a way to tell IRA not to do this.
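
To make the trade-off concrete, here are the two insn orders written out at
source level with explicit temporaries. This is only notation for the
schedules described above, not something the compiler will preserve (it is
free to reorder these statements), and the function names are hypothetical.

```c
int a, b, c, d, e, f, g, h;

/* Order as left by sched1 in the description above: all four loads first,
   then all four stores. Four temporaries are live at the same time, so
   four registers are needed, but the load latencies can overlap. */
void copy4_grouped(void)
{
    int t0 = b;
    int t1 = d;
    int t2 = f;
    int t3 = h;
    a = t0;
    c = t1;
    e = t2;
    g = t3;
}

/* Order seen after IRA: load/store pairs in source order. Only one
   temporary is live at a time, so one register suffices, but each store
   has to wait for the load immediately before it. */
void copy4_interleaved(void)
{
    int t;
    t = b; a = t;
    t = d; c = t;
    t = f; e = t;
    t = h; g = t;
}
```

Both functions compute the same result; the only difference is how many
values are live at once versus how much load latency can be hidden.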
As it happens, there's a secondary reload involved: the loads go into one set
of registers but the stores come from another, so reload adds a
register-to-register move. Does that explain the behavior? I tried changing
the cover_classes, but that doesn't make a difference.
paul