Shobaki, Ghassan wrote:
Hi,
I am considering working on fixing the pre-pass scheduling problem on
x86 (Bug 38403). The pre-pass instruction scheduler currently increases
register pressure to a degree that causes the register allocator to
fail. Before I commit to this task, I would like to gather as much
information as possible about the problem. What does the GCC community
know about the complexity of this problem? Has anyone ever worked on
fixing it? If yes, did they come up with any interesting findings that I
should be aware of?
In theory, the problem can be fixed by adding a heuristic that keeps
track of register pressure during scheduling and favors the scheduling
decisions that lead to less register pressure. I have done some initial
investigation and found that the scheduler already has a heuristic that
attempts to do that, but apparently this heuristic does not work as
expected. The code I am referring to is in the function
rank_for_schedule in haifa-sched.c. I have modified the code in that
function to make this register-weight heuristic the top-priority
heuristic but that did not make a big difference (the number of passing
CPU2006 benchmarks only went up from 5 to 6 out of 29 benchmarks).
That is a wrong heuristic because it incorrectly calculates the insn
impact on register pressure. For example, we have
1. use a
2. use a, dead a
The heuristic believes that the 2nd insn decreases register pressure and
prefers #2 to #1 and that is wrong because in this case a is not
becoming dead.
Although this heuristic was reported to be working well for powerpc and
therefore it was added to GCC.
So, my current hypothesis is that the problem can be fixed by developing
a more effective register pressure heuristic. To keep the problem
simpler, I plan on initially limiting my work to basic-block scheduling
(i.e., setting the maximum number of basic blocks in a scheduling region
to 1). We have reasons to believe that even with this limitation,
pre-pass scheduling can give 1-2% performance improvement on x86.
Any known information about this problem?
I've been working on register-pressure sensitive insn scheduling last
two months and I hope to submit this work for gcc4.5. I am implementing
also a mode in insn-scheduler to do only live range shrinkage.
Implementing register pressure insn scheduling is not easy. I already
mentioned one issue above about dead notes. Another issue is necessity
to simultaniously work with ready and queue (not only with ready) to
choose insn to schedule. A lot of communication with IRA should be
added too.
The bug itself is also a complicated issue which needs a good knowledge
of insn scheduler, RA, and even reload to fix this. The most bug cases
are like
1. insn containing pseudos
2. ax<- something
The 2nd insn is moved before the 1st one and the first insn needs ax
(e.g. it is a division).
The reload problem is more complicated. For example, the 1st insn has
only one alternative requiring ax
1st operand: =a, r, ...
2nd operand: 0, 0, ...
if both operands (pseudos) got memory, than cost of reloading for the
1st and 2nd alternative is the same. Reload can choose the 1st
alternative and we have the same problem as above.