RE: Fixing the pre-pass scheduler on x86 (Bug 38403)

Shobaki, Ghassan Thu, 09 Apr 2009 14:28:30 -0700

Thanks for everyone who has responded to this email thread! 

That was quite informative for me although I am not yet familiar with
all the phases in GCC. I have worked on instruction scheduling and
register allocation in another compiler and am currently trying to
understand how things are done in GCC. Can you guys kindly give me some
references or brief descriptions of the following:

1. Selective scheduling. I am not familiar with the term. Is this a
trace scheduler of some sort?
2. Reload. Is this another name for the register allocation phase or
something else? 
3. Register allocation in GCC. Is it based on a graph-coloring algorithm
or a linear-scan algorithm or something else?

It looks that the answer to my question about fixing pre-pass scheduling
by adding register pressure awareness is that Vladimir is currently
working on it. Vladimir, can you please give us some brief description
of the approach that you are taking to solve this problem? If you can
refer me to some document that you have written up on this, that would
be great. Balancing the tradeoff between hiding latencies and minimizing
register pressure (shortening live ranges) is a very complex problem.
Ideally, a precise solution requires some kind of backtracking to undo
bad scheduling decisions. However, I think it may be possible to develop
a one-pass greedy algorithm that gives a good approximation in most
cases. My thinking is that an such an approximate greedy algorithm
(i.e., without backtracking) will probably do something like this:

The scheduler keeps track of the register pressure and then for each
instruction in the ready list, it computes what would the register
pressure be if this instruction was scheduled next. Based on that, it
selects the instruction that leads to the minimum register pressure.
Computing register pressure after the scheduling of an instruction
requires keeping track of the number of uses of each live range. If an
instruction has the last use of a live range, that should give it
priority over other instructions. This is very much what the existing
heuristic tries to do, but the existing heuristic is static. It does not
dynamically keep track of register pressure and the defs and uses of
each live range. 

How does this relate to the solution that you are currently working on,
Vladimir?
Do you know when you will have something working? Do you have any
initial performance results?

Thanks
-Ghassan

-----Original Message-----
From: Andrey Belevantsev [mailto:a...@ispras.ru] 
Sent: Wednesday, April 08, 2009 9:25 AM
To: Vladimir Makarov
Cc: Steven Bosscher; Shobaki, Ghassan; gcc@gcc.gnu.org
Subject: Re: Fixing the pre-pass scheduler on x86 (Bug 38403)

Vladimir Makarov wrote:
> Steven Bosscher wrote:
>> On Wed, Apr 8, 2009 at 5:19 AM, Vladimir Makarov
<vmaka...@redhat.com> 
>> wrote:
>>> I've been working on register-pressure sensitive insn scheduling
last 
>>> two
>>> months and I hope to submit this work for gcc4.5.  I am implementing

>>> also a
>>> mode in insn-scheduler to do only live range shrinkage.
>>
>> Is all of this still necessary if the selective scheduler (with
>> register renaming) is made to work on i686/x86_64 after reload?
>>
> 
>  That is a really interesting question, Steven.  I thought about
> this for a few months (since last fall).  Here is my points to result
> me in starting my work on haifa-scheduler:
> 
>  1. Selective scheduler works only for Itanium now.  As I know there
>     are some plans to make it working on PPC, but there are no plans
>     to make it working for other acrhictectures for now.
There are some patches for fixing sel-sched on PPC that I need to ping, 
thanks for reminding me :)  They were too late for regression-only 
mode, and we didn't get access to modern PPCs as we hoped, so this was 
not an issue earlier.  Btw, when we've submitted the scheduler, it did 
work on x86-64 (compile farm), I don't know whether this is still the
case.

> 
>  2. My understanding is that (register-pressure sensitive insn
>     scheduling + RA + 2nd insn scheduling) is not equal to (RA +
>     selective scheduling with register renaming) with the point of
>     the potential performance results.  In first case, the
>     pressure-sensitive insn scheduler could improve RA by live-range
>     shrinkage.  In the 2nd case it is already impossible.  It could
>     be improved by the 2nd RA but RA is more time consuming than
>     scheduling now.  In general this chicken-egg problem could be
>     solved by iterating the 2 contradictory passes insn scheduling +
>     RA (or RA + insns scheduling).  It is a matter of practicality
>     when to stop these iterations.
> 
>  3. My current understanding is that selective scheduler is overkill
>     for architectures with few registers.  In other words, I don't
>     think it will give a better performance for such architectures.
>     On the other hand, it is much slower than haifa-scheduler because
>     of its complexity.  Haifa-scheduler before RA already adds 3-5%
compile
>     time, selective scheduler will add 5-7% more compile time without
>     performance improvement for mainstream architectures x86/x86_64.
>     I think it is intolerable.
We still plan to do some speedups to the sel-sched code within 4.5 
timeframe, mainly to the dependence handling code.  But even after that,

I agree that for out-of-order architectures selective scheduler will 
probably be an overkill.  Besides, register renaming itself would remain

an expensive operation, because it needs to scan all paths along which 
an insn was moved to prove that the new register can be used, though we 
have invested a lot of time for speeding up this process via various 
caching mechanisms.  On ia64, register renaming is useful for pipelining

loops.

Also, we have tried to limit register renaming for the 1st pass 
selective scheduler via tracking register pressure and having a cutoff 
for that, but it didn't work out very well on ia64, so I agree that much

more of RA knowledge should be brought in for this task.

Hope this helps.  Vlad, Steven, thanks for caring.

Andrey

RE: Fixing the pre-pass scheduler on x86 (Bug 38403)

Reply via email to