Hi Vladimir,
Thanks for your kind and very detailed response!
在 2017年12月15日 12:40, Vladimir Makarov 写道:
On 12/14/2017 10:18 PM, Leslie Zhai wrote:
Hi GCC and LLVM developers,
I am learning Register Allocation algorithms and I am clear that:
* Unlimited VirtReg (pseudo) -> limited or fixed or alias[1] PhysReg
(hard)
* Memory (20 - 100 cycles) is expensive than Register (1 cycle), but
it has to spill code when PhysReg is unavailable
It might be much less if memory value is in L1 cache.
* Folding spill code into instructions, handling register
coallescing, splitting live ranges, doing rematerialization, doing
shrink wrapping are harder than RegAlloc
RegAlloc is in a wide sense includes all this tasks and more. For
some architectures, other tasks like a right live range splitting
might be even more important for generated code quality than just
better graph coloring.
* LRA and IRA is default Passes in RA for GCC:
$ /opt/gcc-git/bin/gcc hello.c
DEBUG: ../../gcc/lra.c, lra_init_once, line 2441
DEBUG: ../../gcc/ira-build.c, ira_build, line 3409
* Greedy is default Pass for LLVM
But I have some questions, please give me some hint, thanks a lot!
* IRA is regional register allocator performing graph coloring on a
top-down traversal of nested regions, is it Global? compares with
Local LRA
IRA is a global RA. The description of its initial version can be found
https://vmakarov.fedorapeople.org/vmakarov-submission-cgo2008.pdf
I am reading this paper at present :)
LRA in some way is also global RA but it is a very simplified version
of global RA (e.g. LRA does not use conflict graph and its coloring
algoritm is closer to priority coloring). LRA does a lot of other
very complicated things besides RA, for example instruction selection
which is quite specific to GCC machine description. Usually code
selection task is a separate pass in other compilers. Generally
speaking LRA is more complicated, machine dependent and more buggy
than IRA. But fortunately LRA is less complicated than its
predecessor so called reload pass.
IRA and LRA names have a long history and they do not reflect
correctly the current situation.
It would be possible to incorporate LRA tasks into IRA, but the final
RA would be much slower, even more complicated and hard to maintain
and the generated code would be not much better. So to improve RA
maintainability, RA is divided on two parts solving a bit different
tasks. This is a typical engineering approach.
I am debugging by printf to be familiar with LRA and IRA.
* The papers by Briggs and Chaiten contradict[2] themselves when
examine the text of the paper vs. the pseudocode provided?
I don't examine Preston Briggs work so thoroughly. So I can not say
that is true. Even so it is natural that there are discrepancy in
pseudocode and its description especially for such size description.
For me Preston Briggs is famous for his introduction of optimistic
coloring.
* Why interference graph is expensive to build[3]?
That is because it might be N^2 algorithm. There are a lot of
publications investigating building conflict graphs and its cost in RAs.
And I am practicing[4] to use HEA, developed by Dr. Rhydian Lewis,
for LLVM firstly.
When I just started to work on RAs very long ago I used about the same
approach: a lot of tiny transformations directed by a cost function
and using metaheuristics (I also used tabu search as HEA). Nothing
good came out of this.
Thanks for your lesson! But are there some benchmarks when you used Tabu
search as HEA, AntCol, etc. such as
https://pbs.twimg.com/media/DRD-kxcUMAAxZec.jpg
If you are interesting in RA algorithms and architectures, I'd
recommend Michael Matz article
ftp://gcc.gnu.org/pub/gcc/summit/2003/Graph%20Coloring%20Register%20Allocation.pdf
as a start point.
Thanks! I am reading it.
[1] https://reviews.llvm.org/D39712
[2] http://lists.llvm.org/pipermail/llvm-dev/2008-March/012940.html
[3] https://github.com/joaotavio/llvm-register-allocator
[4] https://github.com/xiangzhai/llvm/tree/avr/include/llvm/CodeGen/GCol
--
Regards,
Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/