Jan Hubicka wrote:
IP RA as currently implemented in IRA propagates info only down in
topological order. But a good IP RA (e.g. the minimum-cost inter-procedural
register allocator, http://citeseer.ist.psu.edu/kurlander96minimum.html)
needs to propagate info both up and down.
But I am quite skeptical about IP RA. To be successful, it needs the
called function to be small and to use few registers. Such functions
should simply be inlined, which solves the IP RA problem among others.
Intel/Sun/Pathscale compilers inline functions very aggressively
during LTO. Therefore the code generated by these compilers with LTO is
much bigger (e.g. LTO in Pathscale results in 30% bigger code for x86_64
SPECINT2000; by the way, it improves the code by 4% and makes the
compiler 50% slower).
It seems to me that you can afford such a tradeoff if you are targeting
primarily CPU-hungry software. We need to handle the other cases (i.e.
system utilities or OpenOffice, say) where code size is a major
bottleneck, so it seems to me that we don't want to make such an extreme
tradeoff (though our current implementation of the inliner heuristics
will do precisely that too).
I agree with you. We should not forget the embedded market, where
code size is more important, but we should provide an option for the
market at which the Pathscale/Intel/Sun compilers are aimed.
From this point of view, we have a lot of resources to spare because GCC
generates the smallest code. E.g. on average the Intel compiler generates
80% bigger code for SPECINT2000 with -O2 and 7 times bigger with -fast.
It generates even bigger code with -Os (code size optimizations) than
GCC does with -O3. The same holds for compilation speed. GCC with -O3 is
much faster (30%-40%) than all the mentioned compilers in peak performance
mode (-fast, -Ofast). So I think we have some resources which we could
spend to improve code performance for this market.