On Fri, 2006-06-23 at 15:07 -0700, Ian Lance Taylor wrote:

> You omitted the RTL loop optimizer passes, which still do quite a bit
> of work despite the tree-ssa loop passes.  Also if-conversion and some
> minor passes, though they are less relevant.

Which brings up a good discussion. I presume the RTL loop optimizers see
things exposed by addressing modes which aren't seen in the higher-level
code. I wonder what the "big gains" are here, and whether they are
detectable at expansion time...

In general, I didn't mention anything that tends not to increase register
pressure, at least not in any significant manner as far as RABLET is
concerned.

> 
> > If expand is made much smarter, I would argue that much of GCSE and CSE
> > isn't needed.  We've already performed those optimizations at  a high
> > level, and we can hopefully do a lot of the factoring and things on
> > addressing registers exposed during expand.  I'm sure there are other
> > things to do, but I would argue that they are significantly less than a
> > "general purpose" CSE and GCSE pass. And in the cases of high register
> > pressure, how much would you want them to do anyway?  Its really these
> > high register pressure areas that RABLET is attacking anyway.
> 
> Here I think you are waving your hands a little too hard.  RTL level
> CSE is significant for handling common expressions exposed by address
> calculations and by DImode (and larger) computations.  On some
> processors giving up CSE on address calculations would be very
> painful.  There needs to be a plan to handle that.
> 

Yes, there is some hand waving, mostly because I haven't gotten that far
in the details yet. I expect to be able to do some of this type of
commoning at RTL generation time as things are generated (much like
RABLE's spiller reuses spill loads nearby). That may turn out to be more
difficult than I anticipate, however. Pain is in the implementation :-)
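
To make "commoning as things are generated" a bit more concrete, here is a
rough sketch of the idea, not anything that exists in GCC. The Expander
class, the tuple-based instruction form, and the register names are all
invented for illustration, and the sketch assumes straight-line code in
which operands are never reassigned (so nothing ever has to be invalidated).

```python
# Hypothetical sketch: the names and the three-address "RTL" form below are
# invented for illustration and are not GCC's data structures.

class Expander:
    def __init__(self):
        self.insns = []        # emitted (dest, op, operands) tuples
        self.available = {}    # (op, operands) -> pseudo already holding it
        self.next_reg = 0

    def _new_reg(self):
        reg = f"r{self.next_reg}"
        self.next_reg += 1
        return reg

    def emit(self, op, *operands):
        """Emit op on operands, reusing an earlier identical computation."""
        key = (op, operands)
        if key in self.available:
            return self.available[key]
        dest = self._new_reg()
        self.insns.append((dest, op, operands))
        self.available[key] = dest
        return dest

    def expand_array_ref(self, base, index, elem_size):
        """Expand the address computation base + index * elem_size."""
        scaled = self.emit("mult", index, elem_size)
        return self.emit("plus", base, scaled)


exp = Expander()
addr_a = exp.expand_array_ref("a", "i", 4)    # a[i]
addr_b = exp.expand_array_ref("b", "i", 4)    # b[i] shares the i * 4 part
addr_a2 = exp.expand_array_ref("a", "i", 4)   # second a[i] emits nothing new

for insn in exp.insns:
    print(insn)
```

Three address expansions produce only three instructions here, because the
scaled index and the repeated a[i] address are looked up rather than
regenerated. The hard part that the sketch skips is knowing when an earlier
value is no longer valid or no longer worth keeping live.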

I am not proposing that CSE necessarily be eliminated *all* the time,
but in cases where register pressure is already excessively high, is
further commoning of DImode values going to make things better? It's
really this case I'm interested in evaluating, since this is the case
where we already have problems. If we don't spill, RABLET would
effectively do nothing.
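
As a toy illustration of why commoning can hurt when pressure is already
high, the sketch below counts the peak number of simultaneously live
registers in a straight-line sequence, once with a value kept live for a
later reuse and once with it recomputed. The max_pressure function and the
instruction encoding are made up for this example, and the model ignores
register classes, instruction cost, and everything else a real allocator
has to consider.

```python
# Hypothetical sketch: a backward liveness scan over straight-line code.
# Each instruction is (dest, sources); this is not how GCC represents insns.

def max_pressure(insns, live_out=()):
    """Return the peak number of simultaneously live registers."""
    live = set(live_out)
    peak = len(live)
    for dest, sources in reversed(insns):
        live.discard(dest)      # dest is not live above its definition
        live.update(sources)    # sources are live just before their use
        peak = max(peak, len(live))
    return peak


live_out = ("a", "b", "z")

# t = a + b is computed once and kept live across the middle of the block.
commoned = [
    ("t", ("a", "b")),
    ("p", ("t",)),
    ("q", ("p",)),
    ("r", ("p", "q")),
    ("z", ("r", "t")),
]

# a + b is computed again (t2) right before its second use instead.
recomputed = [
    ("t1", ("a", "b")),
    ("p", ("t1",)),
    ("q", ("p",)),
    ("r", ("p", "q")),
    ("t2", ("a", "b")),
    ("z", ("r", "t2")),
]

print(max_pressure(commoned, live_out), max_pressure(recomputed, live_out))
```

In this toy case the commoned form needs one more register at its peak than
the recomputed form, because a and b are live throughout anyway. On a
target with only four registers available, that extra live range would be
the difference between spilling and not spilling.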

Clearly there will be a lot of further investigation required once
implementation reaches this point. Ultimately CSE and all RTL
optimizations can be re-evaluated to see if things can be simplified.

> Also at present many vector calculations are not exposed at the tree
> level--they are hidden inside builtin functions until they are
> expanded--and vector heavy code can also have a lot of common
> subexpressions.
> 

I have no plan at the moment for vector operations :-). That could change,
but for now we'll have to keep whatever we do today for those.

> 
> > If I recall, scheduling is register pressure aware and normally doesn't
> > increase register pressure dramatically. If it does increase pressure,
> > well, this won't solve every problem after all.
> 
> Unfortunately, scheduling is currently not register pressure aware at
> all.  The scheduler will gleefully increase register pressure.  That's
> why we don't even run the scheduler before register allocation on x86.
> 

Hmm, too bad. For some reason I was under the impression that it at
least tried not to increase register pressure when it was above a
certain threshold value. Not running it at least means it won't increase
register pressure, so that works :-)
  
> 
> Modulo the above comments, I don't see anything wrong with your basic
> idea.  But I also wonder whether you couldn't get a similar effect by
> forcing instruction selection to occur before register allocation.  If
> that is done well, reload will have much less work to do.
> 

That was one of the premises of RABLE. Since out-of-SSA needs some TLC
and TER has been a wart for years, this seems like a good way of dealing
with those issues, and perhaps dealing with some significant RA issues
at the same time. (Anything to avoid actually rewriting RA eh!)


> One of the basic issues with the current code is not that we do
> register allocation well or poorly, but that reload takes the output
> of the register allocator and munges it unpredictably.  That's going
> to happen with your proposal as well.  It doesn't mean that your
> proposal won't improve things.  But no register allocator can do a
> good job when it can't make the final decisions.
> 
Truer words have never been spoken. RABLET makes no attempt to do
anything about reload. It simply attempts to present the backend with
code that isn't full of excessive register pressure. If it turns out to
be something reload screws up today, it will continue to be screwed up.
I suspect that much of the time when we do have excessive spilling,
RABLET will show a benefit.

It's clearly not as good as a new register allocator would be, but the
benefit-to-effort ratio ought to be a lot higher for RABLET than for a
register allocator rewrite.

Andrew
