https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95018

--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #20)
> (In reply to Jiu Fu Guo from comment #18)
> > Currently, I'm thinking to enhance GCC 'cunroll' as:
> > if the loop has multiple exits or the upper bound is not a fixed number, we may not do
> > 'complete unroll' for the loop, except -funroll-all-loops is specified.
> 
> That doesn't make much sense (-funroll-all-loops is RTL unroller only).
> 
> I think the growth limits are simply too large unless we compute a "win"
> which we in this case do not.  So I'd say the growth limits should scale
> with win ^ (1/new param) thus if we estimate to eliminate 20% of the
> loop stmts due to unrolling then the limit to apply is
> limit * (0.2 ^ (1/X)) with X maybe defaulting to 2.
> 
> I'd only apply this new limit for peeling (peeling is when the loop count
> is not constant and thus we keep the exit tests).
> 
> Of course people want more peeling (hello POWER people!)
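
To put numbers on the scaling proposed in the quoted text, here is a minimal
sketch of limit * (win ^ (1/X)). It is not GCC code: the helper name
scaled_peel_limit is hypothetical, "x" stands in for the proposed new
param, and the base limit of 100 used below is an arbitrary illustration
rather than an actual GCC default.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of GCC: scale a peeling growth limit by the
// estimated "win", i.e. the fraction of loop statements expected to be
// eliminated by unrolling. `x` models the proposed new param (suggested
// default in the comment: 2).
double scaled_peel_limit(double limit, double win, double x)
{
    assert(win >= 0.0 && win <= 1.0);  // win is a fraction of the loop body
    assert(x > 0.0);                   // 1/x must be well defined
    return limit * std::pow(win, 1.0 / x);
}
```

With a base limit of 100, a win of 0.2 and X = 2, this gives
100 * sqrt(0.2), roughly 44.7, so the allowed growth is a bit less than half
the unscaled limit. A win of 0 drives the limit to 0, and a win of 1 leaves
it unchanged.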

Btw, the issue with the rs6000 code at present is that it uses
unroll_only_small_loops but that only affects the RTL unroller
while the enablement of -funroll-loops at -O2 affects GIMPLE
as well but unconstrained (with -O3 params).  For the main
unroll pass (not cunrolli) this triggers code size growth:

  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
                                                   || flag_peel_loops
                                                   || optimize >= 3, true);

the "original" patch also adjusted parameters.  If the intent is to only
affect the RTL unroller then we need a separate flag controlling it
(yeah, using the same flags as heuristic trigger was probably bad).
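
As a rough illustration of the point about a separate flag, the following
is a simplified standalone model, not GCC source. The struct OptFlags, both
function names, and the rtl_unroll_only field are hypothetical; the first
predicate only mirrors the condition passed as the first argument in the
snippet above.

```cpp
// Simplified standalone model of the condition discussed above; not GCC code.
struct OptFlags {
    bool unroll_loops;     // models flag_unroll_loops
    bool peel_loops;       // models flag_peel_loops
    int  optimize;         // models the -O level
    bool rtl_unroll_only;  // hypothetical: target only wants RTL unrolling
};

// Mirrors the existing condition: any of the three triggers enables it.
bool current_gimple_trigger(const OptFlags &f)
{
    return f.unroll_loops || f.peel_loops || f.optimize >= 3;
}

// Hypothetical variant: -funroll-loops no longer triggers the GIMPLE side
// when the target asked for RTL-only unrolling.
bool separated_gimple_trigger(const OptFlags &f)
{
    return (f.unroll_loops && !f.rtl_unroll_only)
           || f.peel_loops
           || f.optimize >= 3;
}
```

For a target that enables -funroll-loops at -O2 (unroll_loops = true,
optimize = 2), the first predicate is true, while the second is false once
rtl_unroll_only is set.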
