Jan Hubicka <[email protected]> writes:

> With -O2 we automatically enable several loop optimizations with
> -fprofile-use.
> The rationale is that those optimizations are enabled only at -O3, mainly
> because they may hurt performance or not pay back in code size when used
> blindly on all loops.
> Profile feedback gives us data on the number of iterations, which is used
> by the heuristics controlling those optimizations.
>
> Currently auto-FDO is not that good at determining the number of
> iterations, so I think we do not want to enable them until we can prove
> that they are useful.
> This primarily affects -O2 codegen.

Can you elaborate on what the problem is?

afaik even the instrumented profile just has edge frequencies too. Is
the problem the sampling?

> Theoretically auto-FDO with LBR can be pretty good at estimating the # of
> iterations, but to make it useful we will need to implement multiplicity for
> discriminators at least.

Only if the iteration count is <= 32 / number of taken BBs in the loop
(32 being the number of LBR entries).
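
As a rough illustration of that bound, here is a minimal sketch in C. It
assumes a 32-entry LBR stack (the figure used above) and that every loop
iteration records the same number of taken branches; the names LBR_ENTRIES
and max_visible_iterations are made up for this example and are not part of
GCC or of any auto-FDO tooling.

```c
/* Assumption: the LBR stack holds 32 taken-branch records. */
#define LBR_ENTRIES 32u

/*
 * Largest trip count for which every iteration of one loop execution
 * still fits in a single LBR snapshot, given how many taken branches
 * each iteration records.  Returns 0 when the input is 0, since the
 * bound is not meaningful without any taken branch.
 */
static unsigned max_visible_iterations(unsigned taken_branches_per_iter)
{
    if (taken_branches_per_iter == 0)
        return 0;
    return LBR_ENTRIES / taken_branches_per_iter;
}
```

So a loop whose body takes 4 branches per iteration can have at most 8
iterations visible in one snapshot; longer loops would need some other way
of estimating the trip count.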

A better way might be to sample registers and use the register
containing the loop count.  It would need infrastructure to map
registers back to variables, however.

>
> I also noticed that optimize_crc was added to the list but nothing in
> the pass seems to really rely on profile...

crc_optimization::satisfies_crc_loop_iteration_count does?


-Andi
