Jan Hubicka <[email protected]> writes:

> With -O2 we automatically enable several loop optimizations with
> -fprofile-use.
> The rationale is that those optimizations at -O3 only mainly since they may
> hurt performance or not pay back in code size when used blindly on all loops.
> Profile feedback gives us data on number of iterations which is used by
> heuristics controlling those optimizations.
>
> Currently auto-FDO is not that good on determining number of iterations so I
> think we do not want to enable them until we can prove that those are useful.
> This is affecting primarily -O2 codegen.

Can you elaborate on what the problem is? AFAIK even the instrumented profile
just has edge frequencies too. Is the problem the sampling?

> Theoretically auto-FdO with lbr can be pretty good on estimating # of
> iterations, but to make it useful we will need to implement multiplicity for
> discriminators at least.

If it's <= 32 / number of taken BBs in the loop.

A better way might be to sample registers and use the register containing the
loop count. It would need infrastructure to map back variables, however.

> I also noticed that optimize_crc was added to the list but nothing in
> the pass seems to really rely on profile...

crc_optimization::satisfies_crc_loop_iteration_count does?

-Andi
