https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103734

--- Comment #1 from hubicka at kam dot mff.cuni.cz ---
I think the ipa-cp heuristics still need some work.  It is nice that we
got it to do something, but I just checked, and with an LTO+PGO build of
clang it produces about 30 clones that are not "for all known contexts",
so it seems that it is still quite strict about what it considers
cloning, at least with FDO.

On the other hand, we have PR103195, where tfft2 grows by 70% because a
function is cloned with no great benefit.

I think one problem is that it is based on absolute time improvements.
This is a problem because bigger functions run longer and are more
likely to see a bigger absolute improvement, but we are more interested
in relative improvement (i.e. duplicating a very large function to get a
0.001% speedup is less useful than duplicating a small function that
gets a 10% speedup, even if the absolute number is the same).
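
To make the distinction concrete, here is a minimal sketch in Python.
The function names and the time estimates are hypothetical and are not
taken from the ipa-cp sources; they only show how a ranking by absolute
saving can disagree with a ranking by relative saving.

```python
def absolute_benefit(time_before, time_after):
    """Estimated time saved by the clone, in arbitrary time units."""
    return time_before - time_after


def relative_benefit(time_before, time_after):
    """Estimated time saved as a fraction of the function's own time."""
    return (time_before - time_after) / time_before


# Hypothetical candidates: (name, time before cloning, time after cloning).
candidates = [
    ("big_fn", 1_000_000.0, 999_990.0),  # very large function
    ("small_fn", 100.0, 90.0),           # small function
]

for name, before, after in candidates:
    print(name,
          "absolute:", absolute_benefit(before, after),
          "relative:", relative_benefit(before, after))
```

Both candidates save the same 10 units in absolute terms, so an
absolute threshold cannot tell them apart, while the relative figure
separates them by four orders of magnitude.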

With profile feedback, absolute numbers are OK since they should
translate to actual speedups of the whole program.  But then programs
trained for longer will see bigger numbers than programs trained for a
shorter time, so perhaps this should be relative to the overall running
time of the program?
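
A rough sketch of that normalization, again in Python.  The helper
name and the numbers are assumptions made up for illustration, not
existing GCC code: the profiled saving is divided by the total profiled
time, so the result does not depend on how long the training run was.

```python
def normalized_benefit(saved_time, total_program_time):
    """Fraction of the whole program's profiled time saved by the clone."""
    if total_program_time <= 0:
        return 0.0  # no usable profile; treat as no measurable benefit
    return saved_time / total_program_time


# Hypothetical profiles of the same program; the second training run is
# ten times longer, so every absolute number is ten times bigger.
short_run = {"saved": 50.0, "total": 1_000.0}
long_run = {"saved": 500.0, "total": 10_000.0}

for label, run in (("short", short_run), ("long", long_run)):
    print(label, normalized_benefit(run["saved"], run["total"]))
```

The absolute savings differ by a factor of ten, but the normalized
values agree, so one threshold would work for both training runs.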
