wristow added a comment.

> This probably needs to be taken over by someone who cares about full LTO 
> performance

We at PlayStation are definitely interested in full LTO performance, so we're 
looking into this.  We certainly agree with the rationale: if suppressing 
some optimizations allows better Sample PGO matching, then we'd expect that 
to apply equally to both ThinLTO and full LTO.

I guess much of this comes down to a balancing act between:

1. The amount of the runtime benefit with Sample PGO if these loop 
optimizations are deferred to the full LTO back-end (like they are for ThinLTO).
2. The cost in compile-time resources in the full LTO back-end to do these loop 
optimizations at that later stage.

From the discussion here, the Sample PGO runtime win (point 1) seems more or 
less to be a given.  If we find the compile-time cost in the full LTO back-end 
(point 2) is not significant, then the decision should be easy.  So after 
seeing this patch, we're doing some experiments to at least try to get a handle 
on this.  (I'm a bit concerned we won't be able to draw any hard conclusions 
from the results of our experiments, but at least we'll be able to make a 
better informed assessment.)

FTR, for PlayStation, we're using the old PM.  But we'll do some experiments 
for both the old and new PM, to get a sense of the answers to the (old PM) 
`LoopUnrollAndJam` point, and the (new PM) FIXME comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69732/new/

https://reviews.llvm.org/D69732



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
