https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943
--- Comment #48 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #47)
> Created attachment 52637 [details]
> new patch
>
> I am working on an alternative cache for GCC 13, but along the way, I have
> changes to the ranger_cache::range_from_dom() routine. The original version
> gave up when it hit a block which had outgoing edges. The new version is
> smarter and basically goes back until it finds a cache entry, and then
> intersects all outgoing edges between the two places. It also removes the
> recursion, and does not SET any cache values during the lookup (making it a
> true query).
>
> The net effect of this is significant improvements in cache performance
> because it's used far less, but there is more time spent doing calculations.
> This bootstraps and passes all regression tests. We do miss out on a few
> minor opportunities (30 out of 4400 in all of EVRP over the GCC source)
> which occur as a result of updated values not being propagated properly as
> the cache is no longer "full" like it was before.
>
> In GCC 13 I will address this, but I thought you might be interested in
> trying this patch against this PR.
>
> In building 380 GCC source files, I see the following avg speedups
> evrp : -22.57%
> VRP2 : -5.4%
> thread_jumps_full : -14.16%
> total : -0.44%
>
> So it is not insignificant.
>
> It is likely to be most effective in large CFGs.
> This is *total* compile time percent speed up for the 5 most significant
> cases:
>
> expr.ii -2.62%
> lra-constraints.ii -3.75%
> caller-save.ii -3.98%
> reload.ii -4.04%
> optabs.ii -5.05%
>
> EVRP isolated speedups (yes, these are *percentage* speedup)
> expr.ii -62.38
> simplify-rtx.ii -65.97
> lra-constraints.ii -67.87
> reload.ii trunk -68.67
> caller-save.ii trunk -71.93
> optabs.ii trunk -78.69
>
> I think those times are probably worth the odd miss.
>
> Anyway, next time you are checking performance for this PR maybe also try
> this patch and see how it performs.

It helps quite a bit; the worst case is now

 tree VRP                 :   5.14 (  7%)   0.02 (  3%)   5.15 (  7%)   2 9M (  3%)
 backwards jump threading :   4.05 (  6%)   0.00 (  0%)   4.06 (  6%)   222 0k (  0%)

Overall, the patch reduces compile time from 766s to 749s (parallel compile,
serial LTO, release checking).  So IMHO definitely worth it if you are happy
with it.
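
For readers following along, the lookup described in comment #47 (walk back
through dominators until a cache entry is found, then intersect the ranges
implied by the edges in between, without recursion and without writing to the
cache) can be sketched roughly as below. This is a toy Python model, not GCC
code: the function signature, the dictionaries (idom, preds, edge_range,
cache) and the integer-interval ranges are all assumptions made up for
illustration, and the real ranger_cache::range_from_dom() handles many cases
this ignores.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]          # closed interval [lo, hi]; lo > hi means empty
VARYING: Range = (-2**31, 2**31 - 1)

def intersect(a: Range, b: Range) -> Range:
    """Intersection of two closed intervals (may come out empty)."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def range_from_dom(bb: int,
                   idom: Dict[int, int],
                   preds: Dict[int, List[int]],
                   edge_range: Dict[Tuple[int, int], Range],
                   cache: Dict[int, Range]) -> Range:
    """Hypothetical read-only query: range of one name on entry to block bb."""
    pending: List[Range] = []
    cur = bb
    # Iterative walk up the dominator tree; no recursion.
    while cur not in cache:
        parent = idom.get(cur)
        if parent is None:
            break                # reached the entry block without a cache hit
        # An edge only constrains the value in cur if it is the sole way in.
        if preds.get(cur) == [parent] and (parent, cur) in edge_range:
            pending.append(edge_range[(parent, cur)])
        cur = parent
    # Start from the cached range (or VARYING) and narrow it edge by edge.
    result = cache.get(cur, VARYING)
    for r in pending:
        result = intersect(result, r)
    return result                # note: cache is never modified

# Toy CFG: 0 -> {1, 4}, 1 -> {2, 3}, {2, 3} -> 5
idom = {1: 0, 2: 1, 3: 1, 4: 0, 5: 1}
preds = {1: [0], 2: [1], 3: [1], 4: [0], 5: [2, 3]}
edge_range = {(0, 1): (11, VARYING[1]),    # taken when x > 10
              (1, 2): (VARYING[0], 99)}    # taken when x < 100
cache = {0: (0, 1000)}                     # only the entry block is cached

print(range_from_dom(2, idom, preds, edge_range, cache))
```

The trade-off matches what the comment describes: fewer cache entries are
needed, at the cost of recomputing the intersections on each query.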