[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

rguenth at gcc dot gnu.org via Gcc-bugs Thu, 17 Mar 2022 04:14:54 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943


--- Comment #48 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #47)
> Created attachment 52637 [details]
> new patch
> 
> I am working on an alternative cache for GCC 13, but along the way, I have
> changes to the ranger_cache::range_from_dom() routine.  The original version
> gave up when it hit a block which had outgoing edges. The new version is
> smarter and basically goes back until it finds a cache entry, and then
> intersects all outgoing edges between the two places. It also removes the
> recursion, and does not SET any cache values during the lookup (making it a
> true query).
> 
> The net effect of this is significant improvements in cache performance
> because it's used far less, but there is more time spent doing calculations.
> This bootstraps and passes all regression tests.  We do miss out on a few
> minor opportunities (30 out of 4400 in all of EVRP over the GCC source)
> which occur as a result of updated values not being propagated properly as
> the cache is no longer "full" like it was before.
> 
> In GCC 13 I will address this, but I thought you might be interested in
> trying this patch against this PR.
> 
> In building 380 GCC source files, I see the following avg speedups
> evrp : -22.57%
> VRP2 : -5.4%
> thread_jumps_full : -14.16%
> total : -0.44%
> 
> So it is not insignificant.
> 
> It is likely to be most effective in large CFGs.
> This is *total* compile time percent speed up for the 5 most significant
> cases:
> 
> expr.ii  -2.62%
> lra-constraints.ii -3.75%
> caller-save.ii -3.98%
> reload.ii -4.04%
> optabs.ii -5.05%
> 
> EVRP isolated speedups (yes, these are *percentage* speedups)
> expr.ii -62.38
> simplify-rtx.ii  -65.97
> lra-constraints.ii  -67.87
> reload.ii trunk  -68.67
> caller-save.ii trunk  -71.93
> optabs.ii trunk  -78.69
> 
> I think those times are probably worth the odd miss.
> 
> Anyway, next time you are checking performance for this PR maybe also try
> this patch and see how it performs.

It helps quite a bit; the worst case is now

 tree VRP                           :   5.14 (  7%)   0.02 (  3%)   5.15 (  7%)    29M (  3%)
 backwards jump threading           :   4.05 (  6%)   0.00 (  0%)   4.06 (  6%)  2220k (  0%)

Overall, the patch reduces compile time from 766s to 749s (parallel compile,
serial LTO, release checking).  So IMHO definitely worth it if you are happy
with it.
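
For reference, the lookup described in comment #47 (walk back through
dominators until a cache entry is found, intersect the edge constraints
met on the way, no recursion, no cache writes) can be sketched roughly as
below.  This is only an illustrative model under simplifying assumptions,
not the code in the attached patch: Range, Block, range_from_dom_sketch
and example_lookup are hypothetical names, ranges are reduced to closed
integer intervals, and each block carries at most one constraint, namely
the one implied by the edge from its immediate dominator.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <map>
#include <optional>
#include <vector>

// Hypothetical stand-in for a value range: a closed interval [lo, hi].
struct Range {
    long lo, hi;
};

// Intersection of two closed intervals.
inline Range intersect(const Range &a, const Range &b) {
    return {std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Hypothetical basic block model (an assumption of this sketch).
struct Block {
    int idom;                             // immediate dominator, -1 for entry
    std::optional<Range> edge_from_idom;  // constraint on edge idom -> block
};

// Iterative, read-only lookup: the cache is taken by const reference, so
// nothing is SET during the query, and the loop replaces the recursion.
Range range_from_dom_sketch(const std::vector<Block> &cfg,
                            const std::map<int, Range> &cache,
                            int bb) {
    Range result{LONG_MIN, LONG_MAX};  // start unconstrained
    for (int cur = bb; cur != -1; cur = cfg[cur].idom) {
        assert(cur >= 0 && cur < static_cast<int>(cfg.size()));
        auto hit = cache.find(cur);
        if (hit != cache.end()) {
            // Found a cache entry: combine it with the edge constraints
            // collected between this block and the query block.
            return intersect(result, hit->second);
        }
        if (cfg[cur].edge_from_idom)
            result = intersect(result, *cfg[cur].edge_from_idom);
    }
    return result;  // no cache entry on the dominator chain
}

// Small usage example: only the entry block has a cache entry.
Range example_lookup() {
    std::vector<Block> cfg = {
        {-1, std::nullopt},        // bb0: entry
        {0, Range{11, LONG_MAX}},  // bb1: reached on an "x > 10" edge
        {1, std::nullopt},         // bb2: no constraint on its edge
        {2, Range{LONG_MIN, 49}},  // bb3: reached on an "x < 50" edge
    };
    std::map<int, Range> cache = {{0, Range{0, 100}}};
    return range_from_dom_sketch(cfg, cache, 3);
}
```

In this toy CFG, a query at bb3 finds no entry until it reaches bb0 and
returns [11, 49], the cached [0, 100] narrowed by the two edge
constraints.  The trade-off matches the one described in the quoted
text: fewer cache entries are created, at the cost of recomputing the
intersections on each query.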
