https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59811
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> --- So SCCVN is limited with --param sccvn-max-alias-queries-per-access which has a quite high default value (1000). There isn't something more elaborate like having an overall budget as well. With lowering the default value to 100 I get to alias stmt walking : 12.15 (16%) usr 0.06 (12%) sys 12.23 (16%) wall 2 kB ( 0%) ggc ... tree PRE : 9.48 (13%) usr 0.01 ( 2%) sys 9.48 (13%) wall 1333 kB ( 0%) ggc tree FRE : 9.91 (13%) usr 0.03 ( 6%) sys 9.92 (13%) wall 310 kB ( 0%) ggc ... combiner : 31.87 (43%) usr 0.24 (46%) sys 32.10 (43%) wall 528441 kB (80%) ggc rest of compilation : 51.05 (69%) usr 0.31 (60%) sys 51.25 (69%) wall 620347 kB (94%) ggc TOTAL : 73.85 0.52 74.34 661274 kB (see disclaimer about things not adding up). As we run FRE twice and PRE only once (but it uses a cheaper walking mode that ends up giving up earlier) a reasonable result would be FRE at around 2 times the PRE time. Note that we do run into the above limit all the time with this testcase which has loads of stores and loads (non-aliased, of course). And we do seem to optimize stuff as upping the limit results in faster overall compile time. I think we need to more intelligently limit the walking and/or design some caching of queries that works better or simply make the queries cheaper. Most of the query cost comes from decomposing our reference trees and repeatedly computing the overall access offset/size (get_ref_base_and_extent). So if we could cache that we might win. The number of loads/stores are not that bad in the testcase, we have 1700 stores after fre1 and 547 loads, 1142 stores after fre2 and 547 loads. Before fre1 we had 1714 stores and 15334 loads. So its clearly fre1 that will run into this compile-time hog (it does max. #loads * #stores alias queries). OTOH we _do_ want to optimize those 14000 redundant loads! It simply looks like the defs are far away (happens for example if you have an initializer at the start of the function we can optimize from). The 547 loads after fre1 is actually with upping the limit to 10000, with the default limit we still have 5548 loads! With lowering to 100 we get 11616 loads. Note that compile-time would be worse if there were nothing to optimize here (then you hit the worst-case of #loads * #stores alias queries).