On Thu, Mar 28, 2019 at 09:55:46AM +0100, Richard Biener wrote:
> On Wed, Mar 27, 2019 at 4:26 PM Jeff Law <l...@redhat.com> wrote:
> >
> > On 3/27/19 8:36 AM, Jakub Jelinek wrote:
> > > On Sun, Mar 24, 2019 at 09:20:07AM -0600, Jeff Law wrote:
> > >> However, I'm increasingly of the opinion that MIPS targets need to drop
> > >> off the priority platform list.  Given the trajectory I see for MIPS
> > >> based processors in industry, it's really hard to justify spending this
> > >> much time on them, particularly for low priority code quality issues.
> > >
> > > Besides what has been discussed on IRC for the PR89826 fix, that we really
> > > need a df_analyze before processing the first block, because otherwise we
> > > can't rely on the REG_UNUSED notes in the IL, I see some other issues, but I
> > > admit I don't know much about df nor regcprop.
> > RIght.  I plan to commit that today along with the test reordering you
> > pointed out.
> >
> > >
> > > 1) the df_analyze () after every (successful) processing of a basic block
> > > is IMHO way too expensive, I would be very surprised if df_analyze () isn't
> > > quadratic in number of basic blocks and so one could construct testcases
> > > with millions of basic blocks and at least one regcprop change in each bb
> > > and get at cubic complexity (correct me if I'm wrong, and I'm aware of the
> > > 95% bbs you said won't have any changes at all)
> > I'm going to look this further today.
>
> Look at
> https://gcc.opensuse.org/gcc-old/c++bench-czerny/random/random-performance-latest
> and you'll see multiple testcases with 'hard reg cprop' >10% compile-time.
> It's indeed a hog for no good reason.

I've tried https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071#c1 in
--enable-checking=yes,rtl,extra bootstrapped cc1 at -O2, without and with
the patch.  The important times in -ftime-report with vanilla trunk:

 phase opt and generate   : 250.76 (100%)   2.00 ( 96%) 253.36 (100%)  768860 kB ( 99%)
 df live regs             :  19.95 (  8%)   0.03 (  1%)  19.39 (  8%)       0 kB (  0%)
 df live&initialized regs :  20.29 (  8%)   0.05 (  2%)  19.73 (  8%)       0 kB (  0%)
 df reg dead/unused notes : 158.66 ( 63%)   0.02 (  1%) 160.12 ( 63%)    4665 kB (  1%)
 hard reg cprop           :  21.03 (  8%)   0.01 (  0%)  21.39 (  8%)     509 kB (  0%)
 TOTAL                    : 250.85          2.09        253.57         776940 kB
(ignoring everything <2% in the first % column).
Configure with --enable-checking=release to disable checks.

With the https://gcc.gnu.org/ml/gcc-patches/2019-03/msg01335.html patch
the same testcase with -O2 -ftime-report results in identical assembly, but:

 phase opt and generate   :  28.92 (100%)   1.82 ( 95%)  30.85 ( 99%)  768882 kB ( 99%)
 CFG verifier             :   1.66 (  6%)   0.02 (  1%)   1.69 (  5%)       0 kB (  0%)
 df live regs             :   0.63 (  2%)   0.00 (  0%)   0.61 (  2%)       0 kB (  0%)
 df live&initialized regs :   1.01 (  3%)   0.03 (  2%)   1.00 (  3%)       0 kB (  0%)
 df must-initialized regs :   1.51 (  5%)   0.93 ( 48%)   2.46 (  8%)       0 kB (  0%)
 tree SSA verifier        :   2.79 ( 10%)   0.01 (  1%)   2.78 (  9%)       0 kB (  0%)
 tree STMT verifier       :   2.00 (  7%)   0.00 (  0%)   1.99 (  6%)       0 kB (  0%)
 dominance computation    :   0.61 (  2%)   0.00 (  0%)   0.59 (  2%)       0 kB (  0%)
 out of ssa               :   0.61 (  2%)   0.04 (  2%)   0.65 (  2%)       1 kB (  0%)
 loop init                :   0.58 (  2%)   0.00 (  0%)   0.63 (  2%)      38 kB (  0%)
 combiner                 :   0.44 (  2%)   0.02 (  1%)   0.47 (  2%)   17926 kB (  2%)
 integrated RA            :   2.24 (  8%)   0.08 (  4%)   2.35 (  8%)  205177 kB ( 26%)
 LRA non-specific         :   1.46 (  5%)   0.05 (  3%)   1.50 (  5%)   19172 kB (  2%)
 LRA create live ranges   :   1.23 (  4%)   0.00 (  0%)   1.23 (  4%)    2589 kB (  0%)
 reload CSE regs          :   0.54 (  2%)   0.00 (  0%)   0.51 (  2%)    8456 kB (  1%)
 scheduling 2             :   0.73 (  3%)   0.09 (  5%)   0.81 (  3%)    2715 kB (  0%)
 verify RTL sharing       :   1.19 (  4%)   0.00 (  0%)   1.15 (  4%)       0 kB (  0%)
 TOTAL                    :  29.02          1.92         31.07         776962 kB

So 8.5x usr time speedup with that patch.

	Jakub
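
P.S. For anyone who wants to recheck the ratio, here is a rough sketch in
Python.  Only the TOTAL numbers come from the two -ftime-report runs above;
the usr/sys/wall column labels and the names (vanilla, patched, speedup) are
assumptions made for illustration.

```python
# TOTAL rows of the two -ftime-report runs, in seconds.
# Column labels (usr, sys, wall) are assumed from the usual report layout.
vanilla = {"usr": 250.85, "sys": 2.09, "wall": 253.57}
patched = {"usr": 29.02, "sys": 1.92, "wall": 31.07}


def speedup(before, after):
    """Return the before/after ratio for each timing column."""
    return {column: before[column] / after[column] for column in before}


ratios = speedup(vanilla, patched)
for column, ratio in ratios.items():
    print(f"{column}: {ratio:.2f}x")
```

The usr ratio comes out slightly above the rounded 8.5x figure, and the wall
ratio a bit below it, since the sys time barely changes between the runs.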