https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60243

--- Comment #26 from Martin Jambor <jamborm at gcc dot gnu.org> ---
With the new IPA-SRA, the situation has improved quite a bit. In the
timings below, old-ipa-sra is trunk r275981 and new-ipa-sra is trunk
r275982 (the revision that introduced the new IPA-SRA):

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/old-ipa-sra/inst/bin/gcc -O0 -fno-inline -S pr60243.c
real=64.20 user=63.37

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/old-ipa-sra/inst/bin/gcc -O1 -fno-inline -S pr60243.c
real=90.80 user=89.84

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/old-ipa-sra/inst/bin/gcc -O2 -S pr60243.c
real=235.18 user=233.77

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/old-ipa-sra/inst/bin/gcc -O2 -fno-inline -S pr60243.c
real=198.59 user=197.27

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/new-ipa-sra/inst/bin/gcc -O2 -S pr60243.c
real=114.68 user=113.76

$ /usr/bin/time -f 'real=%e user=%U' taskset -c 0 ~/gcc/new-ipa-sra/inst/bin/gcc -O2 -fno-inline -S pr60243.c
real=88.40 user=87.41


$ taskset -c 0 ~/gcc/new-ipa-sra/inst/bin/gcc -O2 -S pr60243.c -ftime-report
(showing only IPA passes and passes taking more than 1% of usr time;
the columns are usr, sys and wall time, followed by GGC memory)
 phase parsing                      :   9.57 (  8%)   6.93 ( 75%)  16.51 ( 13%)  655448 kB ( 20%)
 phase opt and generate             : 105.13 ( 92%)   2.34 ( 25%) 107.83 ( 87%) 2619926 kB ( 80%)
 callgraph functions expansion      :  18.05 ( 16%)   1.34 ( 14%)  19.71 ( 16%)  302442 kB (  9%)
 callgraph ipa passes               :  77.51 ( 68%)   0.50 (  5%)  78.06 ( 63%)  623696 kB ( 19%)
 ipa function summary               :   0.15 (  0%)   0.01 (  0%)   0.16 (  0%)    1494 kB (  0%)
 ipa dead code removal              :   0.32 (  0%)   0.00 (  0%)   0.29 (  0%)       0 kB (  0%)
 ipa cp                             :   1.10 (  1%)   0.05 (  1%)   1.13 (  1%)  326688 kB ( 10%)
 ipa inlining heuristics            :  17.85 ( 16%)   0.06 (  1%)  17.82 ( 14%)   83762 kB (  3%)
 ipa function splitting             :   0.00 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 ipa various optimizations          :   0.63 (  1%)   0.28 (  3%)   0.96 (  1%)  131752 kB (  4%)
 ipa reference                      :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)       0 kB (  0%)
 ipa profile                        :  14.66 ( 13%)   0.00 (  0%)  14.67 ( 12%)       0 kB (  0%)
 ipa pure const                     :   0.36 (  0%)   0.04 (  0%)   0.60 (  0%)       0 kB (  0%)
 ipa icf                            :   0.17 (  0%)   0.01 (  0%)   0.19 (  0%)       0 kB (  0%)
 ipa SRA                            :   0.21 (  0%)   0.00 (  0%)   0.23 (  0%)     102 kB (  0%)
 ipa free inline summary            :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)       0 kB (  0%)
 preprocessing                      :   4.20 (  4%)   3.31 ( 36%)   7.77 (  6%)  384133 kB ( 12%)
 lexical analysis                   :   2.46 (  2%)   1.80 ( 19%)   3.95 (  3%)       0 kB (  0%)
 parser function body               :   2.71 (  2%)   1.82 ( 20%)   4.57 (  4%)  269874 kB (  8%)
 early inlining heuristics          :  12.82 ( 11%)   0.03 (  0%)  12.71 ( 10%)    4031 kB (  0%)
 inline parameters                  :   8.01 (  7%)   0.12 (  1%)   8.27 (  7%)   30845 kB (  1%)
 tree CFG construction              :   5.23 (  5%)   0.04 (  0%)   5.03 (  4%)  628095 kB ( 19%)
 tree SSA rewrite                   :   3.42 (  3%)   0.02 (  0%)   3.39 (  3%)   93305 kB (  3%)
 tree operand scan                  :  17.53 ( 15%)   0.26 (  3%)  17.77 ( 14%)   96568 kB (  3%)
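
For anyone who wants to trim a full -ftime-report dump the same way, here
is a rough Python sketch of the selection used above (keep the "ipa ..."
rows plus anything above 1% of usr time). It is not part of GCC; the
regular expression assumes each row sits on one line in the
"name : usr (pct) sys (pct) wall (pct) N kB (pct)" layout shown in this
comment, and the names ROW, parse_row and keep are made up for this
example. Other GCC versions may print the memory column differently, and
the filter works on the rounded percentages, so it may differ slightly
from a filter on exact values.

```python
import re

# One -ftime-report row: name, then usr/sys/wall seconds with percentages,
# then GGC memory in kB. Layout assumed from the report quoted above.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:"
    r"\s*(?P<usr>[\d.]+)\s*\(\s*(?P<usr_pct>\d+)%\)"
    r"\s*(?P<sys>[\d.]+)\s*\(\s*\d+%\)"
    r"\s*(?P<wall>[\d.]+)\s*\(\s*\d+%\)"
    r"\s*(?P<ggc_kb>\d+)\s*kB\s*\(\s*\d+%\)\s*$"
)


def parse_row(line):
    """Return a dict for a report row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "name": m["name"],
        "usr": float(m["usr"]),
        "usr_pct": int(m["usr_pct"]),
        "sys": float(m["sys"]),
        "wall": float(m["wall"]),
        "ggc_kb": int(m["ggc_kb"]),
    }


def keep(row):
    """Keep IPA passes and anything above 1% of usr time (rounded)."""
    return row["name"].startswith("ipa ") or row["usr_pct"] > 1


# Three rows copied from the report above.
sample = [
    " ipa cp                             :   1.10 (  1%)   0.05 (  1%)   1.13 (  1%)  326688 kB ( 10%)",
    " ipa SRA                            :   0.21 (  0%)   0.00 (  0%)   0.23 (  0%)     102 kB (  0%)",
    " tree operand scan                  :  17.53 ( 15%)   0.26 (  3%)  17.77 ( 14%)   96568 kB (  3%)",
]

rows = [r for r in map(parse_row, sample) if r is not None and keep(r)]
for r in rows:
    print(f"{r['name']:<30} usr={r['usr']:7.2f}s  ggc={r['ggc_kb']} kB")
```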

Essentially, -O2 -fno-inline with the new IPA-SRA (88.40s) is now as fast
as -O1 -fno-inline was with the old one (90.80s).
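
To put numbers on the improvement, the wall-clock ("real") times above
can be turned into ratios. The snippet below is only a quick sketch of
that arithmetic; the dictionary layout and the speedup helper are
invented for illustration, and the values are the single-run timings
quoted in this comment, so they carry whatever noise one run has.

```python
# Wall-clock seconds from the /usr/bin/time runs above,
# keyed by (compiler build, flags).
timings = {
    ("old", "-O1 -fno-inline"): 90.80,
    ("old", "-O2"): 235.18,
    ("old", "-O2 -fno-inline"): 198.59,
    ("new", "-O2"): 114.68,
    ("new", "-O2 -fno-inline"): 88.40,
}


def speedup(flags):
    """Old-IPA-SRA time divided by new-IPA-SRA time for the same flags."""
    return timings[("old", flags)] / timings[("new", flags)]


for flags in ("-O2", "-O2 -fno-inline"):
    print(f"{flags:<16} {speedup(flags):.2f}x faster with new IPA-SRA")

# New -O2 -fno-inline relative to old -O1 -fno-inline.
ratio = timings[("new", "-O2 -fno-inline")] / timings[("old", "-O1 -fno-inline")]
print(f"new -O2 -fno-inline takes {ratio:.2f} of old -O1 -fno-inline")
```

Both -O2 configurations come out at roughly a factor of two, and the
last ratio is just under one, which is the basis for the comparison
with -O1 -fno-inline.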
