https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91817

            Bug ID: 91817
           Summary: compile with -O3 is more-than-expectedly slower than
                    -O2
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hehaochen at hotmail dot com
  Target Milestone: ---

Created attachment 46898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46898&action=edit
this test case is from gcc-45364

### This is normal in GCC 5.3.1 (Docker version): ###

root@ubuntu:/home/jxl# time gcc -m32 -o1 -g -c 1.i -o test.o
real 0m1.017s
user 0m0.876s
sys 0m0.072s

root@ubuntu:/home/jxl# time gcc -m32 -o2 -g -c 1.i -o test.o
real 0m1.250s
user 0m0.864s
sys 0m0.064s

root@ubuntu:/home/jxl# time gcc -m32 -o3 -g -c 1.i -o test.o
real 0m4.446s
user 0m0.700s
sys 0m0.476s


### However, it is extremely slow in GCC 4.6 (Docker version): ###

root@4beb8027e1fb:/# time gcc -m32 -o1 -g -c 1.i -o test.o
real 0m3.066s
user 0m0.656s
sys 0m0.772s

root@4beb8027e1fb:/# time gcc -m32 -o2 -g -c 1.i -o test.o
real 0m1.112s
user 0m0.796s
sys 0m0.156s

root@4beb8027e1fb:/# time gcc -m32 -O3 -g -c 1.i -o test.o
real 2m55.363s
user 2m41.224s
sys 0m1.908s

###### The root cause is located in "var-tracking dataflow" ######

root@4beb8027e1fb:/# gcc -m32 -O3 -ftime-report -g -c 1.i -o test.o

Execution times (seconds)
 callgraph construction:   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall   1432 kB ( 1%) ggc
 callgraph optimization:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    137 kB ( 0%) ggc
 ipa function splitting:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    528 kB ( 1%) ggc
 ipa pure const        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     25 kB ( 0%) ggc
 ipa SRA               :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall    405 kB ( 0%) ggc
 cfg cleanup           :   0.08 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall    549 kB ( 1%) ggc
 trivially dead code   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      0 kB ( 0%) ggc
 df scan insns         :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall      6 kB ( 0%) ggc
 df multiple defs      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      0 kB ( 0%) ggc
 df reaching defs      :   0.12 ( 0%) usr   0.02 ( 1%) sys   0.12 ( 0%) wall      0 kB ( 0%) ggc
 df live regs          :   0.35 ( 0%) usr   0.01 ( 0%) sys   0.35 ( 0%) wall      0 kB ( 0%) ggc
 df live&initialized regs:   0.20 ( 0%) usr   0.01 ( 0%) sys   0.25 ( 0%) wall      0 kB ( 0%) ggc
 df use-def / def-use chains:   0.04 ( 0%) usr   0.02 ( 1%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 df live reg subwords  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      0 kB ( 0%) ggc
 df reg dead/unused notes:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    585 kB ( 1%) ggc
 register information  :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      0 kB ( 0%) ggc
 alias analysis        :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall   2860 kB ( 3%) ggc
 alias stmt walking    :   0.11 ( 0%) usr   0.06 ( 2%) sys   0.17 ( 0%) wall      8 kB ( 0%) ggc
 register scan         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      9 kB ( 0%) ggc
 rebuild jump labels   :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall      0 kB ( 0%) ggc
 preprocessing         :   0.07 ( 0%) usr   0.35 (14%) sys   0.45 ( 0%) wall   3387 kB ( 3%) ggc
 lexical analysis      :   0.19 ( 0%) usr   0.57 (23%) sys   0.74 ( 0%) wall      0 kB ( 0%) ggc
 parser                :   0.13 ( 0%) usr   0.47 (19%) sys   0.72 ( 0%) wall  18502 kB (18%) ggc
 inline heuristics     :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall    133 kB ( 0%) ggc
 integration           :   0.07 ( 0%) usr   0.05 ( 2%) sys   0.14 ( 0%) wall   8467 kB ( 8%) ggc
 tree gimplify         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   4760 kB ( 5%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   1460 kB ( 1%) ggc
 tree CFG cleanup      :   0.04 ( 0%) usr   0.02 ( 1%) sys   0.11 ( 0%) wall    318 kB ( 0%) ggc
 tree VRP              :   0.10 ( 0%) usr   0.04 ( 2%) sys   0.14 ( 0%) wall   2898 kB ( 3%) ggc
 tree copy propagation :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall    197 kB ( 0%) ggc
 tree PTA              :   0.07 ( 0%) usr   0.02 ( 1%) sys   0.06 ( 0%) wall    582 kB ( 1%) ggc
 tree SSA rewrite      :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall   1884 kB ( 2%) ggc
 tree SSA other        :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall     17 kB ( 0%) ggc
 tree SSA incremental  :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.08 ( 0%) wall    706 kB ( 1%) ggc
 tree operand scan     :   0.05 ( 0%) usr   0.09 ( 4%) sys   0.13 ( 0%) wall   4319 kB ( 4%) ggc
 dominator optimization:   0.05 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall   1525 kB ( 1%) ggc
 tree CCP              :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall    321 kB ( 0%) ggc
 tree split crit edges :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    589 kB ( 1%) ggc
 tree reassociation    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    194 kB ( 0%) ggc
 tree PRE              :   0.06 ( 0%) usr   0.05 ( 2%) sys   0.10 ( 0%) wall    561 kB ( 1%) ggc
 tree FRE              :   0.08 ( 0%) usr   0.03 ( 1%) sys   0.09 ( 0%) wall    125 kB ( 0%) ggc
 tree code sinking     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     98 kB ( 0%) ggc
 tree forward propagate:   0.03 ( 0%) usr   0.02 ( 1%) sys   0.03 ( 0%) wall    293 kB ( 0%) ggc
 tree conservative DCE :   0.02 ( 0%) usr   0.04 ( 2%) sys   0.04 ( 0%) wall     34 kB ( 0%) ggc
 tree aggressive DCE   :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall    738 kB ( 1%) ggc
 tree DSE              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    118 kB ( 0%) ggc
 PHI merge             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    241 kB ( 0%) ggc
 tree loop invariant motion:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 scev constant prop    :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall    206 kB ( 0%) ggc
 complete unrolling    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    779 kB ( 1%) ggc
 tree slp vectorization:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   1833 kB ( 2%) ggc
 tree loop distribution:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      3 kB ( 0%) ggc
 tree iv optimization  :   0.04 ( 0%) usr   0.02 ( 1%) sys   0.04 ( 0%) wall   2413 kB ( 2%) ggc
 tree copy headers     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    227 kB ( 0%) ggc
 tree SSA uncprop      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      0 kB ( 0%) ggc
 dominance computation :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      0 kB ( 0%) ggc
 out of ssa            :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     12 kB ( 0%) ggc
 expand vars           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    957 kB ( 1%) ggc
 expand                :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall   8481 kB ( 8%) ggc
 varconst              :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall      1 kB ( 0%) ggc
 forward prop          :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    539 kB ( 1%) ggc
 CSE                   :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall     87 kB ( 0%) ggc
 dead code elimination :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      0 kB ( 0%) ggc
 dead store elim1      :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.07 ( 0%) wall    554 kB ( 1%) ggc
 dead store elim2      :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall    600 kB ( 1%) ggc
 loop invariant motion :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall      0 kB ( 0%) ggc
 loop unswitching      :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall      0 kB ( 0%) ggc
 CPROP                 :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   1555 kB ( 1%) ggc
 PRE                   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall    145 kB ( 0%) ggc
 CSE 2                 :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     40 kB ( 0%) ggc
 branch prediction     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    242 kB ( 0%) ggc
 combiner              :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall   1475 kB ( 1%) ggc
 if-conversion         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    385 kB ( 0%) ggc
 regmove               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      0 kB ( 0%) ggc
 integrated RA         :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall   1278 kB ( 1%) ggc
 reload                :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall    379 kB ( 0%) ggc
 reload CSE regs       :   0.12 ( 0%) usr   0.01 ( 0%) sys   0.15 ( 0%) wall   1568 kB ( 1%) ggc
 load CSE after reload :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      6 kB ( 0%) ggc
 thread pro- & epilogue:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    128 kB ( 0%) ggc
 peephole 2            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    236 kB ( 0%) ggc
 hard reg cprop        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      3 kB ( 0%) ggc
 scheduling 2          :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     43 kB ( 0%) ggc
 machine dep reorg     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     46 kB ( 0%) ggc
 reorder blocks        :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   1016 kB ( 1%) ggc
 reg stack             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      6 kB ( 0%) ggc
 final                 :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall   1040 kB ( 1%) ggc
 symout                :   0.06 ( 0%) usr   0.04 ( 2%) sys   0.10 ( 0%) wall  12092 kB (12%) ggc
 variable tracking     :   0.32 ( 0%) usr   0.01 ( 0%) sys   0.32 ( 0%) wall   2201 kB ( 2%) ggc
 var-tracking dataflow : 157.01 (96%) usr   0.32 (13%) sys 158.01 (95%) wall      0 kB ( 0%) ggc
 var-tracking emit     :   1.39 ( 1%) usr   0.01 ( 0%) sys   1.40 ( 1%) wall    726 kB ( 1%) ggc
 rest of compilation   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall    565 kB ( 1%) ggc
 remove unused locals  :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      0 kB ( 0%) ggc
 address taken         :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall      0 kB ( 0%) ggc
 unaccounted todo      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      0 kB ( 0%) ggc
 rebuild frequencies   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     94 kB ( 0%) ggc
 repair loop structures:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     72 kB ( 0%) ggc
 TOTAL                 : 163.80             2.46           167.08             104717 kB
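
With nearly a hundred rows in the report, the dominant pass is easy to miss. The following is a minimal sketch, not part of the original report, of ranking passes by user time from saved `-ftime-report` text. The helper name `slowest_passes` and the regular expression are illustrative assumptions based only on the row layout shown above; other GCC versions may print the report differently.

```python
import re

# Matches rows such as
#   " var-tracking dataflow : 157.01 (96%) usr ..."
# The TOTAL row carries no "(NN%) usr" field, so it is skipped.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
)

def slowest_passes(report, top=3):
    """Return up to `top` (pass name, usr seconds) pairs, slowest first."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((float(m.group("usr")), m.group("name")))
    rows.sort(reverse=True)
    return [(name, usr) for usr, name in rows[:top]]

# A few rows copied from the report above.
SAMPLE = """\
 parser                :   0.13 ( 0%) usr   0.47 (19%) sys   0.72 ( 0%) wall  18502 kB (18%) ggc
 variable tracking     :   0.32 ( 0%) usr   0.01 ( 0%) sys   0.32 ( 0%) wall   2201 kB ( 2%) ggc
 var-tracking dataflow : 157.01 (96%) usr   0.32 (13%) sys 158.01 (95%) wall      0 kB ( 0%) ggc
 var-tracking emit     :   1.39 ( 1%) usr   0.01 ( 0%) sys   1.40 ( 1%) wall    726 kB ( 1%) ggc
 TOTAL                 : 163.80             2.46           167.08             104717 kB
"""

for name, usr in slowest_passes(SAMPLE):
    print(f"{name}: {usr:.2f}s usr")
```

On the sample rows this lists "var-tracking dataflow" first at 157.01 s, followed by "var-tracking emit" and "variable tracking".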
