------- Additional Comments From rguenth at gcc dot gnu dot org 2005-09-18 18:16 -------

-ftime-report for the 4.1 + flatten compile:

Execution times (seconds)
 garbage collection    :   6.32 ( 4%) usr   0.07 ( 1%) sys   6.73 ( 4%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.42 ( 0%) usr   0.03 ( 0%) sys   0.42 ( 0%) wall    5274 kB ( 0%) ggc
 callgraph optimization:   0.12 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall    1605 kB ( 0%) ggc
 ipa reference         :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall     440 kB ( 0%) ggc
 ipa pure const        :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 ipa type escape       :   4.88 ( 3%) usr   0.00 ( 0%) sys   4.88 ( 3%) wall       0 kB ( 0%) ggc
 cfg construction      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall    2530 kB ( 0%) ggc
 cfg cleanup           :   1.29 ( 1%) usr   0.01 ( 0%) sys   1.16 ( 1%) wall    2256 kB ( 0%) ggc
 trivially dead code   :   0.70 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall       0 kB ( 0%) ggc
 life analysis         :   3.52 ( 2%) usr   0.00 ( 0%) sys   3.64 ( 2%) wall    2601 kB ( 0%) ggc
 life info update      :   0.39 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall     590 kB ( 0%) ggc
 alias analysis        :   1.41 ( 1%) usr   0.00 ( 0%) sys   1.13 ( 1%) wall   12731 kB ( 1%) ggc
 register scan         :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.87 ( 0%) wall     526 kB ( 0%) ggc
 rebuild jump labels   :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.75 ( 0%) usr   0.37 ( 5%) sys   1.15 ( 1%) wall     686 kB ( 0%) ggc
 parser                :   4.32 ( 2%) usr   0.98 (13%) sys   5.24 ( 3%) wall  229494 kB (11%) ggc
 name lookup           :   2.08 ( 1%) usr   0.88 (12%) sys   2.88 ( 2%) wall   46108 kB ( 2%) ggc
 inline heuristics     :   1.02 ( 1%) usr   0.04 ( 1%) sys   1.06 ( 1%) wall   36310 kB ( 2%) ggc
 integration           :  12.02 ( 7%) usr   0.02 ( 0%) sys  11.77 ( 6%) wall  693907 kB (34%) ggc
 tree gimplify         :   0.65 ( 0%) usr   0.03 ( 0%) sys   0.83 ( 0%) wall   11198 kB ( 1%) ggc
 tree eh               :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     107 kB ( 0%) ggc
 tree CFG construction :   0.04 ( 0%) usr   0.02 ( 0%) sys   0.10 ( 0%) wall   14527 kB ( 1%) ggc
 tree CFG cleanup      :   3.77 ( 2%) usr   0.11 ( 2%) sys   3.98 ( 2%) wall   16679 kB ( 1%) ggc
 tree VRP              :   3.57 ( 2%) usr   0.13 ( 2%) sys   3.69 ( 2%) wall   22691 kB ( 1%) ggc
 tree copy propagation :   3.09 ( 2%) usr   0.04 ( 1%) sys   3.09 ( 2%) wall    3066 kB ( 0%) ggc
 tree store copy prop  :   0.59 ( 0%) usr   0.03 ( 0%) sys   0.42 ( 0%) wall     652 kB ( 0%) ggc
 tree find ref. vars   :   1.36 ( 1%) usr   0.05 ( 1%) sys   1.48 ( 1%) wall   86797 kB ( 4%) ggc
 tree PTA              :  12.39 ( 7%) usr   0.06 ( 1%) sys  12.36 ( 7%) wall   32031 kB ( 2%) ggc
 tree alias analysis   :   9.35 ( 5%) usr   0.84 (11%) sys  10.62 ( 6%) wall   68682 kB ( 3%) ggc
 tree PHI insertion    :   1.40 ( 1%) usr   0.01 ( 0%) sys   1.49 ( 1%) wall   21821 kB ( 1%) ggc
 tree SSA rewrite      :   4.88 ( 3%) usr   0.05 ( 1%) sys   4.67 ( 2%) wall  108845 kB ( 5%) ggc
 tree SSA other        :   1.19 ( 1%) usr   0.47 ( 6%) sys   1.72 ( 1%) wall    1481 kB ( 0%) ggc
 tree SSA incremental  :  12.44 ( 7%) usr   0.23 ( 3%) sys  12.44 ( 7%) wall   30571 kB ( 1%) ggc
 tree operand scan     :   9.20 ( 5%) usr   2.05 (28%) sys  11.56 ( 6%) wall   68307 kB ( 3%) ggc
 dominator optimization:   9.49 ( 5%) usr   0.10 ( 1%) sys   9.60 ( 5%) wall   78640 kB ( 4%) ggc
 tree SRA              :   0.50 ( 0%) usr   0.02 ( 0%) sys   0.57 ( 0%) wall   11723 kB ( 1%) ggc
 tree STORE-CCP        :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall     447 kB ( 0%) ggc
 tree CCP              :   1.38 ( 1%) usr   0.01 ( 0%) sys   1.30 ( 1%) wall    2024 kB ( 0%) ggc
 tree split crit edges :   0.16 ( 0%) usr   0.01 ( 0%) sys   0.22 ( 0%) wall   18294 kB ( 1%) ggc
 tree reassociation    :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       4 kB ( 0%) ggc
 tree PRE              :   2.88 ( 2%) usr   0.03 ( 0%) sys   3.01 ( 2%) wall   27185 kB ( 1%) ggc
 tree FRE              :   4.40 ( 2%) usr   0.06 ( 1%) sys   4.43 ( 2%) wall   41584 kB ( 2%) ggc
 tree code sinking     :   0.36 ( 0%) usr   0.01 ( 0%) sys   0.49 ( 0%) wall      79 kB ( 0%) ggc
 tree linearize phis   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      10 kB ( 0%) ggc
 tree forward propagate:   1.00 ( 1%) usr   0.22 ( 3%) sys   1.19 ( 1%) wall   49760 kB ( 2%) ggc
 tree conservative DCE :   2.32 ( 1%) usr   0.00 ( 0%) sys   2.21 ( 1%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.48 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.46 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall     760 kB ( 0%) ggc
 PHI merge             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     747 kB ( 0%) ggc
 tree loop bounds      :   0.71 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%) wall    5718 kB ( 0%) ggc
 loop invariant motion :   0.53 ( 0%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall     185 kB ( 0%) ggc
 tree canonical iv     :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall    4380 kB ( 0%) ggc
 scev constant prop    :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall    4656 kB ( 0%) ggc
 complete unrolling    :   1.72 ( 1%) usr   0.07 ( 1%) sys   1.63 ( 1%) wall   32424 kB ( 2%) ggc
 tree iv optimization  :   1.43 ( 1%) usr   0.01 ( 0%) sys   1.33 ( 1%) wall   29199 kB ( 1%) ggc
 tree loop init        :   0.59 ( 0%) usr   0.02 ( 0%) sys   0.45 ( 0%) wall      12 kB ( 0%) ggc
 tree loop fini        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree copy headers     :   0.41 ( 0%) usr   0.01 ( 0%) sys   0.63 ( 0%) wall   19708 kB ( 1%) ggc
 tree SSA uncprop      :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA to normal    :   1.42 ( 1%) usr   0.00 ( 0%) sys   1.46 ( 1%) wall   15217 kB ( 1%) ggc
 tree rename SSA copies:   0.56 ( 0%) usr   0.00 ( 0%) sys   0.64 ( 0%) wall       1 kB ( 0%) ggc
 dominance frontiers   :   0.61 ( 0%) usr   0.00 ( 0%) sys   0.77 ( 0%) wall       0 kB ( 0%) ggc
 control dependences   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 expand                :   7.56 ( 4%) usr   0.10 ( 1%) sys   7.63 ( 4%) wall   85819 kB ( 4%) ggc
 varconst              :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall     615 kB ( 0%) ggc
 jump                  :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall     189 kB ( 0%) ggc
 CSE                   :   6.53 ( 4%) usr   0.01 ( 0%) sys   6.65 ( 4%) wall    6438 kB ( 0%) ggc
 loop analysis         :   1.15 ( 1%) usr   0.00 ( 0%) sys   1.06 ( 1%) wall    6947 kB ( 0%) ggc
 global CSE            :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 CPROP 1               :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.51 ( 0%) wall    4188 kB ( 0%) ggc
 PRE                   :   1.57 ( 1%) usr   0.00 ( 0%) sys   1.56 ( 1%) wall    2273 kB ( 0%) ggc
 CPROP 2               :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall    1548 kB ( 0%) ggc
 bypass jumps          :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.59 ( 0%) wall    1401 kB ( 0%) ggc
 web                   :   0.43 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall     222 kB ( 0%) ggc
 CSE 2                 :   4.20 ( 2%) usr   0.00 ( 0%) sys   4.40 ( 2%) wall    3341 kB ( 0%) ggc
 branch prediction     :   0.93 ( 1%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall    4372 kB ( 0%) ggc
 flow analysis         :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       7 kB ( 0%) ggc
 combiner              :   3.06 ( 2%) usr   0.01 ( 0%) sys   2.96 ( 2%) wall   10796 kB ( 1%) ggc
 if-conversion         :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall     405 kB ( 0%) ggc
 regmove               :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.62 ( 0%) wall     146 kB ( 0%) ggc
 local alloc           :   1.81 ( 1%) usr   0.00 ( 0%) sys   1.97 ( 1%) wall    3329 kB ( 0%) ggc
 global alloc          :   4.71 ( 3%) usr   0.00 ( 0%) sys   4.84 ( 3%) wall   26430 kB ( 1%) ggc
 reload CSE regs       :   2.71 ( 2%) usr   0.00 ( 0%) sys   2.73 ( 1%) wall   10832 kB ( 1%) ggc
 flow 2                :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall    2225 kB ( 0%) ggc
 if-conversion 2       :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall      11 kB ( 0%) ggc
 peephole 2            :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall     356 kB ( 0%) ggc
 rename registers      :   1.43 ( 1%) usr   0.00 ( 0%) sys   1.60 ( 1%) wall    2031 kB ( 0%) ggc
 machine dep reorg     :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall      75 kB ( 0%) ggc
 reorder blocks        :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall    2611 kB ( 0%) ggc
 shorten branches      :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 reg stack             :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall    1705 kB ( 0%) ggc
 final                 :   1.11 ( 1%) usr   0.03 ( 0%) sys   1.12 ( 1%) wall    4552 kB ( 0%) ggc
 TOTAL                 : 179.59             7.31           187.62            2049140 kB

and for 4.0 + leafify patch:

 garbage collection        :   5.01 ( 4%) usr   0.06 ( 1%) sys   5.91 ( 4%) wall
 callgraph construction    :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.34 ( 0%) wall
 callgraph optimization    :   0.67 ( 0%) usr   0.07 ( 1%) sys   0.85 ( 1%) wall
 cfg construction          :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall
 cfg cleanup               :   0.94 ( 1%) usr   0.00 ( 0%) sys   1.00 ( 1%) wall
 trivially dead code       :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall
 life analysis             :   2.53 ( 2%) usr   0.03 ( 1%) sys   3.31 ( 2%) wall
 life info update          :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall
 alias analysis            :   0.90 ( 1%) usr   0.00 ( 0%) sys   0.95 ( 1%) wall
 register scan             :   0.74 ( 1%) usr   0.02 ( 0%) sys   0.72 ( 0%) wall
 rebuild jump labels       :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall
 preprocessing             :   0.29 ( 0%) usr   0.18 ( 4%) sys   0.51 ( 0%) wall
 parser                    :   4.71 ( 3%) usr   0.65 (13%) sys   6.07 ( 4%) wall
 name lookup               :   2.39 ( 2%) usr   0.71 (14%) sys   3.84 ( 2%) wall
 integration               :  45.27 (33%) usr   0.30 ( 6%) sys  53.22 (32%) wall
 tree gimplify             :   0.65 ( 0%) usr   0.01 ( 0%) sys   0.84 ( 1%) wall
 tree eh                   :   0.36 ( 0%) usr   0.01 ( 0%) sys   0.43 ( 0%) wall
 tree CFG construction     :   0.77 ( 1%) usr   0.01 ( 0%) sys   0.98 ( 1%) wall
 tree CFG cleanup          :   1.34 ( 1%) usr   0.01 ( 0%) sys   1.34 ( 1%) wall
 tree find referenced vars :   0.99 ( 1%) usr   0.02 ( 0%) sys   1.12 ( 1%) wall
 tree PTA                  :   1.53 ( 1%) usr   0.01 ( 0%) sys   2.07 ( 1%) wall
 tree alias analysis       :   5.58 ( 4%) usr   0.11 ( 2%) sys   6.57 ( 4%) wall
 tree PHI insertion        :   2.37 ( 2%) usr   0.02 ( 0%) sys   2.91 ( 2%) wall
 tree SSA rewrite          :   2.73 ( 2%) usr   0.03 ( 1%) sys   3.11 ( 2%) wall
 tree SSA other            :   4.80 ( 3%) usr   0.91 (18%) sys   6.47 ( 4%) wall
 tree operand scan         :   3.15 ( 2%) usr   0.93 (19%) sys   4.87 ( 3%) wall
 dominator optimization    :   7.91 ( 6%) usr   0.24 ( 5%) sys   9.26 ( 6%) wall
 tree SRA                  :   0.40 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 tree CCP                  :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.64 ( 0%) wall
 tree split crit edges     :   0.14 ( 0%) usr   0.01 ( 0%) sys   0.18 ( 0%) wall
 tree PRE                  :   1.95 ( 1%) usr   0.05 ( 1%) sys   2.23 ( 1%) wall
 tree remove redundant PHIs:   1.37 ( 1%) usr   0.03 ( 1%) sys   1.70 ( 1%) wall
 tree linearize phis       :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall
 tree forward propagate    :   0.54 ( 0%) usr   0.00 ( 0%) sys   0.67 ( 0%) wall
 tree conservative DCE     :   1.26 ( 1%) usr   0.00 ( 0%) sys   1.56 ( 1%) wall
 tree aggressive DCE       :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall
 tree DSE                  :   0.61 ( 0%) usr   0.00 ( 0%) sys   0.67 ( 0%) wall
 PHI merge                 :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 tree loop optimization    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 tree record loop bounds   :   0.30 ( 0%) usr   0.01 ( 0%) sys   0.37 ( 0%) wall
 loop invariant motion     :   0.79 ( 1%) usr   0.00 ( 0%) sys   0.86 ( 1%) wall
 tree canonical iv creation:   0.33 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall
 complete unrolling        :   0.78 ( 1%) usr   0.05 ( 1%) sys   1.19 ( 1%) wall
 tree iv optimization      :   1.66 ( 1%) usr   0.05 ( 1%) sys   1.98 ( 1%) wall
 tree loop init            :   0.47 ( 0%) usr   0.01 ( 0%) sys   0.68 ( 0%) wall
 tree loop fini            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 tree copy headers         :   0.74 ( 1%) usr   0.03 ( 1%) sys   0.96 ( 1%) wall
 tree SSA to normal        :   1.31 ( 1%) usr   0.02 ( 0%) sys   1.72 ( 1%) wall
 tree rename SSA copies    :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%) wall
 dominance frontiers       :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 control dependences       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 expand                    :   5.74 ( 4%) usr   0.08 ( 2%) sys   6.98 ( 4%) wall
 varconst                  :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall
 jump                      :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 CSE                       :   3.80 ( 3%) usr   0.02 ( 0%) sys   4.16 ( 2%) wall
 loop analysis             :   0.64 ( 0%) usr   0.03 ( 1%) sys   0.83 ( 0%) wall
 global CSE                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall
 CPROP 1                   :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall
 PRE                       :   1.01 ( 1%) usr   0.00 ( 0%) sys   1.15 ( 1%) wall
 CPROP 2                   :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 bypass jumps              :   0.38 ( 0%) usr   0.00 ( 0%) sys   0.57 ( 0%) wall
 CSE 2                     :   1.86 ( 1%) usr   0.01 ( 0%) sys   2.13 ( 1%) wall
 branch prediction         :   0.93 ( 1%) usr   0.02 ( 0%) sys   1.01 ( 1%) wall
 flow analysis             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 combiner                  :   1.94 ( 1%) usr   0.00 ( 0%) sys   2.21 ( 1%) wall
 if-conversion             :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall
 regmove                   :   0.57 ( 0%) usr   0.02 ( 0%) sys   0.54 ( 0%) wall
 local alloc               :   1.29 ( 1%) usr   0.02 ( 0%) sys   1.54 ( 1%) wall
 global alloc              :   3.15 ( 2%) usr   0.05 ( 1%) sys   3.76 ( 2%) wall
 reload CSE regs           :   1.68 ( 1%) usr   0.01 ( 0%) sys   2.03 ( 1%) wall
 flow 2                    :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall
 if-conversion 2           :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall
 peephole 2                :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall
 rename registers          :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 machine dep reorg         :   0.46 ( 0%) usr   0.01 ( 0%) sys   0.47 ( 0%) wall
 reorder blocks            :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall
 shorten branches          :   0.40 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall
 reg stack                 :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 final                     :   0.59 ( 0%) usr   0.04 ( 1%) sys   0.73 ( 0%) wall
 symout                    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 rest of compilation       :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.55 ( 0%) wall
 TOTAL                     : 138.55             4.95           167.96

which I think is a fair comparison because of equal runtime performance and
possibly similar inlining (non-leafified parts may be still differently inlined).

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23955