On 2014.09.27 at 01:27 +0200, Jan Hubicka wrote:
> > While a plain Firefox -flto build works fine, an LTO/PGO build fails with:
> > 
> > lto1: internal compiler error: in ipa_merge_profiles, at ipa-utils.c:540
> > 0x7d6165 ipa_merge_profiles(cgraph_node*, cgraph_node*)
> >         ../../gcc/gcc/ipa-utils.c:540
> > 0xf10c41 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
> >         ../../gcc/gcc/ipa-icf.c:753
> > 0xf15206 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
> >         ../../gcc/gcc/ipa-icf.c:2706
> > 0xf1c1f4 ipa_icf::sem_item_optimizer::execute()
> >         ../../gcc/gcc/ipa-icf.c:2098
> > 0xf1d3f1 ipa_icf_driver
> >         ../../gcc/gcc/ipa-icf.c:2784
> > 0xf1d3f1 ipa_icf::pass_ipa_icf::execute(function*)
> >         ../../gcc/gcc/ipa-icf.c:2831
> > 
> > 
> > The pass is also very memory hungry (from 3GB without ICF to 4GB during
> > libxul link), while the code size savings are in the 1% range.
> 
> Thanks for checking. I was just thinking about doing that myself. Would
> you mind posting the -ftime-report of the Firefox WPA stage?

(without ICF)
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1412 kB ( 0%) ggc
 phase opt and generate  :  58.38 (63%) usr   2.00 (47%) sys  60.37 (40%) wall  403069 kB (12%) ggc
 phase stream in         :  30.24 (33%) usr   0.97 (23%) sys  33.90 (22%) wall 2944210 kB (88%) ggc
 phase stream out        :   4.29 ( 5%) usr   1.32 (31%) sys  57.32 (38%) wall       0 kB ( 0%) ggc
 phase finalize          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 garbage collection      :   3.68 ( 4%) usr   0.00 ( 0%) sys   3.68 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.50 ( 1%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall     166 kB ( 0%) ggc
 ipa dead code removal   :   6.91 ( 7%) usr   0.08 ( 2%) sys   7.25 ( 5%) wall       0 kB ( 0%) ggc
 ipa virtual call target :   7.08 ( 8%) usr   0.04 ( 1%) sys   6.93 ( 5%) wall       0 kB ( 0%) ggc
 ipa devirtualization    :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall   10365 kB ( 0%) ggc
 ipa cp                  :   1.81 ( 2%) usr   0.06 ( 1%) sys   3.40 ( 2%) wall  173701 kB ( 5%) ggc
 ipa inlining heuristics :  16.60 (18%) usr   0.27 ( 6%) sys  17.48 (12%) wall  532704 kB (16%) ggc
 ipa comdats             :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :   0.21 ( 0%) usr   0.04 ( 1%) sys   0.97 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :  18.29 (20%) usr   0.54 (13%) sys  18.96 (12%) wall 2226088 kB (66%) ggc
 ipa lto decl out        :   3.93 ( 4%) usr   0.13 ( 3%) sys   4.06 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto constructors in :   0.24 ( 0%) usr   0.03 ( 1%) sys   0.59 ( 0%) wall   14226 kB ( 0%) ggc
 ipa lto constructors out:   0.08 ( 0%) usr   0.04 ( 1%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   0.89 ( 1%) usr   0.12 ( 3%) sys   1.02 ( 1%) wall  364151 kB (11%) ggc
 ipa lto decl merge      :   2.14 ( 2%) usr   0.01 ( 0%) sys   2.14 ( 1%) wall    8196 kB ( 0%) ggc
 ipa lto cgraph merge    :   1.59 ( 2%) usr   0.00 ( 0%) sys   1.60 ( 1%) wall   12716 kB ( 0%) ggc
 whopr wpa               :   1.54 ( 2%) usr   0.03 ( 1%) sys   1.55 ( 1%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :   0.04 ( 0%) usr   1.11 (26%) sys  52.10 (34%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   5.02 ( 5%) usr   0.01 ( 0%) sys   5.03 ( 3%) wall    4938 kB ( 0%) ggc
 ipa reference           :   2.04 ( 2%) usr   0.02 ( 0%) sys   2.08 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   2.43 ( 3%) usr   0.02 ( 0%) sys   2.49 ( 2%) wall       0 kB ( 0%) ggc
 tree STMT verifier      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier      :  16.31 (18%) usr   1.69 (39%) sys  17.96 (12%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.01 ( 0%) usr   0.03 ( 1%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.69 ( 1%) usr   0.00 ( 0%) sys   0.69 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :  92.91             4.29           151.73            3348693 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

(with ICF)
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1412 kB ( 0%) ggc
 phase opt and generate  :  82.70 (70%) usr   3.31 (53%) sys  86.17 (45%) wall 1468975 kB (33%) ggc
 phase stream in         :  30.46 (26%) usr   1.02 (16%) sys  31.48 (16%) wall 2944210 kB (67%) ggc
 phase stream out        :   4.52 ( 4%) usr   1.90 (30%) sys  73.47 (38%) wall      12 kB ( 0%) ggc
 phase finalize          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 garbage collection      :   7.01 ( 6%) usr   0.00 ( 0%) sys   6.99 ( 4%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall     166 kB ( 0%) ggc
 ipa dead code removal   :   6.98 ( 6%) usr   0.13 ( 2%) sys   6.89 ( 4%) wall       0 kB ( 0%) ggc
 ipa virtual call target :   6.93 ( 6%) usr   0.03 ( 0%) sys   7.20 ( 4%) wall       6 kB ( 0%) ggc
 ipa devirtualization    :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall   10365 kB ( 0%) ggc
 ipa cp                  :   1.87 ( 2%) usr   0.11 ( 2%) sys   2.00 ( 1%) wall  167204 kB ( 4%) ggc
 ipa inlining heuristics :  17.15 (15%) usr   0.21 ( 3%) sys  17.35 ( 9%) wall  512636 kB (12%) ggc
 ipa comdats             :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple in       :   5.17 ( 4%) usr   1.04 (17%) sys   6.51 ( 3%) wall  855058 kB (19%) ggc
 ipa lto gimple out      :   0.38 ( 0%) usr   0.08 ( 1%) sys   3.07 ( 2%) wall      12 kB ( 0%) ggc
 ipa lto decl in         :  18.38 (16%) usr   0.56 ( 9%) sys  18.95 (10%) wall 2226088 kB (50%) ggc
 ipa lto decl out        :   3.95 ( 3%) usr   0.08 ( 1%) sys   4.03 ( 2%) wall       0 kB ( 0%) ggc
 ipa lto constructors in :   0.29 ( 0%) usr   0.01 ( 0%) sys   0.29 ( 0%) wall   14389 kB ( 0%) ggc
 ipa lto constructors out:   0.10 ( 0%) usr   0.03 ( 0%) sys   0.58 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   0.91 ( 1%) usr   0.10 ( 2%) sys   1.02 ( 1%) wall  364151 kB ( 8%) ggc
 ipa lto decl merge      :   2.14 ( 2%) usr   0.00 ( 0%) sys   2.14 ( 1%) wall    8196 kB ( 0%) ggc
 ipa lto cgraph merge    :   1.65 ( 1%) usr   0.01 ( 0%) sys   1.66 ( 1%) wall   12716 kB ( 0%) ggc
 whopr wpa               :   1.81 ( 2%) usr   0.01 ( 0%) sys   1.85 ( 1%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :   0.05 ( 0%) usr   1.71 (27%) sys  65.75 (34%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   5.05 ( 4%) usr   0.00 ( 0%) sys   5.06 ( 3%) wall    5012 kB ( 0%) ggc
 ipa reference           :   2.13 ( 2%) usr   0.03 ( 0%) sys   2.16 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.32 ( 0%) usr   0.01 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   2.57 ( 2%) usr   0.00 ( 0%) sys   2.56 ( 1%) wall       0 kB ( 0%) ggc
 ipa icf                 :   6.88 ( 6%) usr   0.08 ( 1%) sys   7.01 ( 4%) wall     855 kB ( 0%) ggc
 tree SSA rewrite        :   0.23 ( 0%) usr   0.06 ( 1%) sys   0.28 ( 0%) wall   33946 kB ( 1%) ggc
 tree SSA incremental    :   0.42 ( 0%) usr   0.05 ( 1%) sys   0.53 ( 0%) wall   21099 kB ( 0%) ggc
 tree operand scan       :   0.47 ( 0%) usr   0.08 ( 1%) sys   0.34 ( 0%) wall  181275 kB ( 4%) ggc
 tree STMT verifier      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier      :  22.76 (19%) usr   1.68 (27%) sys  24.44 (13%) wall       0 kB ( 0%) ggc
 dominance frontiers     :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.19 ( 0%) usr   0.05 ( 1%) sys   0.25 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 loop fini               :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.82 ( 1%) usr   0.00 ( 0%) sys   0.81 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 117.68             6.23           191.15            4414612 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
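
For reference, the overhead implied by the two TOTAL rows can be worked out
with a short script. This is only a sketch: the figures are copied by hand
from the reports above, and the names (without_icf, with_icf,
percent_increase) are made up for illustration. Note also that both runs
were made with extra diagnostic checks enabled, so the absolute times would
likely be lower with a release-checking compiler.

```python
# TOTAL rows copied by hand from the two -ftime-report outputs above.
# Times are in seconds, GGC memory in kB.
without_icf = {"usr": 92.91, "sys": 4.29, "wall": 151.73, "ggc_kb": 3348693}
with_icf = {"usr": 117.68, "sys": 6.23, "wall": 191.15, "ggc_kb": 4414612}


def percent_increase(before, after):
    """Relative growth from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Per-column overhead of enabling ICF during the WPA stage.
deltas = {
    key: percent_increase(without_icf[key], with_icf[key])
    for key in without_icf
}

for key, value in deltas.items():
    print(f"{key:7s}: +{value:.1f}%")

# Absolute GGC growth, for comparison with the "ipa lto gimple in" row
# (855058 kB), which only appears in the report with ICF.
extra_ggc_kb = with_icf["ggc_kb"] - without_icf["ggc_kb"]
print(f"extra ggc: {extra_ggc_kb} kB")
```

The absolute GGC growth printed at the end can be compared against the rows
that only show up in the ICF report (ipa lto gimple in, tree SSA rewrite,
tree SSA incremental, tree operand scan) to see where the extra memory is
accounted.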

> It seems that in this case we reject too many of the equality candidates?
> I think the original numbers were about 4-5%, but later some equivalences were
> disabled because of devirt/aliasing issues. Did you compare it with gold ICF
> enabled? There are quite a few obvious improvements to the analysis that can
> be done, but I guess we need to analyze the interesting cases one by one.

Gold ICF was enabled (-Wl,--icf=all,--icf-iterations=3).

-- 
Markus