> On Fri, 11 Jul 2014, Jan Hubicka wrote:
>
> > Hi,
> > since we both agreed that offlining constructors from the global decl
> > stream is a good idea, I went ahead and implemented it. I would like to
> > follow up with cleanups; for example the sections are still tagged as
> > function sections, but I would like to do it incrementally. There is
> > quite some ugliness in the way we handle function sections and the patch
> > started to snowball very quickly.
> >
> > The patch conceptually copies what we do for functions and re-uses most
> > of the infrastructure. varpool_get_constructor is the analogue of
> > cgraph_get_body (i.e. the means of getting the function in) and it is
> > used by the output machinery, by ipa-visibility while rewriting the
> > constructor and by ctor_for_folding (which makes us load the ctor
> > whenever it is needed by ipa-cp or ipa-devirt).
> >
> > I kept get_symbol_initial_value as the authority to decide whether we
> > want to encode a given constructor or not. The section itself for a
> > trivial ctor is about 25 bytes and with the header it is probably close
> > to double that. Currently the heuristic is to offline only constructors
> > that are CONSTRUCTOR and keep simple expressions inline. We may want to
> > tweak it.
>
> Hmm, so what about an artificial testcase with gazillions of
>
> struct X { int i; };
>
> struct X a0001 = { 1 };
> struct X a0002 = { 2 };
> ....
>
> how does it explode LTO IL size and streaming time (compile-out and
> LTRANS in)? I suppose it still helps the WPA stage.

Well, nothing really artificial, except that gazillions of static variables
called a0001 to a000gazillion are ugly :)) I just put the CONSTRUCTOR bits
in the initial version to not have the path unused at all. Either we can
base our decision on the size of the variable or do a simple walk to see if
it needs more than, say, 8 trees. I will play with this incrementally after
cleaning up the headers (as those account for the overhead).

>
> Also what we desperately miss is to put CONST_DECLs into the symbol
> table (and thus eventually move the constant pool to symtab). That
> and no longer allowing STRING_CSTs in the IL but only CONST_DECLs
> with STRING_CST initializers (to fix PR50199).

Yep, I have a patch for putting CONST_DECLs into the symbol table. It
however does not help partitionability because at the moment the output
machinery does not expect const decls to have visibilities. I will push out
that change (and LABEL_DECL, too) after Martin's renaming patches land on
mainline.

>
> > The patch does not bring miraculous savings to firefox WPA, but it does
> > some:
> >
> > GGC memory after global stream is read goes from 1376898k to 1250533k
> > overall GGC allocations from 4156478 kB to 4012462 kB
> > read 11006599 SCCs of average size 1.907692 -> read 9119433 SCCs of
> > average size 2.037867
> > 20997206 tree bodies read in total -> 18584194 tree bodies read in total
> > Size of mmap'd section decls: 299540188 bytes -> Size of mmap'd section
> > decls: 271557265 bytes
> > Size of mmap'd section function_body: 5711078 bytes -> Size of mmap'd
> > section function_body: 7548680 bytes
> >
> > Things would be better if ipa-visibility and ipa-devirt did not load
> > most of the virtual tables into memory (still better than loading each
> > into memory 20 times on average). I will work on that incrementally. We
> > load 10311 ctors into memory at WPA time.
> >
> > Note that firefox seems to feature a really huge data segment these days.
> > http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html
> >
> > Bootstrapped/regtested x86_64-linux, tested with firefox, lto bootstrap
> > in progress, OK?
>
> The patch looks ok to me. How about simply doing
> s/LTO_section_function_body/LTO_section_symbol_content/ instead of
> adding LTO_section_variable_initializer?

Yeah, I was thinking about it, too. I think variable and constructor
sections may differ in their headers however, since we do not need the CFG
stream for variables.

Thanks!
Honza

>
> Thanks,
> Richard.
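
For reference, the "simple walk to see if it needs more than, say, 8 trees"
idea mentioned above could look roughly like the sketch below. This is only
an illustration on a toy node type, not GCC code: struct toy_node,
walk_within_budget and toy_should_offline_p are hypothetical names, and the
real tree API, the streamer and the actual threshold are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an initializer expression node.  This is NOT GCC's
   `tree' type; it only records the operands of a node.  */
struct toy_node
{
  size_t n_ops;            /* Number of operands.  */
  struct toy_node **ops;   /* Operand array, may be NULL if n_ops is 0.  */
};

/* Visit NODE and its operands, charging one unit of *BUDGET per node.
   Return false as soon as the budget is exceeded, so the walk stops
   early on large initializers instead of traversing them fully.  */
static bool
walk_within_budget (const struct toy_node *node, int *budget)
{
  if (node == NULL)
    return true;
  if (--*budget < 0)
    return false;
  for (size_t i = 0; i < node->n_ops; i++)
    if (!walk_within_budget (node->ops[i], budget))
      return false;
  return true;
}

/* Hypothetical policy: keep the initializer inline when it fits in
   MAX_TREES nodes, and offline it into its own section otherwise.  */
static bool
toy_should_offline_p (const struct toy_node *init, int max_trees)
{
  assert (max_trees >= 0);
  int budget = max_trees;
  return !walk_within_budget (init, &budget);
}
```

With max_trees set to 8, an initializer like { 1 } would stay inline under
this policy, while a large aggregate would be offlined after at most nine
nodes have been visited. Whether a node count or the variable size is the
better criterion is left open above.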