On Fri, 11 Jul 2014, Jan Hubicka wrote:

> > On Fri, 11 Jul 2014, Jan Hubicka wrote:
> >
> > > Hi,
> > > since we both agreed offlining constructors from global decl stream is a
> > > good idea, I went ahead and implemented it.  I would like to follow up
> > > with cleanups; for example the sections are still tagged as function
> > > sections, but I would like to do it incrementally.  There is quite some
> > > ugliness in the way we handle function sections and the patch started to
> > > snowball very quickly.
> > >
> > > The patch conceptually copies what we do for functions and re-uses most
> > > of the infrastructure.  varpool_get_constructor corresponds to
> > > cgraph_get_body (i.e. the means of getting a function in) and it is used
> > > by the output machinery, by ipa-visibility while rewriting the
> > > constructor and by ctor_for_folding (which makes us load the ctor
> > > whenever it is needed by ipa-cp or ipa-devirt).
> > >
> > > I kept get_symbol_initial_value as an authority to decide if we want to
> > > encode a given constructor or not.  The section itself for a trivial ctor
> > > is about 25 bytes and with header it is probably close to double of it.
> > > Currently the heuristic is to offline only constructors that are
> > > CONSTRUCTOR and keep simple expressions inline.  We may want to tweak it.
> >
> > Hmm, so what about artificial testcase with gazillions of
> >
> > struct X { int i; };
> >
> > struct X a0001 = { 1 };
> > struct X a0002 = { 2 };
> > ....
> >
> > how does it explode LTO IL size and streaming time (compile-out and
> > LTRANS in)?  I suppose it still helps WPA stage.
>
> Well, nothing really artificial, except that gazillions of static variables
> called a0001 to a000gazillion are ugly :))
>
> I just put the CONSTRUCTOR bits in the initial version to not have the path
> unused at all.  Either we can base our decision on size of the variable or
> do a simple walk to see if it needs more than, say 8 trees.

Hum, probably not worth special-casing.

> I will play with this incrementally after cleaning up the headers (as those
> account for the overhead)
>
> > Also what we desperately miss is to put CONST_DECLs into the symbol
> > table (and thus eventually move the constant pool to symtab).  That
> > and no longer allowing STRING_CSTs in the IL but only CONST_DECLs
> > with STRING_CST initializers (to fix PR50199).
>
> Yep, I have a patch for putting CONST_DECLs into the symbol table.  It
> however does not help partitionability because at the moment the output
> machinery does not expect const decls to have visibilities.

Well, just make them regular (anonymous) VAR_DECLs then ... (the fact
that a CONST_DECL is anonymous is probably the only real difference - and
that they are mergeable by content).

> I will push out that change (and LABEL_DECL, too) after Martin's renaming
> patches land on mainline.

Thanks.

> > > The patch does not bring miraculous savings to firefox WPA, but it does
> > > some:
> > >
> > > GGC memory after global stream is read goes from 1376898k to 1250533k
> > > overall GGC allocations from 4156478 kB to 4012462 kB
> > > read 11006599 SCCs of average size 1.907692 -> read 9119433 SCCs of average size 2.037867
> > > 20997206 tree bodies read in total -> 18584194 tree bodies read in total
> > > Size of mmap'd section decls: 299540188 bytes -> Size of mmap'd section decls: 271557265 bytes
> > > Size of mmap'd section function_body: 5711078 bytes -> Size of mmap'd section function_body: 7548680 bytes
> > >
> > > Things would be better if ipa-visibility and ipa-devirt did not load most
> > > of the virtual tables into memory (still better than loading each into
> > > memory 20 times on average).  I will work on that incrementally.  We load
> > > 10311 ctors into memory at WPA time.
> > >
> > > Note that firefox seems to feature a really huge data segment these days.
> > > http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html
> > >
> > > Bootstrapped/regtested x86_64-linux, tested with firefox, lto bootstrap
> > > in progress, OK?
> >
> > The patch looks ok to me.  How about simply doing
> > s/LTO_section_function_body/LTO_section_symbol_content/ instead of
> > adding LTO_section_variable_initializer?
>
> Yeah, I was thinking about it, too.
> I think variable and constructor sections may differ in its header however,
> since we do not need CFG stream for variables.
>
> Thanks!
> Honza
>
> > Thanks,
> > Richard.

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
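
For anyone wanting to try the artificial testcase discussed above (many
trivially initialized "struct X" globals), here is a minimal sketch of a
generator.  It is not part of the patch; the function name make_testcase and
its parameters are made up for illustration, and the only thing taken from
the thread is the shape of the emitted C declarations.

```python
def make_testcase(count, width=4):
    """Return C source with `count` trivially initialized globals.

    Hypothetical helper: emits the 'struct X aNNNN = { N };' pattern
    quoted in the thread, zero-padding the index to `width` digits.
    """
    lines = ["struct X { int i; };", ""]
    for n in range(1, count + 1):
        # Doubled braces produce literal '{' and '}' in the C initializer.
        lines.append(f"struct X a{n:0{width}d} = {{ {n} }};")
    return "\n".join(lines) + "\n"
```

Writing make_testcase(100000, width=6) to a .c file and compiling it with
-flto before and after the patch should give a rough idea of how the object
size and streaming time change; the count is an arbitrary choice.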