On Mon, 29 Apr 2019, Martin Liška wrote:

> On 9/10/18 1:43 PM, Martin Liška wrote:
> > On 09/04/2018 05:07 PM, Martin Liška wrote:
> >> - in order to achieve real speed up we need to split also other generated
> >> (and also dwarf2out.c, i386.c, ..) files:
> >> here I'm most concerned about insn-recog.c, which can't be split the same
> >> way without ending up with a single huge SCC component.
> >
> > About the insn-recog.c file: all functions are static and using SCC one ends
> > up with all functions in one component. In order to split the callgraph one
> > needs to promote some functions to be extern and then split would be
> > possible.
> > In order to do that we'll probably need to teach splitter how to do
> > partitioning based on minimal number of edges to be removed.
> >
> > I need to inspire in lto_balanced_map, or is there some simple algorithm I
> > can start with?
> >
> > Martin
> >
>
> I'm adding here Richard Sandiford as he wrote majority of gcc/genrecog.c file.
> As mentioned, I'm seeking for a way how to split the generated file. Or how
> to learn the generator to process a reasonable splitting.
Earlier this year I experimented with compiling with
-flto -fno-fat-lto-objects and then linking via -flto -r -flinker-output=rel
into the object file. This cut compile time by more than half, with less
maintenance overhead. Adding other files to this handling looks trivial,
as does making it conditional (I'd probably not want this for devel builds).

It might be interesting to optimize this a bit as well by somehow merging
the compile and WPA stages, thus special-casing single-TU WPA input in a
similar way as we (still) have -flto-partition=none.

That said, re-doing the measurement is probably interesting (as is applying
this to other cases like insn-recog.c).

Richard.
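
For reference, the two-step build described above could look roughly like the sketch below. Only the flags come from the message; the compiler name, the choice of insn-recog.c as the input, and the object file names are assumptions made for illustration, and the usual optimization and include flags are left out. The script only prints the two commands rather than running them.

```shell
#!/bin/sh
# Sketch only: file names and CC are hypothetical; the flags are the
# ones quoted in the message.
CC=gcc
SRC=insn-recog.c
SLIM_OBJ=insn-recog.lto.o   # slim LTO object (no fat machine code)
FINAL_OBJ=insn-recog.o      # ordinary relocatable object

# Step 1: compile the single TU to a slim LTO object.
lto_compile_cmd="$CC -flto -fno-fat-lto-objects -c $SRC -o $SLIM_OBJ"

# Step 2: run the LTO link on that one object and emit a relocatable
# object that the rest of the build can link as usual.
lto_relink_cmd="$CC -flto -r -flinker-output=rel $SLIM_OBJ -o $FINAL_OBJ"

# Dry run: print the commands instead of executing them.
echo "$lto_compile_cmd"
echo "$lto_relink_cmd"
```

Printing instead of executing keeps the sketch independent of a particular build tree; in a real makefile rule the two commands would replace the plain compile of the file.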